Transformation of a given pandas dataframe to another dataframe
I have a pandas DataFrame, shown below, that holds the distances (in degrees) from individual points to three cities: Fargo, Orange and Jersey City. The frame was built by finding, for each city in turn, the 4 points nearest to it, so rows 0 through 3 hold the 4 points nearest to Fargo, rows 4 through 7 the 4 points nearest to Orange, and rows 8 through 11 the 4 points nearest to Jersey City. Every row still carries its distance to all three cities, though, so a column such as Fargo is also populated in the 8 rows that were selected for the other cities. The dataframe is built like this:
import pandas as pd

Points = ['Point1', 'Point4', 'Point5', 'Point2', 'Point2', 'Point5', 'Point1', 'Point4', 'Point3', 'Point6', 'Point4', 'Point1']
Fargo = [2.90300755828, 3.91961324034, 21.9825588597, 24.3141420303, 24.3141420303, 21.9825588597, 2.90300755828, 3.91961324034, 25.3599772676, 25.8509998739, 3.91961324034, 2.90300755828]
Orange = [25.5464458592, 27.1527975618, 6.17298387907, 4.80214941294, 4.80214941294, 6.17298387907, 25.5464458592, 27.1527975618, 46.4066249652, 45.8853687976, 27.1527975618, 25.5464458592]
Jersey_City = [21.1030418227, 19.6763385681, 39.3194029761, 41.8121131045, 41.8121131045, 39.3194029761, 21.1030418227, 19.6763385681, 2.09632277264, 2.67885042284, 19.6763385681, 21.1030418227]
toy_data = pd.DataFrame(index=Points, columns=['Fargo', 'Orange', 'Jersey_City'])
toy_data['Fargo'] = Fargo
toy_data['Orange'] = Orange
toy_data['Jersey_City'] = Jersey_City
Take the column Fargo: its first 4 rows (rows 0 through 3) are the points with the shortest distances to Fargo. Similarly, in the column Orange, rows 4 through 7 are the points with the shortest distances to Orange, yet in those same rows the column Fargo is filled with the distances from Orange's four nearest points to Fargo. What I want is a single dataframe that keeps only the 4 nearest points for each city: rows 0-3 in column Fargo, rows 4-7 in column Orange and rows 8-11 in column Jersey_City. Every other entry should be removed, as I have done by hand below.
What I want is this:
Fargo = [2.9030075582789885, 3.919613240342197, 21.982558859743925, 24.314142030334484, 'NAN', 'NAN', 'NAN', 'NAN', 'NAN', 'NAN', 'NAN', 'NAN']
Orange = ['NAN', 'NAN', 'NAN', 'NAN', 4.802149412942695, 6.172983879065276, 25.546445859236265, 27.15279756182145, 'NAN', 'NAN', 'NAN', 'NAN']
Jersey_City = ['NAN', 'NAN', 'NAN', 'NAN', 'NAN', 'NAN', 'NAN', 'NAN', 2.096322772642856, 2.67885042283533, 19.676338568056806, 21.10304182269932]
result_wanted_data = pd.DataFrame(index=Points, columns=['Fargo', 'Orange', 'Jersey_City'])
result_wanted_data['Fargo'] = Fargo
result_wanted_data['Orange'] = Orange
result_wanted_data['Jersey_City'] = Jersey_City
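One caveat about the desired frame above: the string 'NAN' is not a missing value as far as pandas is concerned, so a column that mixes floats with 'NAN' strings ends up with object dtype. The following is only a minimal sketch, assuming real missing values are acceptable in the result; the names with_strings and with_nan are illustrative and not from the original post.

```python
import numpy as np
import pandas as pd

# The Fargo target column written two ways: with the string placeholder
# used in the post, and with a real missing value (np.nan).
values = [2.9030075582789885, 3.919613240342197,
          21.982558859743925, 24.314142030334484]
with_strings = pd.Series(values + ['NAN'] * 8, name='Fargo')
with_nan = pd.Series(values + [np.nan] * 8, name='Fargo')

print(with_strings.dtype)          # mixed floats and strings
print(with_nan.dtype)              # purely numeric
print(with_strings.isna().sum())   # 'NAN' strings are not counted as missing
print(with_nan.isna().sum())       # np.nan entries are counted as missing
```

With np.nan the column stays numeric, so methods such as isna(), dropna() and mean() treat the removed entries as missing.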
python pandas dataframe
edited Nov 15 '18 at 13:49 by Malik Asad
asked Nov 15 '18 at 13:26 by Sounak Banerjee
2
Please can you better explain the problem and what you are trying to obtain.
– yatu
Nov 15 '18 at 13:30
2
name 'data' is not defined! Please provide a mcve
– user32185
Nov 15 '18 at 13:34
1
@AlexandreNixon I hope you understand the problem now.
– Sounak Banerjee
Nov 15 '18 at 13:42
@user32185 I think the 'data' you were asking about is provided now. Apologies for the hassle.
– Sounak Banerjee
Nov 15 '18 at 13:43
3 Answers
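This is not exactly the layout you asked for, but I think it serves the same purpose. It builds a long table with one row per city, point and distance:
import math
import numpy as np
import pandas as pd

# Assumption: data is the question's frame with the point names moved into column 0
data = toy_data.reset_index()

# Distance from row i to the city that owns its block of 4 rows
newdf = np.empty([12])
for i in range(12):
    newdf[i] = data.iloc[i, math.ceil((i + 1) / 4)]

# City name for each row, taken from the headers after the point column
newdf1 = []
cities = list(data.columns.values[1:])
for i in range(12):
    newdf1.append(cities[math.ceil((i + 1) / 4) - 1])

# Point name for each row
strs = ["" for x in range(12)]
for i in range(12):
    strs[i] = data.iloc[i, 0]

final_data = pd.DataFrame(columns=['city', 'point', 'distance'])
final_data['city'] = newdf1
final_data['distance'] = newdf
final_data['point'] = strs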
edited Nov 15 '18 at 14:22
answered Nov 15 '18 at 13:56 by Saradamani
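The same long table can probably be built without explicit loops. The sketch below is not the answerer's code; it assumes the layout described in the question (blocks of 4 rows, one block per city column, in column order) and uses integer division to find each row's city. The names block and city_pos are illustrative.

```python
import numpy as np
import pandas as pd

points = ['Point1', 'Point4', 'Point5', 'Point2',
          'Point2', 'Point5', 'Point1', 'Point4',
          'Point3', 'Point6', 'Point4', 'Point1']
toy_data = pd.DataFrame({
    'Fargo': [2.90300755828, 3.91961324034, 21.9825588597, 24.3141420303,
              24.3141420303, 21.9825588597, 2.90300755828, 3.91961324034,
              25.3599772676, 25.8509998739, 3.91961324034, 2.90300755828],
    'Orange': [25.5464458592, 27.1527975618, 6.17298387907, 4.80214941294,
               4.80214941294, 6.17298387907, 25.5464458592, 27.1527975618,
               46.4066249652, 45.8853687976, 27.1527975618, 25.5464458592],
    'Jersey_City': [21.1030418227, 19.6763385681, 39.3194029761, 41.8121131045,
                    41.8121131045, 39.3194029761, 21.1030418227, 19.6763385681,
                    2.09632277264, 2.67885042284, 19.6763385681, 21.1030418227],
}, index=points)

block = 4                                   # assumed rows per city
rows = np.arange(len(toy_data))             # row positions 0..11
city_pos = rows // block                    # column position owning each row

final_data = pd.DataFrame({
    'city': toy_data.columns.to_numpy()[city_pos],
    'point': toy_data.index.to_numpy(),
    # pick one value per row: the entry in that row's own city column
    'distance': toy_data.to_numpy()[rows, city_pos],
})
print(final_data)
```

Pairing rows with city_pos in the NumPy index selects exactly one element per row, which replaces the three loops over range(12).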
You can use the following:
import numpy as np
import pandas as pd

intervals = np.array_split(np.arange(toy_data.shape[0]), 3)
df = pd.DataFrame(columns=['Distances'], index=toy_data.reset_index().index)
for i, j in zip(range(toy_data.shape[1]), intervals):
    df.loc[j, 'Distances'] = toy_data.reset_index(drop=True).iloc[j, i]
print(df)
    Distances
0     2.90301
1     3.91961
2     21.9826
3     24.3141
4     4.80215
5     6.17298
6     25.5464
7     27.1528
8     2.09632
9     2.67885
10    19.6763
11     21.103
answered Nov 15 '18 at 14:19 by yatu
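The answer above relies on np.array_split to turn the 12 row positions into one block per city. As a small illustration of what that call returns (a sketch only; the variable names are illustrative):

```python
import numpy as np

# 12 row positions split into 3 equal blocks, one per city column.
intervals = np.array_split(np.arange(12), 3)
for block in intervals:
    print(block)

# Unlike np.split, np.array_split also accepts lengths that do not divide
# evenly; the leading blocks are then one element longer than the rest.
uneven = np.array_split(np.arange(10), 3)
print([len(block) for block in uneven])
```

Each block is an array of row positions, which is why it can be passed straight to .iloc and .loc in the loop.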
You can use np.split() and a for loop:
import numpy as np
import pandas as pd

# Split positions: a new block starts every 4 rows (here [4, 8])
x = 0
split = []
for num in range(len(toy_data.columns) - 1):
    split.append(x + 4)
    x += 4

dfs = np.split(toy_data, split)

# From the i-th block keep only the i-th city column
data = []
for i in range(len(dfs)):
    data.append(pd.DataFrame(dfs[i][dfs[i].columns[i]]))

pd.concat(data, sort=False)
            Fargo     Orange  Jersey_City
Point1   2.903008        NaN          NaN
Point4   3.919613        NaN          NaN
Point5  21.982559        NaN          NaN
Point2  24.314142        NaN          NaN
Point2        NaN   4.802149          NaN
Point5        NaN   6.172984          NaN
Point1        NaN  25.546446          NaN
Point4        NaN  27.152798          NaN
Point3        NaN        NaN     2.096323
Point6        NaN        NaN     2.678850
Point4        NaN        NaN    19.676339
Point1        NaN        NaN    21.103042
edited Nov 15 '18 at 14:46
answered Nov 15 '18 at 14:16 by Chris
TypeError: concat() got an unexpected keyword argument 'sort'. It shows this error..
– Saradamani
Nov 15 '18 at 14:25
@Saradamani what version of pandas are you using?
– Chris
Nov 15 '18 at 14:27
pd.__version__ Out[924]: '0.21.1'
– Saradamani
Nov 15 '18 at 14:28
1
@Saradamani sort was added in 0.23, I believe. You should just be able to remove that param.
– Chris
Nov 15 '18 at 14:29
Beautiful answer. Yes, I saw that this works. My solution was of course different.
– Saradamani
Nov 15 '18 at 14:33
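For comparison, the frame requested in the question can also be produced without splitting or concatenating, by masking. This is only a sketch under the same assumption as the answers (blocks of 4 consecutive rows, one block per city column, in column order); it is not taken from any of the answers, and the names block, row_block, col_pos and mask are illustrative.

```python
import numpy as np
import pandas as pd

points = ['Point1', 'Point4', 'Point5', 'Point2',
          'Point2', 'Point5', 'Point1', 'Point4',
          'Point3', 'Point6', 'Point4', 'Point1']
toy_data = pd.DataFrame({
    'Fargo': [2.90300755828, 3.91961324034, 21.9825588597, 24.3141420303,
              24.3141420303, 21.9825588597, 2.90300755828, 3.91961324034,
              25.3599772676, 25.8509998739, 3.91961324034, 2.90300755828],
    'Orange': [25.5464458592, 27.1527975618, 6.17298387907, 4.80214941294,
               4.80214941294, 6.17298387907, 25.5464458592, 27.1527975618,
               46.4066249652, 45.8853687976, 27.1527975618, 25.5464458592],
    'Jersey_City': [21.1030418227, 19.6763385681, 39.3194029761, 41.8121131045,
                    41.8121131045, 39.3194029761, 21.1030418227, 19.6763385681,
                    2.09632277264, 2.67885042284, 19.6763385681, 21.1030418227],
}, index=points)

block = 4                                         # assumed rows per city
row_block = np.arange(len(toy_data)) // block     # block number of each row
col_pos = np.arange(toy_data.shape[1])            # position of each city column

# mask[r, c] is True only where row r lies in the block belonging to column c
mask = row_block[:, None] == col_pos[None, :]

# Keep the masked entries and replace everything else with NaN
result = pd.DataFrame(np.where(mask, toy_data.to_numpy(), np.nan),
                      index=toy_data.index, columns=toy_data.columns)
print(result)
```

Because no concatenation is involved, the sort argument discussed in the comments above does not come into play here.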