Prevent NaN to become index and column in dataframe pivot
I have a dataframe which I extend to include values for all increments in 2 columns. Therefor NaN values are introduced, as expected and desired.
However, when I use pivot on this dataframe I'll get a row and column for NaN.
Can I prevent this when doing the pivot? If not, how can I drop a column named NaN?
Trying to drop it by calling [NaN],[nan] or ['NaN'] doesn't work.
Dropping the columns and rows where all values are NaN is not working in this case as the column headings and indexes are used for a seaborn heatmap plot, so eventhough all cell values are NaN it is still useful to have it as the index and key values are not NaN
Sample code;
import pandas as pd
import numpy as np
#generate dummy data
df = pd.DataFrame({'Y': np.random.randint(130,140,10),
'X': np.random.randint(5,10,10),
'Z': np.random.randint(0,25, size=10)})
df = df.round(1)
#create dataset for heatmap
#group by axis to plot
df = df.groupby(['X','Y']).sum().reset_index()
df = df.sort_values(by=['Y'])
dfY = pd.DataFrame({'Y':np.arange(min(df['Y']), max(df['Y']),1)})
dfX = pd.DataFrame({'X':np.arange(min(df['X']), max(df['X']),1)})
df = pd.merge(df,dfY, how='outer', on='Y')
df = pd.merge(df,dfX, how='outer', on='X')
df = df.round(1)
print(df)
#restructure for heatmap
data = df.pivot("Y","X","Z").sort_values(by=['Y'],ascending=False)
print(data)
Sample DataFrame before pivot:
X Y Z
0 5.0 132.0 0.0
1 5.0 135.0 20.0
2 5.0 137.0 17.0
3 7.0 132.0 15.0
4 7.0 133.0 3.0
5 6.0 133.0 30.0
6 6.0 135.0 22.0
7 6.0 138.0 16.0
8 9.0 135.0 9.0
9 NaN 134.0 NaN
10 NaN 136.0 NaN
11 8.0 NaN NaN
After pivot:
X NaN 5.0 6.0 7.0 8.0 9.0
Y
138.0 NaN NaN 16.0 NaN NaN NaN
137.0 NaN 17.0 NaN NaN NaN NaN
136.0 NaN NaN NaN NaN NaN NaN
135.0 NaN 20.0 22.0 NaN NaN 9.0
134.0 NaN NaN NaN NaN NaN NaN
133.0 NaN NaN 30.0 3.0 NaN NaN
132.0 NaN 0.0 NaN 15.0 NaN NaN
NaN NaN NaN NaN NaN NaN NaN
Desired output:
X 5.0 6.0 7.0 8.0 9.0
Y
138.0 NaN 16.0 NaN NaN NaN
137.0 17.0 NaN NaN NaN NaN
136.0 NaN NaN NaN NaN NaN
135.0 20.0 22.0 NaN NaN 9.0
134.0 NaN NaN NaN NaN NaN
133.0 NaN 30.0 3.0 NaN NaN
132.0 0.0 NaN 15.0 NaN NaN
python pandas dataframe pivot
add a comment |
I have a dataframe which I extend to include values for all increments in 2 columns. Therefor NaN values are introduced, as expected and desired.
However, when I use pivot on this dataframe I'll get a row and column for NaN.
Can I prevent this when doing the pivot? If not, how can I drop a column named NaN?
Trying to drop it by calling [NaN],[nan] or ['NaN'] doesn't work.
Dropping the columns and rows where all values are NaN is not working in this case as the column headings and indexes are used for a seaborn heatmap plot, so eventhough all cell values are NaN it is still useful to have it as the index and key values are not NaN
Sample code;
import pandas as pd
import numpy as np
#generate dummy data
df = pd.DataFrame({'Y': np.random.randint(130,140,10),
'X': np.random.randint(5,10,10),
'Z': np.random.randint(0,25, size=10)})
df = df.round(1)
#create dataset for heatmap
#group by axis to plot
df = df.groupby(['X','Y']).sum().reset_index()
df = df.sort_values(by=['Y'])
dfY = pd.DataFrame({'Y':np.arange(min(df['Y']), max(df['Y']),1)})
dfX = pd.DataFrame({'X':np.arange(min(df['X']), max(df['X']),1)})
df = pd.merge(df,dfY, how='outer', on='Y')
df = pd.merge(df,dfX, how='outer', on='X')
df = df.round(1)
print(df)
#restructure for heatmap
data = df.pivot("Y","X","Z").sort_values(by=['Y'],ascending=False)
print(data)
Sample DataFrame before pivot:
X Y Z
0 5.0 132.0 0.0
1 5.0 135.0 20.0
2 5.0 137.0 17.0
3 7.0 132.0 15.0
4 7.0 133.0 3.0
5 6.0 133.0 30.0
6 6.0 135.0 22.0
7 6.0 138.0 16.0
8 9.0 135.0 9.0
9 NaN 134.0 NaN
10 NaN 136.0 NaN
11 8.0 NaN NaN
After pivot:
X NaN 5.0 6.0 7.0 8.0 9.0
Y
138.0 NaN NaN 16.0 NaN NaN NaN
137.0 NaN 17.0 NaN NaN NaN NaN
136.0 NaN NaN NaN NaN NaN NaN
135.0 NaN 20.0 22.0 NaN NaN 9.0
134.0 NaN NaN NaN NaN NaN NaN
133.0 NaN NaN 30.0 3.0 NaN NaN
132.0 NaN 0.0 NaN 15.0 NaN NaN
NaN NaN NaN NaN NaN NaN NaN
Desired output:
X 5.0 6.0 7.0 8.0 9.0
Y
138.0 NaN 16.0 NaN NaN NaN
137.0 17.0 NaN NaN NaN NaN
136.0 NaN NaN NaN NaN NaN
135.0 20.0 22.0 NaN NaN 9.0
134.0 NaN NaN NaN NaN NaN
133.0 NaN 30.0 3.0 NaN NaN
132.0 0.0 NaN 15.0 NaN NaN
python pandas dataframe pivot
add a comment |
I have a dataframe which I extend to include values for all increments in 2 columns. Therefor NaN values are introduced, as expected and desired.
However, when I use pivot on this dataframe I'll get a row and column for NaN.
Can I prevent this when doing the pivot? If not, how can I drop a column named NaN?
Trying to drop it by calling [NaN],[nan] or ['NaN'] doesn't work.
Dropping the columns and rows where all values are NaN is not working in this case as the column headings and indexes are used for a seaborn heatmap plot, so eventhough all cell values are NaN it is still useful to have it as the index and key values are not NaN
Sample code;
import pandas as pd
import numpy as np
#generate dummy data
df = pd.DataFrame({'Y': np.random.randint(130,140,10),
'X': np.random.randint(5,10,10),
'Z': np.random.randint(0,25, size=10)})
df = df.round(1)
#create dataset for heatmap
#group by axis to plot
df = df.groupby(['X','Y']).sum().reset_index()
df = df.sort_values(by=['Y'])
dfY = pd.DataFrame({'Y':np.arange(min(df['Y']), max(df['Y']),1)})
dfX = pd.DataFrame({'X':np.arange(min(df['X']), max(df['X']),1)})
df = pd.merge(df,dfY, how='outer', on='Y')
df = pd.merge(df,dfX, how='outer', on='X')
df = df.round(1)
print(df)
#restructure for heatmap
data = df.pivot("Y","X","Z").sort_values(by=['Y'],ascending=False)
print(data)
Sample DataFrame before pivot:
X Y Z
0 5.0 132.0 0.0
1 5.0 135.0 20.0
2 5.0 137.0 17.0
3 7.0 132.0 15.0
4 7.0 133.0 3.0
5 6.0 133.0 30.0
6 6.0 135.0 22.0
7 6.0 138.0 16.0
8 9.0 135.0 9.0
9 NaN 134.0 NaN
10 NaN 136.0 NaN
11 8.0 NaN NaN
After pivot:
X NaN 5.0 6.0 7.0 8.0 9.0
Y
138.0 NaN NaN 16.0 NaN NaN NaN
137.0 NaN 17.0 NaN NaN NaN NaN
136.0 NaN NaN NaN NaN NaN NaN
135.0 NaN 20.0 22.0 NaN NaN 9.0
134.0 NaN NaN NaN NaN NaN NaN
133.0 NaN NaN 30.0 3.0 NaN NaN
132.0 NaN 0.0 NaN 15.0 NaN NaN
NaN NaN NaN NaN NaN NaN NaN
Desired output:
X 5.0 6.0 7.0 8.0 9.0
Y
138.0 NaN 16.0 NaN NaN NaN
137.0 17.0 NaN NaN NaN NaN
136.0 NaN NaN NaN NaN NaN
135.0 20.0 22.0 NaN NaN 9.0
134.0 NaN NaN NaN NaN NaN
133.0 NaN 30.0 3.0 NaN NaN
132.0 0.0 NaN 15.0 NaN NaN
python pandas dataframe pivot
I have a dataframe which I extend to include values for all increments in 2 columns. Therefor NaN values are introduced, as expected and desired.
However, when I use pivot on this dataframe I'll get a row and column for NaN.
Can I prevent this when doing the pivot? If not, how can I drop a column named NaN?
Trying to drop it by calling [NaN],[nan] or ['NaN'] doesn't work.
Dropping the columns and rows where all values are NaN is not working in this case as the column headings and indexes are used for a seaborn heatmap plot, so eventhough all cell values are NaN it is still useful to have it as the index and key values are not NaN
Sample code;
import pandas as pd
import numpy as np
#generate dummy data
df = pd.DataFrame({'Y': np.random.randint(130,140,10),
'X': np.random.randint(5,10,10),
'Z': np.random.randint(0,25, size=10)})
df = df.round(1)
#create dataset for heatmap
#group by axis to plot
df = df.groupby(['X','Y']).sum().reset_index()
df = df.sort_values(by=['Y'])
dfY = pd.DataFrame({'Y':np.arange(min(df['Y']), max(df['Y']),1)})
dfX = pd.DataFrame({'X':np.arange(min(df['X']), max(df['X']),1)})
df = pd.merge(df,dfY, how='outer', on='Y')
df = pd.merge(df,dfX, how='outer', on='X')
df = df.round(1)
print(df)
#restructure for heatmap
data = df.pivot("Y","X","Z").sort_values(by=['Y'],ascending=False)
print(data)
Sample DataFrame before pivot:
X Y Z
0 5.0 132.0 0.0
1 5.0 135.0 20.0
2 5.0 137.0 17.0
3 7.0 132.0 15.0
4 7.0 133.0 3.0
5 6.0 133.0 30.0
6 6.0 135.0 22.0
7 6.0 138.0 16.0
8 9.0 135.0 9.0
9 NaN 134.0 NaN
10 NaN 136.0 NaN
11 8.0 NaN NaN
After pivot:
X NaN 5.0 6.0 7.0 8.0 9.0
Y
138.0 NaN NaN 16.0 NaN NaN NaN
137.0 NaN 17.0 NaN NaN NaN NaN
136.0 NaN NaN NaN NaN NaN NaN
135.0 NaN 20.0 22.0 NaN NaN 9.0
134.0 NaN NaN NaN NaN NaN NaN
133.0 NaN NaN 30.0 3.0 NaN NaN
132.0 NaN 0.0 NaN 15.0 NaN NaN
NaN NaN NaN NaN NaN NaN NaN
Desired output:
X 5.0 6.0 7.0 8.0 9.0
Y
138.0 NaN 16.0 NaN NaN NaN
137.0 17.0 NaN NaN NaN NaN
136.0 NaN NaN NaN NaN NaN
135.0 20.0 22.0 NaN NaN 9.0
134.0 NaN NaN NaN NaN NaN
133.0 NaN 30.0 3.0 NaN NaN
132.0 0.0 NaN 15.0 NaN NaN
python pandas dataframe pivot
python pandas dataframe pivot
edited Nov 14 '18 at 13:00
Chrisvdberge
asked Nov 14 '18 at 12:27
ChrisvdbergeChrisvdberge
5602822
5602822
add a comment |
add a comment |
1 Answer
1
active
oldest
votes
For me working drop
by missing value np.nan
:
data = (df.pivot("Y","X","Z")
.sort_values(by=['Y'],ascending=False)
.drop(np.nan, axis=1)
.drop(np.nan))
Or:
data = df.pivot("Y","X","Z").sort_values(by=['Y'],ascending=False)
data = data.reindex(index=data.index.difference([np.nan]),
columns=data.columns.difference([np.nan]))
1
excellent, np.nan is the solution for the drop. Adding a drop for axis=0 as well get's me the desired output
– Chrisvdberge
Nov 14 '18 at 12:33
add a comment |
Your Answer
StackExchange.ifUsing("editor", function () {
StackExchange.using("externalEditor", function () {
StackExchange.using("snippets", function () {
StackExchange.snippets.init();
});
});
}, "code-snippets");
StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "1"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);
StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});
function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});
}
});
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53300244%2fprevent-nan-to-become-index-and-column-in-dataframe-pivot%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
1 Answer
1
active
oldest
votes
1 Answer
1
active
oldest
votes
active
oldest
votes
active
oldest
votes
For me working drop
by missing value np.nan
:
data = (df.pivot("Y","X","Z")
.sort_values(by=['Y'],ascending=False)
.drop(np.nan, axis=1)
.drop(np.nan))
Or:
data = df.pivot("Y","X","Z").sort_values(by=['Y'],ascending=False)
data = data.reindex(index=data.index.difference([np.nan]),
columns=data.columns.difference([np.nan]))
1
excellent, np.nan is the solution for the drop. Adding a drop for axis=0 as well get's me the desired output
– Chrisvdberge
Nov 14 '18 at 12:33
add a comment |
For me working drop
by missing value np.nan
:
data = (df.pivot("Y","X","Z")
.sort_values(by=['Y'],ascending=False)
.drop(np.nan, axis=1)
.drop(np.nan))
Or:
data = df.pivot("Y","X","Z").sort_values(by=['Y'],ascending=False)
data = data.reindex(index=data.index.difference([np.nan]),
columns=data.columns.difference([np.nan]))
1
excellent, np.nan is the solution for the drop. Adding a drop for axis=0 as well get's me the desired output
– Chrisvdberge
Nov 14 '18 at 12:33
add a comment |
For me working drop
by missing value np.nan
:
data = (df.pivot("Y","X","Z")
.sort_values(by=['Y'],ascending=False)
.drop(np.nan, axis=1)
.drop(np.nan))
Or:
data = df.pivot("Y","X","Z").sort_values(by=['Y'],ascending=False)
data = data.reindex(index=data.index.difference([np.nan]),
columns=data.columns.difference([np.nan]))
For me working drop
by missing value np.nan
:
data = (df.pivot("Y","X","Z")
.sort_values(by=['Y'],ascending=False)
.drop(np.nan, axis=1)
.drop(np.nan))
Or:
data = df.pivot("Y","X","Z").sort_values(by=['Y'],ascending=False)
data = data.reindex(index=data.index.difference([np.nan]),
columns=data.columns.difference([np.nan]))
edited Nov 14 '18 at 12:34
answered Nov 14 '18 at 12:31
jezraeljezrael
333k24274351
333k24274351
1
excellent, np.nan is the solution for the drop. Adding a drop for axis=0 as well get's me the desired output
– Chrisvdberge
Nov 14 '18 at 12:33
add a comment |
1
excellent, np.nan is the solution for the drop. Adding a drop for axis=0 as well get's me the desired output
– Chrisvdberge
Nov 14 '18 at 12:33
1
1
excellent, np.nan is the solution for the drop. Adding a drop for axis=0 as well get's me the desired output
– Chrisvdberge
Nov 14 '18 at 12:33
excellent, np.nan is the solution for the drop. Adding a drop for axis=0 as well get's me the desired output
– Chrisvdberge
Nov 14 '18 at 12:33
add a comment |
Thanks for contributing an answer to Stack Overflow!
- Please be sure to answer the question. Provide details and share your research!
But avoid …
- Asking for help, clarification, or responding to other answers.
- Making statements based on opinion; back them up with references or personal experience.
To learn more, see our tips on writing great answers.
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53300244%2fprevent-nan-to-become-index-and-column-in-dataframe-pivot%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown