Prevent NaN to become index and column in dataframe pivot

I have a dataframe which I extend to include values for all increments in 2 columns. Therefor NaN values are introduced, as expected and desired.

However, when I use pivot on this dataframe I'll get a row and column for NaN.
Can I prevent this when doing the pivot? If not, how can I drop a column named NaN?
Trying to drop it by calling [NaN],[nan] or ['NaN'] doesn't work.

Dropping the columns and rows where all values are NaN is not working in this case as the column headings and indexes are used for a seaborn heatmap plot, so eventhough all cell values are NaN it is still useful to have it as the index and key values are not NaN

Sample code;

import pandas as pd

import numpy as np



#generate dummy data

df = pd.DataFrame({'Y': np.random.randint(130,140,10),

                   'X': np.random.randint(5,10,10),

                   'Z': np.random.randint(0,25, size=10)})

df = df.round(1)

#create dataset for heatmap

#group by axis to plot

df = df.groupby(['X','Y']).sum().reset_index()

df = df.sort_values(by=['Y'])

dfY = pd.DataFrame({'Y':np.arange(min(df['Y']), max(df['Y']),1)})

dfX = pd.DataFrame({'X':np.arange(min(df['X']), max(df['X']),1)})

df = pd.merge(df,dfY, how='outer', on='Y')

df = pd.merge(df,dfX, how='outer', on='X')

df = df.round(1)



print(df)

#restructure for heatmap

data = df.pivot("Y","X","Z").sort_values(by=['Y'],ascending=False)

print(data)

Sample DataFrame before pivot:

      X      Y     Z

0   5.0  132.0   0.0

1   5.0  135.0  20.0

2   5.0  137.0  17.0

3   7.0  132.0  15.0

4   7.0  133.0   3.0

5   6.0  133.0  30.0

6   6.0  135.0  22.0

7   6.0  138.0  16.0

8   9.0  135.0   9.0

9   NaN  134.0   NaN

10  NaN  136.0   NaN

11  8.0    NaN   NaN

After pivot:

X       NaN    5.0   6.0   7.0   8.0   9.0

Y                                         

 138.0   NaN   NaN  16.0   NaN   NaN   NaN

 137.0   NaN  17.0   NaN   NaN   NaN   NaN

 136.0   NaN   NaN   NaN   NaN   NaN   NaN

 135.0   NaN  20.0  22.0   NaN   NaN   9.0

 134.0   NaN   NaN   NaN   NaN   NaN   NaN

 133.0   NaN   NaN  30.0   3.0   NaN   NaN

 132.0   NaN   0.0   NaN  15.0   NaN   NaN

NaN      NaN   NaN   NaN   NaN   NaN   NaN

Desired output:

X        5.0   6.0   7.0   8.0   9.0

Y                                         

 138.0   NaN  16.0   NaN   NaN   NaN

 137.0  17.0   NaN   NaN   NaN   NaN

 136.0   NaN   NaN   NaN   NaN   NaN

 135.0  20.0  22.0   NaN   NaN   9.0

 134.0   NaN   NaN   NaN   NaN   NaN

 133.0   NaN  30.0   3.0   NaN   NaN

 132.0   0.0   NaN  15.0   NaN   NaN

edited Nov 14 '18 at 13:00

asked Nov 14 '18 at 12:27

Chrisvdberge

5602822

add a comment |

I have a dataframe which I extend to include values for all increments in 2 columns. Therefor NaN values are introduced, as expected and desired.

Sample code;

import pandas as pd

import numpy as np



#generate dummy data

df = pd.DataFrame({'Y': np.random.randint(130,140,10),

                   'X': np.random.randint(5,10,10),

                   'Z': np.random.randint(0,25, size=10)})

df = df.round(1)

#create dataset for heatmap

#group by axis to plot

df = df.groupby(['X','Y']).sum().reset_index()

df = df.sort_values(by=['Y'])

dfY = pd.DataFrame({'Y':np.arange(min(df['Y']), max(df['Y']),1)})

dfX = pd.DataFrame({'X':np.arange(min(df['X']), max(df['X']),1)})

df = pd.merge(df,dfY, how='outer', on='Y')

df = pd.merge(df,dfX, how='outer', on='X')

df = df.round(1)



print(df)

#restructure for heatmap

data = df.pivot("Y","X","Z").sort_values(by=['Y'],ascending=False)

print(data)

Sample DataFrame before pivot:

      X      Y     Z

0   5.0  132.0   0.0

1   5.0  135.0  20.0

2   5.0  137.0  17.0

3   7.0  132.0  15.0

4   7.0  133.0   3.0

5   6.0  133.0  30.0

6   6.0  135.0  22.0

7   6.0  138.0  16.0

8   9.0  135.0   9.0

9   NaN  134.0   NaN

10  NaN  136.0   NaN

11  8.0    NaN   NaN

After pivot:

X       NaN    5.0   6.0   7.0   8.0   9.0

Y                                         

 138.0   NaN   NaN  16.0   NaN   NaN   NaN

 137.0   NaN  17.0   NaN   NaN   NaN   NaN

 136.0   NaN   NaN   NaN   NaN   NaN   NaN

 135.0   NaN  20.0  22.0   NaN   NaN   9.0

 134.0   NaN   NaN   NaN   NaN   NaN   NaN

 133.0   NaN   NaN  30.0   3.0   NaN   NaN

 132.0   NaN   0.0   NaN  15.0   NaN   NaN

NaN      NaN   NaN   NaN   NaN   NaN   NaN

Desired output:

X        5.0   6.0   7.0   8.0   9.0

Y                                         

 138.0   NaN  16.0   NaN   NaN   NaN

 137.0  17.0   NaN   NaN   NaN   NaN

 136.0   NaN   NaN   NaN   NaN   NaN

 135.0  20.0  22.0   NaN   NaN   9.0

 134.0   NaN   NaN   NaN   NaN   NaN

 133.0   NaN  30.0   3.0   NaN   NaN

 132.0   0.0   NaN  15.0   NaN   NaN

edited Nov 14 '18 at 13:00

asked Nov 14 '18 at 12:27

Chrisvdberge

5602822

add a comment |

I have a dataframe which I extend to include values for all increments in 2 columns. Therefor NaN values are introduced, as expected and desired.

Sample code;

import pandas as pd

import numpy as np



#generate dummy data

df = pd.DataFrame({'Y': np.random.randint(130,140,10),

                   'X': np.random.randint(5,10,10),

                   'Z': np.random.randint(0,25, size=10)})

df = df.round(1)

#create dataset for heatmap

#group by axis to plot

df = df.groupby(['X','Y']).sum().reset_index()

df = df.sort_values(by=['Y'])

dfY = pd.DataFrame({'Y':np.arange(min(df['Y']), max(df['Y']),1)})

dfX = pd.DataFrame({'X':np.arange(min(df['X']), max(df['X']),1)})

df = pd.merge(df,dfY, how='outer', on='Y')

df = pd.merge(df,dfX, how='outer', on='X')

df = df.round(1)



print(df)

#restructure for heatmap

data = df.pivot("Y","X","Z").sort_values(by=['Y'],ascending=False)

print(data)

Sample DataFrame before pivot:

      X      Y     Z

0   5.0  132.0   0.0

1   5.0  135.0  20.0

2   5.0  137.0  17.0

3   7.0  132.0  15.0

4   7.0  133.0   3.0

5   6.0  133.0  30.0

6   6.0  135.0  22.0

7   6.0  138.0  16.0

8   9.0  135.0   9.0

9   NaN  134.0   NaN

10  NaN  136.0   NaN

11  8.0    NaN   NaN

After pivot:

X       NaN    5.0   6.0   7.0   8.0   9.0

Y                                         

 138.0   NaN   NaN  16.0   NaN   NaN   NaN

 137.0   NaN  17.0   NaN   NaN   NaN   NaN

 136.0   NaN   NaN   NaN   NaN   NaN   NaN

 135.0   NaN  20.0  22.0   NaN   NaN   9.0

 134.0   NaN   NaN   NaN   NaN   NaN   NaN

 133.0   NaN   NaN  30.0   3.0   NaN   NaN

 132.0   NaN   0.0   NaN  15.0   NaN   NaN

NaN      NaN   NaN   NaN   NaN   NaN   NaN

Desired output:

X        5.0   6.0   7.0   8.0   9.0

Y                                         

 138.0   NaN  16.0   NaN   NaN   NaN

 137.0  17.0   NaN   NaN   NaN   NaN

 136.0   NaN   NaN   NaN   NaN   NaN

 135.0  20.0  22.0   NaN   NaN   9.0

 134.0   NaN   NaN   NaN   NaN   NaN

 133.0   NaN  30.0   3.0   NaN   NaN

 132.0   0.0   NaN  15.0   NaN   NaN

edited Nov 14 '18 at 13:00

asked Nov 14 '18 at 12:27

Chrisvdberge

5602822

I have a dataframe which I extend to include values for all increments in 2 columns. Therefor NaN values are introduced, as expected and desired.

Sample code;

import pandas as pd

import numpy as np



#generate dummy data

df = pd.DataFrame({'Y': np.random.randint(130,140,10),

                   'X': np.random.randint(5,10,10),

                   'Z': np.random.randint(0,25, size=10)})

df = df.round(1)

#create dataset for heatmap

#group by axis to plot

df = df.groupby(['X','Y']).sum().reset_index()

df = df.sort_values(by=['Y'])

dfY = pd.DataFrame({'Y':np.arange(min(df['Y']), max(df['Y']),1)})

dfX = pd.DataFrame({'X':np.arange(min(df['X']), max(df['X']),1)})

df = pd.merge(df,dfY, how='outer', on='Y')

df = pd.merge(df,dfX, how='outer', on='X')

df = df.round(1)



print(df)

#restructure for heatmap

data = df.pivot("Y","X","Z").sort_values(by=['Y'],ascending=False)

print(data)

Sample DataFrame before pivot:

      X      Y     Z

0   5.0  132.0   0.0

1   5.0  135.0  20.0

2   5.0  137.0  17.0

3   7.0  132.0  15.0

4   7.0  133.0   3.0

5   6.0  133.0  30.0

6   6.0  135.0  22.0

7   6.0  138.0  16.0

8   9.0  135.0   9.0

9   NaN  134.0   NaN

10  NaN  136.0   NaN

11  8.0    NaN   NaN

After pivot:

X       NaN    5.0   6.0   7.0   8.0   9.0

Y                                         

 138.0   NaN   NaN  16.0   NaN   NaN   NaN

 137.0   NaN  17.0   NaN   NaN   NaN   NaN

 136.0   NaN   NaN   NaN   NaN   NaN   NaN

 135.0   NaN  20.0  22.0   NaN   NaN   9.0

 134.0   NaN   NaN   NaN   NaN   NaN   NaN

 133.0   NaN   NaN  30.0   3.0   NaN   NaN

 132.0   NaN   0.0   NaN  15.0   NaN   NaN

NaN      NaN   NaN   NaN   NaN   NaN   NaN

Desired output:

X        5.0   6.0   7.0   8.0   9.0

Y                                         

 138.0   NaN  16.0   NaN   NaN   NaN

 137.0  17.0   NaN   NaN   NaN   NaN

 136.0   NaN   NaN   NaN   NaN   NaN

 135.0  20.0  22.0   NaN   NaN   9.0

 134.0   NaN   NaN   NaN   NaN   NaN

 133.0   NaN  30.0   3.0   NaN   NaN

 132.0   0.0   NaN  15.0   NaN   NaN

python pandas dataframe pivot

edited Nov 14 '18 at 13:00

asked Nov 14 '18 at 12:27

Chrisvdberge

5602822

edited Nov 14 '18 at 13:00

asked Nov 14 '18 at 12:27

Chrisvdberge

5602822

edited Nov 14 '18 at 13:00

asked Nov 14 '18 at 12:27

Chrisvdberge

5602822

asked Nov 14 '18 at 12:27

Chrisvdberge

5602822

asked Nov 14 '18 at 12:27

Chrisvdberge

5602822

add a comment |

1 Answer
1

active

oldest

votes

For me working drop by missing value np.nan:

data = (df.pivot("Y","X","Z")

         .sort_values(by=['Y'],ascending=False)

         .drop(np.nan, axis=1)

         .drop(np.nan))

Or:

data = df.pivot("Y","X","Z").sort_values(by=['Y'],ascending=False)



data = data.reindex(index=data.index.difference([np.nan]),

                    columns=data.columns.difference([np.nan]))

edited Nov 14 '18 at 12:34

answered Nov 14 '18 at 12:31

jezrael

333k24274351

1

excellent, np.nan is the solution for the drop. Adding a drop for axis=0 as well get's me the desired output

– Chrisvdberge
Nov 14 '18 at 12:33

add a comment |

Your Answer

StackExchange.ifUsing("editor", function () {
StackExchange.using("externalEditor", function () {
StackExchange.using("snippets", function () {
StackExchange.snippets.init();
});
});
}, "code-snippets");

StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "1"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);

StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});

function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});

}
});

draft saved

draft discarded

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53300244%2fprevent-nan-to-become-index-and-column-in-dataframe-pivot%23new-answer', 'question_page');
}
);

Post as a guest

Name

Required, but never shown

1 Answer
1

active

oldest

votes

1 Answer
1

active

oldest

votes

For me working drop by missing value np.nan:

data = (df.pivot("Y","X","Z")

         .sort_values(by=['Y'],ascending=False)

         .drop(np.nan, axis=1)

         .drop(np.nan))

Or:

data = df.pivot("Y","X","Z").sort_values(by=['Y'],ascending=False)



data = data.reindex(index=data.index.difference([np.nan]),

                    columns=data.columns.difference([np.nan]))

edited Nov 14 '18 at 12:34

answered Nov 14 '18 at 12:31

jezrael

333k24274351

1

excellent, np.nan is the solution for the drop. Adding a drop for axis=0 as well get's me the desired output

– Chrisvdberge
Nov 14 '18 at 12:33

add a comment |

For me working drop by missing value np.nan:

data = (df.pivot("Y","X","Z")

         .sort_values(by=['Y'],ascending=False)

         .drop(np.nan, axis=1)

         .drop(np.nan))

Or:

data = df.pivot("Y","X","Z").sort_values(by=['Y'],ascending=False)



data = data.reindex(index=data.index.difference([np.nan]),

                    columns=data.columns.difference([np.nan]))

edited Nov 14 '18 at 12:34

answered Nov 14 '18 at 12:31

jezrael

333k24274351

1

excellent, np.nan is the solution for the drop. Adding a drop for axis=0 as well get's me the desired output

– Chrisvdberge
Nov 14 '18 at 12:33

add a comment |

For me working drop by missing value np.nan:

data = (df.pivot("Y","X","Z")

         .sort_values(by=['Y'],ascending=False)

         .drop(np.nan, axis=1)

         .drop(np.nan))

Or:

data = df.pivot("Y","X","Z").sort_values(by=['Y'],ascending=False)



data = data.reindex(index=data.index.difference([np.nan]),

                    columns=data.columns.difference([np.nan]))

edited Nov 14 '18 at 12:34

answered Nov 14 '18 at 12:31

jezrael

333k24274351

For me working drop by missing value np.nan:

data = (df.pivot("Y","X","Z")

         .sort_values(by=['Y'],ascending=False)

         .drop(np.nan, axis=1)

         .drop(np.nan))

Or:

data = df.pivot("Y","X","Z").sort_values(by=['Y'],ascending=False)



data = data.reindex(index=data.index.difference([np.nan]),

                    columns=data.columns.difference([np.nan]))

edited Nov 14 '18 at 12:34

answered Nov 14 '18 at 12:31

jezrael

333k24274351

edited Nov 14 '18 at 12:34

answered Nov 14 '18 at 12:31

jezrael

333k24274351

answered Nov 14 '18 at 12:31

jezrael

333k24274351

answered Nov 14 '18 at 12:31

jezrael

333k24274351

1

excellent, np.nan is the solution for the drop. Adding a drop for axis=0 as well get's me the desired output

– Chrisvdberge
Nov 14 '18 at 12:33

add a comment |

1

excellent, np.nan is the solution for the drop. Adding a drop for axis=0 as well get's me the desired output

– Chrisvdberge
Nov 14 '18 at 12:33

excellent, np.nan is the solution for the drop. Adding a drop for axis=0 as well get's me the desired output

– Chrisvdberge
Nov 14 '18 at 12:33

add a comment |

draft saved

draft discarded

Thanks for contributing an answer to Stack Overflow!

Please be sure to answer the question. Provide details and share your research!

But avoid …

Asking for help, clarification, or responding to other answers.

Making statements based on opinion; back them up with references or personal experience.

To learn more, see our tips on writing great answers.

draft saved

draft discarded

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Post as a guest

Name

Required, but never shown

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Name

Required, but never shown

Name

Required, but never shown

This page is only for reference, If you need detailed information, please check here

搜尋此網誌

Vfrdtyky