How to merge several rows into one row based on a column with specific value in Pandas
I have a DataFrame like this:

    item_id  revenue  month  year
    1        10.0     01     2014
    1        5.0      02     2013
    1        6.0      04     2013
    1        7.0      03     2013
    2        2.0      01     2013
    2        3.0      03     2013
    3        5.0      04     2013

I am trying to get the revenue of each item from January to March 2013, like the following DataFrame:

    item_id  revenue  year
    1        12.0     2013
    2        5.0      2013
    3        0        2013

However, I am confused about how to implement this in Pandas. Any help would be appreciated.

python pandas dataframe pandas-groupby
edited Nov 16 '18 at 10:28 by jpp
asked Nov 16 '18 at 10:14 by FreAk Point
March 2014 or March 2013?
– AkshayNevrekar
Nov 16 '18 at 10:21
Sorry, it should be March 2013 like the last DataFrame above.
– FreAk Point
Nov 16 '18 at 10:27
2 Answers
You can slice first, then group by item_id and reindex to include items with 0 revenue. This assumes month and year are stored as integers.

    month_start, month_end = 1, 3
    year = 2013

    res = (
        df.loc[df['month'].between(month_start, month_end) & df['year'].eq(year)]
        .groupby('item_id')['revenue'].sum()
        .reindex(df['item_id'].unique()).fillna(0)
        .reset_index().assign(year=year)
    )

    print(res)

       item_id  revenue  year
    0        1     12.0  2013
    1        2      5.0  2013
    2        3      0.0  2013
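For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same idea. It rebuilds the sample data from the question and assumes (this is a guess, based on the zero-padded values shown) that the month column may have been read as strings, so it converts it to numbers first. The `rename_axis` call is only a safeguard so the index has a name before it becomes a column.

```python
import pandas as pd

# Sample data from the question. Month is stored as zero-padded strings here,
# which is an assumption about how the original data was loaded.
df = pd.DataFrame({
    "item_id": [1, 1, 1, 1, 2, 2, 3],
    "revenue": [10.0, 5.0, 6.0, 7.0, 2.0, 3.0, 5.0],
    "month": ["01", "02", "04", "03", "01", "03", "04"],
    "year": [2014, 2013, 2013, 2013, 2013, 2013, 2013],
})

month_start, month_end = 1, 3
year = 2013

# Convert month to integers so the numeric range comparison works.
month = pd.to_numeric(df["month"])

# Keep only rows inside the window; between() is inclusive at both ends.
mask = month.between(month_start, month_end) & df["year"].eq(year)

res = (
    df.loc[mask]
    .groupby("item_id")["revenue"].sum()
    # Bring back items that had no rows in the window, with revenue 0.
    .reindex(df["item_id"].unique(), fill_value=0.0)
    .rename_axis("item_id")
    .reset_index()
    .assign(year=year)
)
print(res)
```

Passing `fill_value` to `reindex` does the same job as the separate `fillna(0)` step, but it only fills the rows that `reindex` adds, so a genuine NaN total would not be silently overwritten.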
answered Nov 16 '18 at 10:23 by jpp
You can use groupby and then the sum method to get the desired output. Note that this sums revenue over all months for each year and item.

    df.groupby(['year', 'item_id']).sum().reset_index().drop('month', axis=1).set_index('item_id')

             year  revenue
    item_id
    1        2013     18.0
    2        2013      5.0
    3        2013      5.0
    1        2014     10.0
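If the January to March restriction from the question is needed with this groupby approach, one option is to filter the rows before grouping. The following is only a sketch under the assumption that month and year are integers. It selects the revenue column explicitly instead of summing and then dropping month.

```python
import pandas as pd

# Sample data from the question, with month and year assumed to be integers.
df = pd.DataFrame({
    "item_id": [1, 1, 1, 1, 2, 2, 3],
    "revenue": [10.0, 5.0, 6.0, 7.0, 2.0, 3.0, 5.0],
    "month": [1, 2, 4, 3, 1, 3, 4],
    "year": [2014, 2013, 2013, 2013, 2013, 2013, 2013],
})

# Restrict to January..March first, then total revenue per (year, item_id).
in_window = df[df["month"].between(1, 3)]
totals = in_window.groupby(["year", "item_id"], as_index=False)["revenue"].sum()
print(totals)
```

Items with no rows in the window (item 3 here) do not appear in `totals`, so a reindex step is still needed if they should be reported with a revenue of 0.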
answered Nov 16 '18 at 10:21 by has
This doesn't match OP's desired output.
– jpp
Nov 16 '18 at 10:24
The desired output in the question has a mistake: for year 2013, summing up revenue gives 18 instead of 12.
– has
Nov 16 '18 at 10:26
Ah, so I think it's a typo (now corrected).
– jpp
Nov 16 '18 at 10:27