Python: how to drop duplicates with duplicates?
I have a dataframe like the following:

    df
      Name  Y
    0    A  1
    1    A  0
    2    B  0
    3    B  0
    5    C  1

I want to drop the duplicates of Name and keep the ones that have Y=1, such as:

    df
      Name  Y
    0    A  1
    1    B  0
    2    C  1

python pandas
asked Nov 16 '18 at 10:53 by emax

4 Answers

Use the drop_duplicates method:

    df.sort_values('Y', ascending=False).drop_duplicates(subset=['Name'])

answered Nov 16 '18 at 10:56 by Alessandro (edited Nov 16 '18 at 11:43)

Comments:

drop_duplicates uses keep='first' by default, so your proposal will keep the 0s instead of the 1s. You should either sort in descending order or pass keep='last' to drop_duplicates. – Matina G, Nov 16 '18 at 11:12

Agreed, will edit. – Alessandro, Nov 16 '18 at 11:43

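The two options mentioned in the comment can be compared side by side. The following is only a sketch built on the question's sample data: the variable names `desc` and `asc` are made up for illustration, and the stable `kind="mergesort"` sort, the final `sort_values("Name")`, and `reset_index` are additions to make the output order predictable, not part of the original answer.

```python
import pandas as pd

# Sample data from the question (the original index skips 4).
df = pd.DataFrame(
    {"Name": ["A", "A", "B", "B", "C"], "Y": [1, 0, 0, 0, 1]},
    index=[0, 1, 2, 3, 5],
)

# Option 1: sort so the largest Y comes first, then keep the first row per Name.
desc = (
    df.sort_values("Y", ascending=False, kind="mergesort")
      .drop_duplicates(subset=["Name"], keep="first")
      .sort_values("Name")
      .reset_index(drop=True)
)

# Option 2: sort ascending and keep the last row per Name instead.
asc = (
    df.sort_values("Y", kind="mergesort")
      .drop_duplicates(subset=["Name"], keep="last")
      .sort_values("Name")
      .reset_index(drop=True)
)

print(desc)
print(desc.equals(asc))
```

Both variants should give one row per Name, with Y=1 kept wherever a group contains it.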

groupby + max

Assuming your Y series consists only of 0 and 1 values:

    res = df.groupby('Name', as_index=False)['Y'].max()
    print(res)

      Name  Y
    0    A  1
    1    B  0
    2    C  1

answered Nov 16 '18 at 11:07 by jpp

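One thing to keep in mind with groupby + max is that it aggregates the Y column on its own, so any other columns are not carried along from the matching row. If the real dataframe has more columns, one possible variation is to look up the row label of the maximum with idxmax and select whole rows. This is a sketch under assumptions: the extra column `Z` and the name `best_rows` are hypothetical and do not appear in the question.

```python
import pandas as pd

# Question's data plus a hypothetical extra column Z, added only for illustration.
df = pd.DataFrame(
    {
        "Name": ["A", "A", "B", "B", "C"],
        "Y": [1, 0, 0, 0, 1],
        "Z": ["a1", "a0", "b0", "b1", "c1"],
    }
)

# For each Name, idxmax returns the index label of the first row with the largest Y.
best_rows = df.groupby("Name")["Y"].idxmax()

# Select those rows so every column (including Z) comes from the same original row.
res = df.loc[best_rows].reset_index(drop=True)
print(res)
```

When a group has ties (such as B, where both rows have Y=0), the first of the tied rows is the one selected.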

Does the 'Y' column contain only 0 and 1? In that case, you can try the following:

    df = df.sort_values(['Y'], ascending=False)
    df = df.drop_duplicates(['Name'])

answered Nov 16 '18 at 11:09 by Matina G


Try this:

    In [2358]: df.groupby('Name')['Y'].max()
    Out[2358]:
    Name
    A    1
    B    0
    C    1
    Name: Y, dtype: int64

answered Nov 16 '18 at 11:08 by Mayank Porwal

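Note that the result above is a Series indexed by Name rather than a dataframe with Name and Y columns as in the desired output. A minimal sketch of one way to get back to that shape, using the question's sample data (the names `s` and `res` are chosen here for illustration):

```python
import pandas as pd

# Sample data from the question, with a default index for simplicity.
df = pd.DataFrame({"Name": ["A", "A", "B", "B", "C"], "Y": [1, 0, 0, 0, 1]})

# groupby + max on a single column returns a Series indexed by Name.
s = df.groupby("Name")["Y"].max()

# reset_index turns the Name index back into a regular column.
res = s.reset_index()
print(res)
```

Passing as_index=False to groupby, as in the groupby + max answer, should produce the same dataframe in one step.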