Python: how to drop duplicates with duplicates?












I have a dataframe like the following



df
Name Y
0 A 1
1 A 0
2 B 0
3 B 0
5 C 1


I want to drop duplicate rows by Name, keeping the row that has Y=1 whenever a name has one, so that the result is:



df
Name Y
0 A 1
1 B 0
2 C 1









python pandas

asked Nov 16 '18 at 10:53
emax

4 Answers
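Use the drop_duplicates method after sorting by Y in descending order, so that the row with Y=1 comes first within each Name and is the one that is kept:

df.sort_values('Y', ascending=False).drop_duplicates(subset=['Name'])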
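As a minimal, self-contained sketch of the one-liner above: the frame below is rebuilt by hand from the question's printout, and the trailing reset_index call is an optional addition (not part of the answer) that renumbers the rows as in the desired output.

```python
import pandas as pd

# Example data copied from the question (note the index jumps from 3 to 5).
df = pd.DataFrame(
    {"Name": ["A", "A", "B", "B", "C"], "Y": [1, 0, 0, 0, 1]},
    index=[0, 1, 2, 3, 5],
)

# Sort so that Y=1 rows come before Y=0 rows, then keep the first row per Name.
result = (
    df.sort_values("Y", ascending=False)
    .drop_duplicates(subset=["Name"])  # keep='first' is the default
    .reset_index(drop=True)            # optional: renumber the rows from 0
)
print(result)
```

The rows come back in sorted order rather than in the original order, so a final sort on Name may be wanted if the order matters.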






edited Nov 16 '18 at 11:43
answered Nov 16 '18 at 10:56
Alessandro

• drop_duplicates defaults to keep='first', so your proposal will keep the 0s instead of the 1s. You should either sort in descending order or pass keep='last' to drop_duplicates.

  – Matina G
  Nov 16 '18 at 11:12











• Agreed, will edit.

  – Alessandro
  Nov 16 '18 at 11:43
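The alternative raised in the first comment, sorting in ascending order and passing keep='last', could look roughly like the sketch below. The sample frame is rebuilt from the question, and the variable names are only illustrative.

```python
import pandas as pd

# Example data copied from the question.
df = pd.DataFrame(
    {"Name": ["A", "A", "B", "B", "C"], "Y": [1, 0, 0, 0, 1]},
    index=[0, 1, 2, 3, 5],
)

# Ascending sort puts the Y=1 rows last, so keep='last' retains them.
result_last = df.sort_values("Y").drop_duplicates(subset=["Name"], keep="last")
print(result_last)
```

Both variants select the same Y value per Name. They differ only in which of several tied rows survives and in the order of the output.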














groupby + max

Assuming your Y series consists only of 0 and 1 values:

res = df.groupby('Name', as_index=False)['Y'].max()

print(res)

  Name  Y
0    A  1
1    B  0
2    C  1
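groupby + max returns only the grouping column and the aggregated Y. If the real frame has other columns that should travel with the chosen row, one possible variation is to select whole rows with idxmax. This is a sketch under that assumption; the column Z is hypothetical and does not appear in the question.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "Name": ["A", "A", "B", "B", "C"],
        "Y": [1, 0, 0, 0, 1],
        "Z": ["a1", "a0", "b0", "b1", "c1"],  # hypothetical extra column
    }
)

# idxmax gives, for each Name, the index label of a row holding the largest Y;
# .loc then pulls those complete rows out of the original frame.
best_rows = df.loc[df.groupby("Name")["Y"].idxmax()].reset_index(drop=True)
print(best_rows)
```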






answered Nov 16 '18 at 11:07
jpp

Does the 'Y' column contain only 0 and 1? In that case, you can try the following:

df = df.sort_values(['Y'], ascending=False)
df = df.drop_duplicates(['Name'])
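Sorting changes the row order, so if the original order of names matters, the two steps above can be followed by a sort on the index. The sketch below assumes the original index reflects the desired order. The mergesort option and the final reset_index are additions for illustration, not part of the answer.

```python
import pandas as pd

# Example data copied from the question.
df = pd.DataFrame(
    {"Name": ["A", "A", "B", "B", "C"], "Y": [1, 0, 0, 0, 1]},
    index=[0, 1, 2, 3, 5],
)

deduped = (
    df.sort_values("Y", ascending=False, kind="mergesort")  # stable sort
    .drop_duplicates("Name")     # keeps the first (largest-Y) row per Name
    .sort_index()                # back to the original row order
    .reset_index(drop=True)      # renumber the rows from 0
)
print(deduped)
```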






answered Nov 16 '18 at 11:09
Matina G

Try this:

In [2358]: df.groupby('Name')['Y'].max()
Out[2358]:
Name
A    1
B    0
C    1
Name: Y, dtype: int64
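The call above returns a Series indexed by Name rather than a two-column frame. If the layout from the question is wanted, a reset_index call is one way to get there. This is a small sketch with the question's data rebuilt by hand.

```python
import pandas as pd

# Example data copied from the question.
df = pd.DataFrame(
    {"Name": ["A", "A", "B", "B", "C"], "Y": [1, 0, 0, 0, 1]},
    index=[0, 1, 2, 3, 5],
)

max_per_name = df.groupby("Name")["Y"].max()  # Series indexed by Name
as_frame = max_per_name.reset_index()         # DataFrame with Name and Y columns
print(as_frame)
```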






answered Nov 16 '18 at 11:08
Mayank Porwal
