Creating smaller dataframes from a larger dataframe using multiple filter criteria: using python and pandas












0















Good morning,



Basically I have 2 pandas dataframes from CSVs:



Dataframe 1: each row is a group where the row index is a geographical area code, and the columns are the top 5 most similar areas. e.g:



       0    1    2    3    4    5    
Rank
00C 00C 03H 02D 05H 02E 04E
00D 00D 02P 02X 01X 03R 06M


Dataframe 2: This is a larger dataframe with hospital activity numbers broken down by age group, gender and the geographical areas. e.g:



     MALE_0-4  MALE_5-9           FEMALE_80-84  FEMALE_85+
06M 75 59 43 48
00C 132 121 173 204
01X 84 63 124 102
03H 127 131 130 83
02P 93 89 208 151
02D 70 62 92 81
05H 96 76 52 32
00C 106 62 123 106
03R 75 59 43 48
02P 10 121 173 204
03R 84 63 124 102
03R 30 131 130 83
02E 93 89 208 151
06M 70 62 92 81
04E 96 76 52 32
00D 106 62 123 106


What I am trying to do is create smaller dataframes from Dataframe 2 - filtered by the groupings from Dataframe 1. Each geographical area code can appear in multiple lookups. I have a basic idea of for loops, but can't quite get it to work.



Dataframe Output 1:



    MALE_0-4  MALE_5-9           FEMALE_80-84  FEMALE_85+
00C 132 121 173 204
03H 127 131 130 83
02D 70 62 92 81
05H 96 76 52 32
00C 106 62 123 106
02E 93 89 208 151
04E 96 76 52 32


Dataframe Output 2:



    MALE_0-4  MALE_5-9           FEMALE_80-84  FEMALE_85+
06M 75 59 43 48
01X 84 63 124 102
02P 93 89 208 151
03R 75 59 43 48
02P 10 121 173 204
03R 84 63 124 102
03R 30 131 130 83
06M 70 62 92 81
00D 106 62 123 106


...



Hope this makes sense and any help would be appreciated.










share|improve this question




















  • 2





    Can't understand what you want! Can you please provide the one of expected output when you give a specific input!!

    – Rahul Agarwal
    Nov 15 '18 at 12:19











  • Hi Rahul. I am effectively looking to split the second dataframe into multiple dataframes - filtered so each of the smaller dataframes only include the data for each of the lookups in the first dataframe (with each column in the first dataframe representing the filter criteria). Dataframe 2 is actually much bigger (so includes rows for all the codes in the first dataframe as well as the 07L, 07M....etc already included in the example). Thanks!

    – Daviesa
    Nov 15 '18 at 14:03








  • 1





    Possible duplicate of Filter dataframe rows if value in column is in a set list of values

    – sophros
    Nov 15 '18 at 14:34











  • Could you create example with 5 rows and 3 columns with desired output? It's much easier to help you if we have Minimal, Complete, and Verifiable example.

    – zipa
    Nov 15 '18 at 14:36











  • Thanks all. I've changed the examples so the ouputs are correct for the input data. Notice that all the codes in Output 1 are in Dataframe 1's first column, and all the codes in Output 2 are in Dataframe 2's second column.

    – Daviesa
    Nov 15 '18 at 15:37
















0















Good morning,



Basically I have 2 pandas dataframes from CSVs:



Dataframe 1: each row is a group where the row index is a geographical area code, and the columns are the top 5 most similar areas. e.g:



       0    1    2    3    4    5    
Rank
00C 00C 03H 02D 05H 02E 04E
00D 00D 02P 02X 01X 03R 06M


Dataframe 2: This is a larger dataframe with hospital activity numbers broken down by age group, gender and the geographical areas. e.g:



     MALE_0-4  MALE_5-9           FEMALE_80-84  FEMALE_85+
06M 75 59 43 48
00C 132 121 173 204
01X 84 63 124 102
03H 127 131 130 83
02P 93 89 208 151
02D 70 62 92 81
05H 96 76 52 32
00C 106 62 123 106
03R 75 59 43 48
02P 10 121 173 204
03R 84 63 124 102
03R 30 131 130 83
02E 93 89 208 151
06M 70 62 92 81
04E 96 76 52 32
00D 106 62 123 106


What I am trying to do is create smaller dataframes from Dataframe 2 - filtered by the groupings from Dataframe 1. Each geographical area code can appear in multiple lookups. I have a basic idea of for loops, but can't quite get it to work.



Dataframe Output 1:



    MALE_0-4  MALE_5-9           FEMALE_80-84  FEMALE_85+
00C 132 121 173 204
03H 127 131 130 83
02D 70 62 92 81
05H 96 76 52 32
00C 106 62 123 106
02E 93 89 208 151
04E 96 76 52 32


Dataframe Output 2:



    MALE_0-4  MALE_5-9           FEMALE_80-84  FEMALE_85+
06M 75 59 43 48
01X 84 63 124 102
02P 93 89 208 151
03R 75 59 43 48
02P 10 121 173 204
03R 84 63 124 102
03R 30 131 130 83
06M 70 62 92 81
00D 106 62 123 106


...



Hope this makes sense and any help would be appreciated.










share|improve this question




















  • 2





    Can't understand what you want! Can you please provide the one of expected output when you give a specific input!!

    – Rahul Agarwal
    Nov 15 '18 at 12:19











  • Hi Rahul. I am effectively looking to split the second dataframe into multiple dataframes - filtered so each of the smaller dataframes only include the data for each of the lookups in the first dataframe (with each column in the first dataframe representing the filter criteria). Dataframe 2 is actually much bigger (so includes rows for all the codes in the first dataframe as well as the 07L, 07M....etc already included in the example). Thanks!

    – Daviesa
    Nov 15 '18 at 14:03








  • 1





    Possible duplicate of Filter dataframe rows if value in column is in a set list of values

    – sophros
    Nov 15 '18 at 14:34











  • Could you create example with 5 rows and 3 columns with desired output? It's much easier to help you if we have Minimal, Complete, and Verifiable example.

    – zipa
    Nov 15 '18 at 14:36











  • Thanks all. I've changed the examples so the ouputs are correct for the input data. Notice that all the codes in Output 1 are in Dataframe 1's first column, and all the codes in Output 2 are in Dataframe 2's second column.

    – Daviesa
    Nov 15 '18 at 15:37














0












0








0








Good morning,



Basically I have 2 pandas dataframes from CSVs:



Dataframe 1: each row is a group where the row index is a geographical area code, and the columns are the top 5 most similar areas. e.g:



       0    1    2    3    4    5    
Rank
00C 00C 03H 02D 05H 02E 04E
00D 00D 02P 02X 01X 03R 06M


Dataframe 2: This is a larger dataframe with hospital activity numbers broken down by age group, gender and the geographical areas. e.g:



     MALE_0-4  MALE_5-9           FEMALE_80-84  FEMALE_85+
06M 75 59 43 48
00C 132 121 173 204
01X 84 63 124 102
03H 127 131 130 83
02P 93 89 208 151
02D 70 62 92 81
05H 96 76 52 32
00C 106 62 123 106
03R 75 59 43 48
02P 10 121 173 204
03R 84 63 124 102
03R 30 131 130 83
02E 93 89 208 151
06M 70 62 92 81
04E 96 76 52 32
00D 106 62 123 106


What I am trying to do is create smaller dataframes from Dataframe 2 - filtered by the groupings from Dataframe 1. Each geographical area code can appear in multiple lookups. I have a basic idea of for loops, but can't quite get it to work.



Dataframe Output 1:



    MALE_0-4  MALE_5-9           FEMALE_80-84  FEMALE_85+
00C 132 121 173 204
03H 127 131 130 83
02D 70 62 92 81
05H 96 76 52 32
00C 106 62 123 106
02E 93 89 208 151
04E 96 76 52 32


Dataframe Output 2:



    MALE_0-4  MALE_5-9           FEMALE_80-84  FEMALE_85+
06M 75 59 43 48
01X 84 63 124 102
02P 93 89 208 151
03R 75 59 43 48
02P 10 121 173 204
03R 84 63 124 102
03R 30 131 130 83
06M 70 62 92 81
00D 106 62 123 106


...



Hope this makes sense and any help would be appreciated.










share|improve this question
















Good morning,



Basically I have 2 pandas dataframes from CSVs:



Dataframe 1: each row is a group where the row index is a geographical area code, and the columns are the top 5 most similar areas. e.g:



       0    1    2    3    4    5    
Rank
00C 00C 03H 02D 05H 02E 04E
00D 00D 02P 02X 01X 03R 06M


Dataframe 2: This is a larger dataframe with hospital activity numbers broken down by age group, gender and the geographical areas. e.g:



     MALE_0-4  MALE_5-9           FEMALE_80-84  FEMALE_85+
06M 75 59 43 48
00C 132 121 173 204
01X 84 63 124 102
03H 127 131 130 83
02P 93 89 208 151
02D 70 62 92 81
05H 96 76 52 32
00C 106 62 123 106
03R 75 59 43 48
02P 10 121 173 204
03R 84 63 124 102
03R 30 131 130 83
02E 93 89 208 151
06M 70 62 92 81
04E 96 76 52 32
00D 106 62 123 106


What I am trying to do is create smaller dataframes from Dataframe 2 - filtered by the groupings from Dataframe 1. Each geographical area code can appear in multiple lookups. I have a basic idea of for loops, but can't quite get it to work.



Dataframe Output 1:



    MALE_0-4  MALE_5-9           FEMALE_80-84  FEMALE_85+
00C 132 121 173 204
03H 127 131 130 83
02D 70 62 92 81
05H 96 76 52 32
00C 106 62 123 106
02E 93 89 208 151
04E 96 76 52 32


Dataframe Output 2:



    MALE_0-4  MALE_5-9           FEMALE_80-84  FEMALE_85+
06M 75 59 43 48
01X 84 63 124 102
02P 93 89 208 151
03R 75 59 43 48
02P 10 121 173 204
03R 84 63 124 102
03R 30 131 130 83
06M 70 62 92 81
00D 106 62 123 106


...



Hope this makes sense and any help would be appreciated.







python pandas dataframe






share|improve this question















share|improve this question













share|improve this question




share|improve this question








edited Nov 15 '18 at 16:51







Daviesa

















asked Nov 15 '18 at 11:54









DaviesaDaviesa

32




32








  • 2





    Can't understand what you want! Can you please provide the one of expected output when you give a specific input!!

    – Rahul Agarwal
    Nov 15 '18 at 12:19











  • Hi Rahul. I am effectively looking to split the second dataframe into multiple dataframes - filtered so each of the smaller dataframes only include the data for each of the lookups in the first dataframe (with each column in the first dataframe representing the filter criteria). Dataframe 2 is actually much bigger (so includes rows for all the codes in the first dataframe as well as the 07L, 07M....etc already included in the example). Thanks!

    – Daviesa
    Nov 15 '18 at 14:03








  • 1





    Possible duplicate of Filter dataframe rows if value in column is in a set list of values

    – sophros
    Nov 15 '18 at 14:34











  • Could you create example with 5 rows and 3 columns with desired output? It's much easier to help you if we have Minimal, Complete, and Verifiable example.

    – zipa
    Nov 15 '18 at 14:36











  • Thanks all. I've changed the examples so the ouputs are correct for the input data. Notice that all the codes in Output 1 are in Dataframe 1's first column, and all the codes in Output 2 are in Dataframe 2's second column.

    – Daviesa
    Nov 15 '18 at 15:37














  • 2





    Can't understand what you want! Can you please provide the one of expected output when you give a specific input!!

    – Rahul Agarwal
    Nov 15 '18 at 12:19











  • Hi Rahul. I am effectively looking to split the second dataframe into multiple dataframes - filtered so each of the smaller dataframes only include the data for each of the lookups in the first dataframe (with each column in the first dataframe representing the filter criteria). Dataframe 2 is actually much bigger (so includes rows for all the codes in the first dataframe as well as the 07L, 07M....etc already included in the example). Thanks!

    – Daviesa
    Nov 15 '18 at 14:03








  • 1





    Possible duplicate of Filter dataframe rows if value in column is in a set list of values

    – sophros
    Nov 15 '18 at 14:34











  • Could you create example with 5 rows and 3 columns with desired output? It's much easier to help you if we have Minimal, Complete, and Verifiable example.

    – zipa
    Nov 15 '18 at 14:36











  • Thanks all. I've changed the examples so the ouputs are correct for the input data. Notice that all the codes in Output 1 are in Dataframe 1's first column, and all the codes in Output 2 are in Dataframe 2's second column.

    – Daviesa
    Nov 15 '18 at 15:37








2




2





Can't understand what you want! Can you please provide the one of expected output when you give a specific input!!

– Rahul Agarwal
Nov 15 '18 at 12:19





Can't understand what you want! Can you please provide the one of expected output when you give a specific input!!

– Rahul Agarwal
Nov 15 '18 at 12:19













Hi Rahul. I am effectively looking to split the second dataframe into multiple dataframes - filtered so each of the smaller dataframes only include the data for each of the lookups in the first dataframe (with each column in the first dataframe representing the filter criteria). Dataframe 2 is actually much bigger (so includes rows for all the codes in the first dataframe as well as the 07L, 07M....etc already included in the example). Thanks!

– Daviesa
Nov 15 '18 at 14:03







Hi Rahul. I am effectively looking to split the second dataframe into multiple dataframes - filtered so each of the smaller dataframes only include the data for each of the lookups in the first dataframe (with each column in the first dataframe representing the filter criteria). Dataframe 2 is actually much bigger (so includes rows for all the codes in the first dataframe as well as the 07L, 07M....etc already included in the example). Thanks!

– Daviesa
Nov 15 '18 at 14:03






1




1





Possible duplicate of Filter dataframe rows if value in column is in a set list of values

– sophros
Nov 15 '18 at 14:34





Possible duplicate of Filter dataframe rows if value in column is in a set list of values

– sophros
Nov 15 '18 at 14:34













Could you create example with 5 rows and 3 columns with desired output? It's much easier to help you if we have Minimal, Complete, and Verifiable example.

– zipa
Nov 15 '18 at 14:36





Could you create example with 5 rows and 3 columns with desired output? It's much easier to help you if we have Minimal, Complete, and Verifiable example.

– zipa
Nov 15 '18 at 14:36













Thanks all. I've changed the examples so the ouputs are correct for the input data. Notice that all the codes in Output 1 are in Dataframe 1's first column, and all the codes in Output 2 are in Dataframe 2's second column.

– Daviesa
Nov 15 '18 at 15:37





Thanks all. I've changed the examples so the ouputs are correct for the input data. Notice that all the codes in Output 1 are in Dataframe 1's first column, and all the codes in Output 2 are in Dataframe 2's second column.

– Daviesa
Nov 15 '18 at 15:37












2 Answers
2






active

oldest

votes


















0














Going by the linked duplicate question this is what you should use (sketchy):



for _, row in df1.iterrows():
broken_down = df2[df2['region'].isin(row)]





share|improve this answer


























  • Many thanks, but this returned an error: 'only list-like objects are allowed to be passed to isin(), you passed a [str]'

    – Daviesa
    Nov 15 '18 at 15:42













  • As I wrote the solution was sketchy. With the correction it should no longer throw an error.

    – sophros
    Nov 15 '18 at 16:24











  • Fantastic - that's done the trick! Thanks for your help, this is so useful! Now I'm able to append them to a list of dataframes

    – Daviesa
    Nov 15 '18 at 16:56













  • I am glad I could help! Best of luck!

    – sophros
    Nov 15 '18 at 17:00



















0














Just adding the code to append to a list, just for future use. Thanks to sophros for solving:



broken_down = 
for _, row in df1.iterrows():
broken_down.append(df2[df2['region'].isin(row)])





share|improve this answer























    Your Answer






    StackExchange.ifUsing("editor", function () {
    StackExchange.using("externalEditor", function () {
    StackExchange.using("snippets", function () {
    StackExchange.snippets.init();
    });
    });
    }, "code-snippets");

    StackExchange.ready(function() {
    var channelOptions = {
    tags: "".split(" "),
    id: "1"
    };
    initTagRenderer("".split(" "), "".split(" "), channelOptions);

    StackExchange.using("externalEditor", function() {
    // Have to fire editor after snippets, if snippets enabled
    if (StackExchange.settings.snippets.snippetsEnabled) {
    StackExchange.using("snippets", function() {
    createEditor();
    });
    }
    else {
    createEditor();
    }
    });

    function createEditor() {
    StackExchange.prepareEditor({
    heartbeatType: 'answer',
    autoActivateHeartbeat: false,
    convertImagesToLinks: true,
    noModals: true,
    showLowRepImageUploadWarning: true,
    reputationToPostImages: 10,
    bindNavPrevention: true,
    postfix: "",
    imageUploader: {
    brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
    contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
    allowUrls: true
    },
    onDemand: true,
    discardSelector: ".discard-answer"
    ,immediatelyShowMarkdownHelp:true
    });


    }
    });














    draft saved

    draft discarded


















    StackExchange.ready(
    function () {
    StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53318939%2fcreating-smaller-dataframes-from-a-larger-dataframe-using-multiple-filter-criter%23new-answer', 'question_page');
    }
    );

    Post as a guest















    Required, but never shown

























    2 Answers
    2






    active

    oldest

    votes








    2 Answers
    2






    active

    oldest

    votes









    active

    oldest

    votes






    active

    oldest

    votes









    0














    Going by the linked duplicate question this is what you should use (sketchy):



    for _, row in df1.iterrows():
    broken_down = df2[df2['region'].isin(row)]





    share|improve this answer


























    • Many thanks, but this returned an error: 'only list-like objects are allowed to be passed to isin(), you passed a [str]'

      – Daviesa
      Nov 15 '18 at 15:42













    • As I wrote the solution was sketchy. With the correction it should no longer throw an error.

      – sophros
      Nov 15 '18 at 16:24











    • Fantastic - that's done the trick! Thanks for your help, this is so useful! Now I'm able to append them to a list of dataframes

      – Daviesa
      Nov 15 '18 at 16:56













    • I am glad I could help! Best of luck!

      – sophros
      Nov 15 '18 at 17:00
















    0














    Going by the linked duplicate question this is what you should use (sketchy):



    for _, row in df1.iterrows():
    broken_down = df2[df2['region'].isin(row)]





    share|improve this answer


























    • Many thanks, but this returned an error: 'only list-like objects are allowed to be passed to isin(), you passed a [str]'

      – Daviesa
      Nov 15 '18 at 15:42













    • As I wrote the solution was sketchy. With the correction it should no longer throw an error.

      – sophros
      Nov 15 '18 at 16:24











    • Fantastic - that's done the trick! Thanks for your help, this is so useful! Now I'm able to append them to a list of dataframes

      – Daviesa
      Nov 15 '18 at 16:56













    • I am glad I could help! Best of luck!

      – sophros
      Nov 15 '18 at 17:00














    0












    0








    0







    Going by the linked duplicate question this is what you should use (sketchy):



    for _, row in df1.iterrows():
    broken_down = df2[df2['region'].isin(row)]





    share|improve this answer















    Going by the linked duplicate question this is what you should use (sketchy):



    for _, row in df1.iterrows():
    broken_down = df2[df2['region'].isin(row)]






    share|improve this answer














    share|improve this answer



    share|improve this answer








    edited Nov 15 '18 at 16:24

























    answered Nov 15 '18 at 14:36









    sophrossophros

    2,8351932




    2,8351932













    • Many thanks, but this returned an error: 'only list-like objects are allowed to be passed to isin(), you passed a [str]'

      – Daviesa
      Nov 15 '18 at 15:42













    • As I wrote the solution was sketchy. With the correction it should no longer throw an error.

      – sophros
      Nov 15 '18 at 16:24











    • Fantastic - that's done the trick! Thanks for your help, this is so useful! Now I'm able to append them to a list of dataframes

      – Daviesa
      Nov 15 '18 at 16:56













    • I am glad I could help! Best of luck!

      – sophros
      Nov 15 '18 at 17:00



















    • Many thanks, but this returned an error: 'only list-like objects are allowed to be passed to isin(), you passed a [str]'

      – Daviesa
      Nov 15 '18 at 15:42













    • As I wrote the solution was sketchy. With the correction it should no longer throw an error.

      – sophros
      Nov 15 '18 at 16:24











    • Fantastic - that's done the trick! Thanks for your help, this is so useful! Now I'm able to append them to a list of dataframes

      – Daviesa
      Nov 15 '18 at 16:56













    • I am glad I could help! Best of luck!

      – sophros
      Nov 15 '18 at 17:00

















    Many thanks, but this returned an error: 'only list-like objects are allowed to be passed to isin(), you passed a [str]'

    – Daviesa
    Nov 15 '18 at 15:42







    Many thanks, but this returned an error: 'only list-like objects are allowed to be passed to isin(), you passed a [str]'

    – Daviesa
    Nov 15 '18 at 15:42















    As I wrote the solution was sketchy. With the correction it should no longer throw an error.

    – sophros
    Nov 15 '18 at 16:24





    As I wrote the solution was sketchy. With the correction it should no longer throw an error.

    – sophros
    Nov 15 '18 at 16:24













    Fantastic - that's done the trick! Thanks for your help, this is so useful! Now I'm able to append them to a list of dataframes

    – Daviesa
    Nov 15 '18 at 16:56







    Fantastic - that's done the trick! Thanks for your help, this is so useful! Now I'm able to append them to a list of dataframes

    – Daviesa
    Nov 15 '18 at 16:56















    I am glad I could help! Best of luck!

    – sophros
    Nov 15 '18 at 17:00





    I am glad I could help! Best of luck!

    – sophros
    Nov 15 '18 at 17:00













    0














    Just adding the code to append to a list, just for future use. Thanks to sophros for solving:



    broken_down = 
    for _, row in df1.iterrows():
    broken_down.append(df2[df2['region'].isin(row)])





    share|improve this answer




























      0














      Just adding the code to append to a list, just for future use. Thanks to sophros for solving:



      broken_down = 
      for _, row in df1.iterrows():
      broken_down.append(df2[df2['region'].isin(row)])





      share|improve this answer


























        0












        0








        0







        Just adding the code to append to a list, just for future use. Thanks to sophros for solving:



        broken_down = 
        for _, row in df1.iterrows():
        broken_down.append(df2[df2['region'].isin(row)])





        share|improve this answer













        Just adding the code to append to a list, just for future use. Thanks to sophros for solving:



        broken_down = 
        for _, row in df1.iterrows():
        broken_down.append(df2[df2['region'].isin(row)])






        share|improve this answer












        share|improve this answer



        share|improve this answer










        answered Nov 15 '18 at 16:57









        DaviesaDaviesa

        32




        32






























            draft saved

            draft discarded




















































            Thanks for contributing an answer to Stack Overflow!


            • Please be sure to answer the question. Provide details and share your research!

            But avoid



            • Asking for help, clarification, or responding to other answers.

            • Making statements based on opinion; back them up with references or personal experience.


            To learn more, see our tips on writing great answers.




            draft saved


            draft discarded














            StackExchange.ready(
            function () {
            StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53318939%2fcreating-smaller-dataframes-from-a-larger-dataframe-using-multiple-filter-criter%23new-answer', 'question_page');
            }
            );

            Post as a guest















            Required, but never shown





















































            Required, but never shown














            Required, but never shown












            Required, but never shown







            Required, but never shown

































            Required, but never shown














            Required, but never shown












            Required, but never shown







            Required, but never shown







            Popular posts from this blog

            Xamarin.iOS Cant Deploy on Iphone

            Glorious Revolution

            Dulmage-Mendelsohn matrix decomposition in Python