Filter the rows in a list of tuples using numpy












I am looking for a quicker way to filter a list of tuples, using NumPy and avoiding loops.



A = [(27157, 4),
(24814, 0),
(1047, 2),
(18265, 2),
(2857, 4),
(23854, 2),
(36881, 0)]


Now I have to split it based on the second element of each tuple, i.e. whether it is 4.
Tuples whose second element is 4 should form one list, B; all others should form list C.



That is:



B = [(27157, 4),(2857, 4)]
C = [(24814, 0),(1047, 2),(18265, 2),(23854, 2),(36881, 0)]






python arrays numpy indexing






edited Nov 12 at 18:13
jpp

asked Nov 12 at 17:37
Gurpreet.S












  • If it really is a list (not already an array), a list operation probably will be fastest. There's a significant overhead when creating an array from a list.
    – hpaulj
    Nov 12 at 20:04










  • Yes, currently it's a list of tuples that I process with a for loop, but I am looking for a quicker method, so I thought of using numpy.
    – Gurpreet.S
    Nov 12 at 22:00










  • The kind of thing you are trying to do isn't particularly fast, even if you start with an array. But do your own time tests.
    – hpaulj
    Nov 12 at 22:17
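
The trade-off raised in these comments can be checked directly. The following is a minimal sketch, assuming the data really starts out as a Python list; the helper names `split_with_list` and `split_with_numpy` are hypothetical, and the timings it prints will depend on your machine and on the size of the list:

```python
import timeit

import numpy as np

A = [(27157, 4), (24814, 0), (1047, 2), (18265, 2),
     (2857, 4), (23854, 2), (36881, 0)]


def split_with_list(rows, key=4):
    """Split rows with plain list comprehensions (no array conversion)."""
    matching = [row for row in rows if row[1] == key]
    rest = [row for row in rows if row[1] != key]
    return matching, rest


def split_with_numpy(rows, key=4):
    """Convert the list to an array first, then split with a Boolean mask."""
    arr = np.array(rows)  # list-to-array conversion: the overhead hpaulj's comment warns about
    mask = arr[:, 1] == key
    return arr[mask], arr[~mask]


B, C = split_with_list(A)
print(B)
print(C)

# Time both approaches on a larger list; the numbers vary by machine.
big = A * 1_000
print("list :", timeit.timeit(lambda: split_with_list(big), number=10))
print("numpy:", timeit.timeit(lambda: split_with_numpy(big), number=10))
```

The list version returns lists of tuples, as the question asks; the NumPy version returns two-dimensional arrays, so a further conversion would be needed to get tuples back.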


















2 Answers
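With NumPy, you can use Boolean indexing to split the rows into two arrays:

mask = A[:, 1] == 4  # True where the second column equals 4
B = A[mask]          # rows whose second element is 4
C = A[~mask]         # all other rows

This requires your input to be a two-dimensional NumPy array rather than a list of tuples:

import numpy as np

A = np.array([(27157, 4),
              (24814, 0),
              (1047, 2),
              (18265, 2),
              (2857, 4),
              (23854, 2),
              (36881, 0)])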
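
Putting the two snippets in execution order gives a self-contained script. This is a sketch rather than part of the original answer; the conversion back to lists of tuples at the end (`B_list`, `C_list`) is an added assumption about what the asker ultimately needs:

```python
import numpy as np

# Build the 2-D array first, so that A[:, 1] selects the second column.
A = np.array([(27157, 4),
              (24814, 0),
              (1047, 2),
              (18265, 2),
              (2857, 4),
              (23854, 2),
              (36881, 0)])

mask = A[:, 1] == 4  # Boolean array, one entry per row
B = A[mask]          # rows where the second element is 4
C = A[~mask]         # the remaining rows

# Assumption: the caller wants lists of tuples back, as in the question.
B_list = [tuple(row) for row in B.tolist()]
C_list = [tuple(row) for row in C.tolist()]
print(B_list)
print(C_list)
```

Note that `B` and `C` are arrays, not lists; the last step is only needed if downstream code expects tuples.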






answered Nov 12 at 17:38
jpp

























To be fast, you must turn your list of tuples into a more efficient data structure. If you want to keep tuple-like records, you can use a structured array:

import numpy as np

dt = np.dtype([('val', int), ('key', int)])  # one named field per tuple position
S = np.array(A, dtype=dt)                    # A is the list of tuples from the question

B = S[S['key'] == 4]  #--> array([(27157, 4), ( 2857, 4)],...
C = S[S['key'] != 4]  #--> array([(24814, 0), ( 1047, 2), (18265, 2), (23854, 2), (36881, 0)],...
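
A self-contained variant of the structured-array idea might look like the following. It is a sketch under assumptions: it builds the array by passing the list of tuples and the dtype straight to `np.array`, it fixes both fields to `np.int64`, and it uses `.tolist()` to turn the filtered records back into plain tuples. The field names `val` and `key` follow the answer.

```python
import numpy as np

A = [(27157, 4), (24814, 0), (1047, 2), (18265, 2),
     (2857, 4), (23854, 2), (36881, 0)]

# One record per tuple: first element -> 'val', second element -> 'key'.
dt = np.dtype([("val", np.int64), ("key", np.int64)])
S = np.array(A, dtype=dt)

mask = S["key"] == 4  # Boolean mask computed on the 'key' field only

# .tolist() on a structured array yields a list of plain Python tuples.
B = S[mask].tolist()
C = S[~mask].tolist()
print(B)
print(C)
```

Computing the mask once and negating it avoids evaluating the comparison twice.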






answered Nov 12 at 18:11
B. M.





























