How to find the pairwise differences between rows of two very large matrices using numpy?












Given two matrices, I want to compute the pairwise differences between all rows. Each matrix has 1000 rows and 100 columns, so they are fairly large. I tried using a for loop and pure broadcasting, but the for loop seems to be working faster. Am I doing something wrong? Here is the code:



import time
from numpy import *

A = random.randn(1000,100)
B = random.randn(1000,100)

# row-by-row loop
start = time.time()
for a in A:
    sum((a - B)**2, 1)
print time.time() - start

# pure broadcasting
start = time.time()
((A[:,newaxis,:] - B)**2).sum(-1)
print time.time() - start


The broadcasting method takes about 1 second longer, and the gap grows for larger matrices. Any idea how to speed this up using only numpy?
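
As an aside (not part of the original question): a likely explanation for the gap, assuming the default float64 dtype, is that the broadcasting version has to materialize the whole (1000, 1000, 100) array of differences before squaring and summing, while the loop only ever holds a (1000, 100) temporary. A rough back-of-the-envelope estimate of that intermediate:

# Rough size of the temporary created by A[:,newaxis,:] - B
# (assumes float64, i.e. 8 bytes per element)
rows_a, rows_b, cols = 1000, 1000, 100
print rows_a * rows_b * cols * 8 / 1e9, "GB"   # prints ~0.8 GB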










      python numpy matrix numpy-broadcasting






      asked Mar 24 '17 at 4:18 by kchow462
























          3 Answers






































          Here's another way to perform it, using the identity




          (a-b)^2 = a^2 + b^2 - 2ab




          with np.einsum for the first two terms and a dot product for the third one -



          import numpy as np

          np.einsum('ij,ij->i',A,A)[:,None] + np.einsum('ij,ij->i',B,B) - 2*np.dot(A,B.T)
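
          As a quick sanity check (an editorial addition, not part of the original answer), the expanded form can be compared against the direct broadcasting computation; the two agree up to floating-point round-off:

          # Verify the identity-based result against the direct computation
          direct   = ((A[:,None,:] - B)**2).sum(-1)
          expanded = np.einsum('ij,ij->i',A,A)[:,None] + np.einsum('ij,ij->i',B,B) - 2*np.dot(A,B.T)
          print np.allclose(direct, expanded)   # True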


          Runtime test



          Approaches -



          def loopy_app(A,B):
              m,n = A.shape[0], B.shape[0]
              out = np.empty((m,n))
              for i,a in enumerate(A):
                  out[i] = np.sum((a - B)**2,1)
              return out

          def broadcasting_app(A,B):
              return ((A[:,np.newaxis,:] - B)**2).sum(-1)

          # @Paul Panzer's soln
          def outer_sum_dot_app(A,B):
              return np.add.outer((A*A).sum(axis=-1), (B*B).sum(axis=-1)) - 2*np.dot(A,B.T)

          # @Daniel Forsman's soln
          def einsum_all_app(A,B):
              return np.einsum('ijk,ijk->ij', A[:,None,:] - B[None,:,:],
                               A[:,None,:] - B[None,:,:])

          # Proposed in this post
          def outer_einsum_dot_app(A,B):
              return (np.einsum('ij,ij->i',A,A)[:,None] + np.einsum('ij,ij->i',B,B) -
                      2*np.dot(A,B.T))


          Timings -



          In [51]: A = np.random.randn(1000,100)
          ...: B = np.random.randn(1000,100)
          ...:

          In [52]: %timeit loopy_app(A,B)
          ...: %timeit broadcasting_app(A,B)
          ...: %timeit outer_sum_dot_app(A,B)
          ...: %timeit einsum_all_app(A,B)
          ...: %timeit outer_einsum_dot_app(A,B)
          ...:
          10 loops, best of 3: 136 ms per loop
          1 loops, best of 3: 302 ms per loop
          100 loops, best of 3: 8.51 ms per loop
          1 loops, best of 3: 341 ms per loop
          100 loops, best of 3: 8.38 ms per loop
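
          One caveat, noted here as an editorial aside rather than part of the answer: because the expansion subtracts large, nearly equal quantities, it can lose a little precision and may even yield tiny negative values where the true squared distance is essentially zero. If non-negativity matters downstream, the result can be clamped:

          # Optional safeguard (assumes outer_einsum_dot_app as defined above)
          dists = outer_einsum_dot_app(A, B)
          dists = np.maximum(dists, 0)   # clip tiny negative round-off artifacts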





          answered Mar 24 '17 at 8:27 (edited Mar 24 '17 at 8:41) by Divakar


























          • Ha! I beat Divakar to the einsum answer! Of course mine seems to be the wrong way to use it if you want a faster solution...

            – Daniel F
            Mar 24 '17 at 8:32













          • @DanielForsman You just need more practice, you will get there eventually! :)

            – Divakar
            Mar 24 '17 at 8:33











          • Considering how much of my current pile of code relies on computing Cartesian distances quickly, that algebra trick is useful enough that I don't much mind :)

            – Daniel F
            Mar 24 '17 at 8:36

































          Here is a solution which avoids both the loop and the large intermediates:



          from numpy import *
          import time

          A = random.randn(1000,100)
          B = random.randn(1000,100)

          start = time.time()
          for a in A:
              sum((a - B)**2,1)
          print time.time() - start

          # pure broadcasting
          start = time.time()
          ((A[:,newaxis,:] - B)**2).sum(-1)
          print time.time() - start

          #matmul
          start = time.time()
          add.outer((A*A).sum(axis=-1), (B*B).sum(axis=-1)) - 2*dot(A,B.T)
          print time.time() - start


          Prints:



          0.546781778336
          0.674743175507
          0.10723400116
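
          The key step here is add.outer, which broadcasts the two vectors of squared row norms, each of shape (1000,), into a (1000, 1000) grid of pairwise sums; dot(A, B.T) then supplies the cross terms. A tiny illustrative sketch (made-up numbers, not from the answer) of what add.outer does:

          # Illustrative only: add.outer pairs every element of a2 with every element of b2
          a2 = array([1., 2., 3.])       # stands in for (A*A).sum(axis=-1)
          b2 = array([10., 20.])         # stands in for (B*B).sum(axis=-1)
          print add.outer(a2, b2)        # shape (3, 2), entry [i, j] = a2[i] + b2[j]
          # [[ 11.  21.]
          #  [ 12.  22.]
          #  [ 13.  23.]]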





          answered Mar 24 '17 at 4:44 by Paul Panzer













































            Another job for np.einsum:



            np.einsum('ijk,ijk->ij', A[:,None,:] - B[None,:,:], A[:,None,:] - B[None,:,:])
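
            For readers less familiar with einsum (this gloss is an editorial addition, not part of the answer): the subscripts 'ijk,ijk->ij' multiply the two difference arrays elementwise and sum over the last axis k, so the call is equivalent to squaring the differences and summing across columns. It still builds the full (1000, 1000, 100) array of differences (twice, in fact, since the difference expression appears as both operands), which is why its timing in the first answer's benchmark (341 ms) lands close to the pure broadcasting approach (302 ms).

            # Equivalent spelling of the same computation (assumes numpy imported as np)
            diff = A[:,None,:] - B[None,:,:]
            out  = (diff**2).sum(-1)      # matches np.einsum('ijk,ijk->ij', diff, diff)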





            answered Mar 24 '17 at 8:25 by Daniel F