Numpy Aggregate Rows and Sum











I have a NumPy matrix:

    M = np.array([[55, 5],
                  [56, 3],
                  [57, 7],
                  [58, 9],
                  [59, 3],
                  [60, 8],
                  [61, 1]])

I want to aggregate its rows into a given number of groups (for example, 3), so that each group holds group_size rows:

    groups = 3
    group_size = math.ceil(len(M) / groups)  # math.ceil(7 / 3) = 3

Each aggregated row takes its left value from the first row of the group, and its right value is the sum of all right values in the group.

Expected output:

    R = [[55, 15],  # 55: first left value of the first group; 15: sum of its right values
         [58, 20],  # 58: first left value of the second group; 20: sum of its right values
         [61, 1]]   # the third group consists of only one row (the remainder)

Is there an efficient way to do this with NumPy without looping?
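The group-size arithmetic can be checked on its own. This is a minimal sketch in plain Python; n_rows and slices are illustrative names that do not appear in the question.

```python
import math

n_rows = 7   # number of rows in M
groups = 3   # requested number of groups

# Rows per group, rounded up so that no row is left out.
group_size = math.ceil(n_rows / groups)

# Half-open (start, stop) row ranges for each group.
slices = [(start, min(start + group_size, n_rows))
          for start in range(0, n_rows, group_size)]

print(group_size)
print(slices)
```

The last range is shorter than the others, which matches the one-row remainder group in the expected output.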










      python pandas numpy






asked 2 days ago by Franc Weser
























          4 Answers
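Accepted answer:

Here's one way with NumPy, assuming M is a NumPy array and NumPy is imported as np:

    n = 3  # group size
    x = M[::n, 0]  # first left value of each group
    y = np.add.reduceat(M[:, 1], np.arange(0, M.shape[0], n))  # sum of right values per group

    R = np.vstack((x, y)).T

    print(R)

    [[55 15]
     [58 20]
     [61  1]]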
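The same idea can be wrapped in a function that starts from the requested number of groups, as in the question. This is only a sketch under assumptions: aggregate_rows is a hypothetical helper name, and the input is assumed to be a non-empty two-column array. np.add.reduceat sums each slice between consecutive start indices, and the final slice runs to the end of the array, which is how the remainder group is handled.

```python
import math

import numpy as np


def aggregate_rows(M, groups):
    """Hypothetical helper: first left value and summed right values per group."""
    M = np.asarray(M)
    group_size = math.ceil(M.shape[0] / groups)
    # Row index at which each group starts: 0, group_size, 2 * group_size, ...
    starts = np.arange(0, M.shape[0], group_size)
    first_left = M[starts, 0]
    # reduceat sums M[starts[i]:starts[i + 1], 1]; the last slice runs to the end.
    right_sums = np.add.reduceat(M[:, 1], starts)
    return np.column_stack((first_left, right_sums))


M = np.array([[55, 5], [56, 3], [57, 7], [58, 9], [59, 3], [60, 8], [61, 1]])
R = aggregate_rows(M, groups=3)
print(R)
```

Because the group size is rounded up, the result can have fewer rows than the requested number of groups. For example, 7 rows with groups=5 gives a group size of 2 and therefore 4 output rows.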





answered 2 days ago by jpp
























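A pandas solution is to group the rows by position and use agg with first and sum:

    import numpy as np
    import pandas as pd

    group_size = 3
    # Integer division of the row positions labels rows 0-2 as group 0, rows 3-5 as group 1, and so on.
    df = pd.DataFrame(M).groupby(np.arange(len(M)) // group_size).agg({0: 'first', 1: 'sum'})
    print(df)

        0   1
    0  55  15
    1  58  20
    2  61   1

    a = df.to_numpy()
    print(a)

    [[55 15]
     [58 20]
     [61  1]]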
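The grouping key used above, np.arange(len(M)) // group_size, can be inspected without pandas. The sketch below is a NumPy-only illustration of the same label-based grouping and is not part of the answer: it swaps in np.bincount for the per-label sums and np.unique for the first row of each label. The names labels, first_rows and right_sums are illustrative.

```python
import numpy as np

M = np.array([[55, 5], [56, 3], [57, 7], [58, 9], [59, 3], [60, 8], [61, 1]])
group_size = 3

# One integer label per row: rows 0-2 get 0, rows 3-5 get 1, row 6 gets 2.
labels = np.arange(len(M)) // group_size

# Sum the right column per label (bincount with weights returns floats).
right_sums = np.bincount(labels, weights=M[:, 1])

# Index of the first row carrying each label.
first_rows = np.unique(labels, return_index=True)[1]

R = np.column_stack((M[first_rows, 0], right_sums.astype(M.dtype)))
print(labels)
print(R)
```

Unlike a fixed stride, a label array also covers groups of unequal size, provided the labels are non-negative integers.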






answered 2 days ago by jezrael






















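A solution using plain Python, without NumPy:

    from operator import itemgetter

    M = [[55, 5],
         [56, 3],
         [57, 7],
         [58, 9],
         [59, 3],
         [60, 8],
         [61, 1]]
    # Lazily yield consecutive chunks of up to 3 rows.
    it = (M[e:e+3] for e in range(0, len(M), 3))
    # For each chunk: first left value, then the sum of the right values.
    print([[e[0][0], sum(map(itemgetter(1), e))] for e in it])

Output:

    [[55, 15], [58, 20], [61, 1]]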
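For readers who prefer an explicit loop, the same chunking can be written as a small function. This is a sketch rather than the answer's own code; aggregate_rows and its group_size parameter are illustrative names.

```python
def aggregate_rows(rows, group_size):
    """Hypothetical helper: chunk rows and aggregate each chunk."""
    result = []
    for start in range(0, len(rows), group_size):
        # Slicing past the end is safe, so the last chunk may be shorter.
        chunk = rows[start:start + group_size]
        first_left = chunk[0][0]
        right_sum = sum(right for _, right in chunk)
        result.append([first_left, right_sum])
    return result


M = [[55, 5], [56, 3], [57, 7], [58, 9], [59, 3], [60, 8], [61, 1]]
R = aggregate_rows(M, group_size=3)
print(R)
```

This version still loops in Python, so it does not meet the question's request for a loop-free NumPy approach, but it works on plain lists.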






answered 2 days ago by Daniel Mesejo






















    import numpy as np

    a = np.array([[2, 3], [5, 6], [7, 9]])
    b = np.zeros(a.shape[1])  # one accumulator per column
    for row in a:
        b = b + row  # add each row to the running total
    print(b)  # [14. 18.]
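The loop above accumulates every row of a into a single vector of column totals. As a hedged aside, the built-in reduction a.sum(axis=0) appears to give the same totals without a Python loop; col_sums is an illustrative name.

```python
import numpy as np

a = np.array([[2, 3], [5, 6], [7, 9]])

# Explicit accumulation, one row at a time (float accumulator).
b = np.zeros(a.shape[1])
for row in a:
    b = b + row

# Vectorized equivalent: sum along axis 0 (down the rows), keeping the integer dtype.
col_sums = a.sum(axis=0)

print(b)
print(col_sums)
```

Note that this totals all rows at once and does not split them into groups, so on its own it does not produce the per-group output the question asks for.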






answered 2 days ago by Mohammad reza Kashi








Comment: An explanation of what the code does and how it addresses the problem in the question rarely fails to improve an answer.
– blue-phoenox, 2 days ago













