in R, perform a function over rows in a data frame and join the result together in a new date frame












1















I feel I'm in the realm of the right answer but for the life of me I can't figure out how to do this. I want to take an entry on a product ledger and change the shape of the data where I have a row/case for every life month of a product transaction.



As a note, sometimes these ledger entries are reversed/refunded (and have date_from > date_thru). I have an example of a new purchase and a refund in the data frame below. Also, transaction dates are pre-cleaned to always be the first of the month in which they were made, since I only care about monthly recurring revenue (mrr) on a month-by-month basis.



Sample df:



user <- c("1001", "1002")
line_item <- c("abc123", "def456")
date_from <- as.Date(c("2015-01-01", "2015-06-01"), "%Y-%m-%d")
date_thru <- as.Date(c("2015-04-01", "2014-12-01"), "%Y-%m-%d")
mrr <- c(2.22, -4.44)
df <- cbind.data.frame(user, line_item, date_from, date_thru, mrr)


Output:



  user line_item  date_from  date_thru   mrr
1 1001 abc123 2015-01-01 2015-04-01 2.22
2 1002 def456 2015-06-01 2014-12-01 -4.44


Desired result:



user    line_item       month           mrr
1001 abc123 2015-01-01 2.22
1001 abc123 2015-02-01 2.22
1001 abc123 2015-03-01 2.22
1001 abc123 2015-04-01 2.22
1002 def456 2015-06-01 -4.44
1002 def456 2015-05-01 -4.44
1002 def456 2015-04-01 -4.44
1002 def456 2015-02-01 -4.44
1002 def456 2015-01-01 -4.44
1002 def456 2014-12-01 -4.44


How can I create a new month column by vectorizing a function like seq(date_from, date_thru, by="months"), and then joining all the resulting vectors into one df again like above?



I've been trying lapply, dplyr::mutate, and seq and can't get them to work correctly together.



Thanks in advance!










share|improve this question























  • @Shree yes that is correct, it is an example of a reversed line_item. A reversed lineitem has a negative value and will have a date_thru that is before a date_from (sort of like a negative date range).

    – Adam Ellsworth
    Nov 14 '18 at 2:20











  • I think you are missing date 2015-03-01 for user 1002 in your desired output

    – Shree
    Nov 14 '18 at 2:58











  • You should redefine the seq.Date fucntion so it accepts positive and negative sequences gracefully.

    – 42-
    Nov 14 '18 at 3:31
















1















I feel I'm in the realm of the right answer but for the life of me I can't figure out how to do this. I want to take an entry on a product ledger and change the shape of the data where I have a row/case for every life month of a product transaction.



As a note, sometimes these ledger entries are reversed/refunded (and have date_from > date_thru). I have an example of a new purchase and a refund in the data frame below. Also, transaction dates are pre-cleaned to always be the first of the month in which they were made, since I only care about monthly recurring revenue (mrr) on a month-by-month basis.



Sample df:



user <- c("1001", "1002")
line_item <- c("abc123", "def456")
date_from <- as.Date(c("2015-01-01", "2015-06-01"), "%Y-%m-%d")
date_thru <- as.Date(c("2015-04-01", "2014-12-01"), "%Y-%m-%d")
mrr <- c(2.22, -4.44)
df <- cbind.data.frame(user, line_item, date_from, date_thru, mrr)


Output:



  user line_item  date_from  date_thru   mrr
1 1001 abc123 2015-01-01 2015-04-01 2.22
2 1002 def456 2015-06-01 2014-12-01 -4.44


Desired result:



user    line_item       month           mrr
1001 abc123 2015-01-01 2.22
1001 abc123 2015-02-01 2.22
1001 abc123 2015-03-01 2.22
1001 abc123 2015-04-01 2.22
1002 def456 2015-06-01 -4.44
1002 def456 2015-05-01 -4.44
1002 def456 2015-04-01 -4.44
1002 def456 2015-02-01 -4.44
1002 def456 2015-01-01 -4.44
1002 def456 2014-12-01 -4.44


How can I create a new month column by vectorizing a function like seq(date_from, date_thru, by="months"), and then joining all the resulting vectors into one df again like above?



I've been trying lapply, dplyr::mutate, and seq and can't get them to work correctly together.



Thanks in advance!










share|improve this question























  • @Shree yes that is correct, it is an example of a reversed line_item. A reversed lineitem has a negative value and will have a date_thru that is before a date_from (sort of like a negative date range).

    – Adam Ellsworth
    Nov 14 '18 at 2:20











  • I think you are missing date 2015-03-01 for user 1002 in your desired output

    – Shree
    Nov 14 '18 at 2:58











  • You should redefine the seq.Date fucntion so it accepts positive and negative sequences gracefully.

    – 42-
    Nov 14 '18 at 3:31














1












1








1








I feel I'm in the realm of the right answer but for the life of me I can't figure out how to do this. I want to take an entry on a product ledger and change the shape of the data where I have a row/case for every life month of a product transaction.



As a note, sometimes these ledger entries are reversed/refunded (and have date_from > date_thru). I have an example of a new purchase and a refund in the data frame below. Also, transaction dates are pre-cleaned to always be the first of the month in which they were made, since I only care about monthly recurring revenue (mrr) on a month-by-month basis.



Sample df:



user <- c("1001", "1002")
line_item <- c("abc123", "def456")
date_from <- as.Date(c("2015-01-01", "2015-06-01"), "%Y-%m-%d")
date_thru <- as.Date(c("2015-04-01", "2014-12-01"), "%Y-%m-%d")
mrr <- c(2.22, -4.44)
df <- cbind.data.frame(user, line_item, date_from, date_thru, mrr)


Output:



  user line_item  date_from  date_thru   mrr
1 1001 abc123 2015-01-01 2015-04-01 2.22
2 1002 def456 2015-06-01 2014-12-01 -4.44


Desired result:



user    line_item       month           mrr
1001 abc123 2015-01-01 2.22
1001 abc123 2015-02-01 2.22
1001 abc123 2015-03-01 2.22
1001 abc123 2015-04-01 2.22
1002 def456 2015-06-01 -4.44
1002 def456 2015-05-01 -4.44
1002 def456 2015-04-01 -4.44
1002 def456 2015-02-01 -4.44
1002 def456 2015-01-01 -4.44
1002 def456 2014-12-01 -4.44


How can I create a new month column by vectorizing a function like seq(date_from, date_thru, by="months"), and then joining all the resulting vectors into one df again like above?



I've been trying lapply, dplyr::mutate, and seq and can't get them to work correctly together.



Thanks in advance!










share|improve this question














I feel I'm in the realm of the right answer but for the life of me I can't figure out how to do this. I want to take an entry on a product ledger and change the shape of the data where I have a row/case for every life month of a product transaction.



As a note, sometimes these ledger entries are reversed/refunded (and have date_from > date_thru). I have an example of a new purchase and a refund in the data frame below. Also, transaction dates are pre-cleaned to always be the first of the month in which they were made, since I only care about monthly recurring revenue (mrr) on a month-by-month basis.



Sample df:



user <- c("1001", "1002")
line_item <- c("abc123", "def456")
date_from <- as.Date(c("2015-01-01", "2015-06-01"), "%Y-%m-%d")
date_thru <- as.Date(c("2015-04-01", "2014-12-01"), "%Y-%m-%d")
mrr <- c(2.22, -4.44)
df <- cbind.data.frame(user, line_item, date_from, date_thru, mrr)


Output:



  user line_item  date_from  date_thru   mrr
1 1001 abc123 2015-01-01 2015-04-01 2.22
2 1002 def456 2015-06-01 2014-12-01 -4.44


Desired result:



user    line_item       month           mrr
1001 abc123 2015-01-01 2.22
1001 abc123 2015-02-01 2.22
1001 abc123 2015-03-01 2.22
1001 abc123 2015-04-01 2.22
1002 def456 2015-06-01 -4.44
1002 def456 2015-05-01 -4.44
1002 def456 2015-04-01 -4.44
1002 def456 2015-02-01 -4.44
1002 def456 2015-01-01 -4.44
1002 def456 2014-12-01 -4.44


How can I create a new month column by vectorizing a function like seq(date_from, date_thru, by="months"), and then joining all the resulting vectors into one df again like above?



I've been trying lapply, dplyr::mutate, and seq and can't get them to work correctly together.



Thanks in advance!







r dplyr






share|improve this question













share|improve this question











share|improve this question




share|improve this question










asked Nov 14 '18 at 1:48









Adam EllsworthAdam Ellsworth

84




84













  • @Shree yes that is correct, it is an example of a reversed line_item. A reversed lineitem has a negative value and will have a date_thru that is before a date_from (sort of like a negative date range).

    – Adam Ellsworth
    Nov 14 '18 at 2:20











  • I think you are missing date 2015-03-01 for user 1002 in your desired output

    – Shree
    Nov 14 '18 at 2:58











  • You should redefine the seq.Date fucntion so it accepts positive and negative sequences gracefully.

    – 42-
    Nov 14 '18 at 3:31



















  • @Shree yes that is correct, it is an example of a reversed line_item. A reversed lineitem has a negative value and will have a date_thru that is before a date_from (sort of like a negative date range).

    – Adam Ellsworth
    Nov 14 '18 at 2:20











  • I think you are missing date 2015-03-01 for user 1002 in your desired output

    – Shree
    Nov 14 '18 at 2:58











  • You should redefine the seq.Date fucntion so it accepts positive and negative sequences gracefully.

    – 42-
    Nov 14 '18 at 3:31

















@Shree yes that is correct, it is an example of a reversed line_item. A reversed lineitem has a negative value and will have a date_thru that is before a date_from (sort of like a negative date range).

– Adam Ellsworth
Nov 14 '18 at 2:20





@Shree yes that is correct, it is an example of a reversed line_item. A reversed lineitem has a negative value and will have a date_thru that is before a date_from (sort of like a negative date range).

– Adam Ellsworth
Nov 14 '18 at 2:20













I think you are missing date 2015-03-01 for user 1002 in your desired output

– Shree
Nov 14 '18 at 2:58





I think you are missing date 2015-03-01 for user 1002 in your desired output

– Shree
Nov 14 '18 at 2:58













You should redefine the seq.Date fucntion so it accepts positive and negative sequences gracefully.

– 42-
Nov 14 '18 at 3:31





You should redefine the seq.Date fucntion so it accepts positive and negative sequences gracefully.

– 42-
Nov 14 '18 at 3:31












3 Answers
3






active

oldest

votes


















0














Just to uncover the special powers of tidyr with the complete and fill functions



library(tidyverse)

df %>%
# turn into a long format to handle the dates as one variable
gather(key, date, date_from:date_thru) %>%
# group the dataframe to apply operations separately on each user
group_by(user) %>%
# now complete the dataset with the implicit dates within the range of dates
# Note the consideration of: date_from > date_to with 'min' and 'max'
complete(date = seq.Date(from = min(date),to = max(date),by = "month") ) %>%
#now fill in the missing 'mrr'
fill(mrr, line_item)


# A tibble: 11 x 5
# Groups: user [2]
user date line_item mrr key
<fct> <date> <fct> <dbl> <chr>
1 1001 2015-01-01 abc123 2.22 date_from
2 1001 2015-02-01 abc123 2.22 <NA>
3 1001 2015-03-01 abc123 2.22 <NA>
4 1001 2015-04-01 abc123 2.22 date_thru
5 1002 2014-12-01 def456 -4.44 date_thru
6 1002 2015-01-01 def456 -4.44 <NA>
7 1002 2015-02-01 def456 -4.44 <NA>
8 1002 2015-03-01 def456 -4.44 <NA>
9 1002 2015-04-01 def456 -4.44 <NA>
10 1002 2015-05-01 def456 -4.44 <NA>
11 1002 2015-06-01 def456 -4.44 date_from


If you want to remove the keyvariable add %>% select(-key) following after the end of the fill although you might want to keep it for some reason...



P.S. I really liked the clever trick in the function creation by @G. Grothendieck with the sign to handle date_from > date_to






share|improve this answer































    2














    First define a month_seq function which produces the desired date sequence given the from and to dates. Then Map it to the date_from and date_thru columns producing a column of type list whose components are the date sequences. Then unnest the month and select the desired columns.



    library(dplyr)
    library(tidyr)

    month_seq <- function(from, to) seq(from, to, paste(sign(to - from), "month"))

    df %>%
    mutate(month = Map(month_seq, date_from, date_thru)) %>%
    unnest %>%
    select(user, line_item, month, mrr)


    giving:



       user line_item      month   mrr
    1 1001 abc123 2015-01-01 2.22
    2 1001 abc123 2015-02-01 2.22
    3 1001 abc123 2015-03-01 2.22
    4 1001 abc123 2015-04-01 2.22
    5 1002 def456 2015-06-01 -4.44
    6 1002 def456 2015-05-01 -4.44
    7 1002 def456 2015-04-01 -4.44
    8 1002 def456 2015-03-01 -4.44
    9 1002 def456 2015-02-01 -4.44
    10 1002 def456 2015-01-01 -4.44
    11 1002 def456 2014-12-01 -4.44





    share|improve this answer

































      0














      Here's a solution using dplyr, tidyr, and lubridate. I found the elapsed_months function on this SO post Number of months between two dates. I have modified it a bit for your case.



      Also, I am assuming negative mrr indicates that date_from > date_thru.



      elapsed_months <- function(end_date, start_date) {
      ed <- as.POSIXlt(end_date)
      sd <- as.POSIXlt(start_date)
      12 * (ed$year - sd$year) + (ed$mon - sd$mon) + ifelse(ed >= sd, 1, -1)
      }

      df %>%
      uncount(weights = abs(elapsed_months(date_thru, date_from)), .id = "Months") %>%
      mutate(
      Month = date_from %m+% months(sign(mrr)*(Months - 1))
      ) %>%
      select(user, line_item, Month, mrr)

      user line_item Month mrr
      1 1001 abc123 2015-01-01 2.22
      2 1001 abc123 2015-02-01 2.22
      3 1001 abc123 2015-03-01 2.22
      4 1001 abc123 2015-04-01 2.22
      5 1002 def456 2015-06-01 -4.44
      6 1002 def456 2015-05-01 -4.44
      7 1002 def456 2015-04-01 -4.44
      8 1002 def456 2015-03-01 -4.44
      9 1002 def456 2015-02-01 -4.44
      10 1002 def456 2015-01-01 -4.44
      11 1002 def456 2014-12-01 -4.44





      share|improve this answer

























        Your Answer






        StackExchange.ifUsing("editor", function () {
        StackExchange.using("externalEditor", function () {
        StackExchange.using("snippets", function () {
        StackExchange.snippets.init();
        });
        });
        }, "code-snippets");

        StackExchange.ready(function() {
        var channelOptions = {
        tags: "".split(" "),
        id: "1"
        };
        initTagRenderer("".split(" "), "".split(" "), channelOptions);

        StackExchange.using("externalEditor", function() {
        // Have to fire editor after snippets, if snippets enabled
        if (StackExchange.settings.snippets.snippetsEnabled) {
        StackExchange.using("snippets", function() {
        createEditor();
        });
        }
        else {
        createEditor();
        }
        });

        function createEditor() {
        StackExchange.prepareEditor({
        heartbeatType: 'answer',
        autoActivateHeartbeat: false,
        convertImagesToLinks: true,
        noModals: true,
        showLowRepImageUploadWarning: true,
        reputationToPostImages: 10,
        bindNavPrevention: true,
        postfix: "",
        imageUploader: {
        brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
        contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
        allowUrls: true
        },
        onDemand: true,
        discardSelector: ".discard-answer"
        ,immediatelyShowMarkdownHelp:true
        });


        }
        });














        draft saved

        draft discarded


















        StackExchange.ready(
        function () {
        StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53292044%2fin-r-perform-a-function-over-rows-in-a-data-frame-and-join-the-result-together%23new-answer', 'question_page');
        }
        );

        Post as a guest















        Required, but never shown

























        3 Answers
        3






        active

        oldest

        votes








        3 Answers
        3






        active

        oldest

        votes









        active

        oldest

        votes






        active

        oldest

        votes









        0














        Just to uncover the special powers of tidyr with the complete and fill functions



        library(tidyverse)

        df %>%
        # turn into a long format to handle the dates as one variable
        gather(key, date, date_from:date_thru) %>%
        # group the dataframe to apply operations separately on each user
        group_by(user) %>%
        # now complete the dataset with the implicit dates within the range of dates
        # Note the consideration of: date_from > date_to with 'min' and 'max'
        complete(date = seq.Date(from = min(date),to = max(date),by = "month") ) %>%
        #now fill in the missing 'mrr'
        fill(mrr, line_item)


        # A tibble: 11 x 5
        # Groups: user [2]
        user date line_item mrr key
        <fct> <date> <fct> <dbl> <chr>
        1 1001 2015-01-01 abc123 2.22 date_from
        2 1001 2015-02-01 abc123 2.22 <NA>
        3 1001 2015-03-01 abc123 2.22 <NA>
        4 1001 2015-04-01 abc123 2.22 date_thru
        5 1002 2014-12-01 def456 -4.44 date_thru
        6 1002 2015-01-01 def456 -4.44 <NA>
        7 1002 2015-02-01 def456 -4.44 <NA>
        8 1002 2015-03-01 def456 -4.44 <NA>
        9 1002 2015-04-01 def456 -4.44 <NA>
        10 1002 2015-05-01 def456 -4.44 <NA>
        11 1002 2015-06-01 def456 -4.44 date_from


        If you want to remove the keyvariable add %>% select(-key) following after the end of the fill although you might want to keep it for some reason...



        P.S. I really liked the clever trick in the function creation by @G. Grothendieck with the sign to handle date_from > date_to






        share|improve this answer




























          0














          Just to uncover the special powers of tidyr with the complete and fill functions



          library(tidyverse)

          df %>%
          # turn into a long format to handle the dates as one variable
          gather(key, date, date_from:date_thru) %>%
          # group the dataframe to apply operations separately on each user
          group_by(user) %>%
          # now complete the dataset with the implicit dates within the range of dates
          # Note the consideration of: date_from > date_to with 'min' and 'max'
          complete(date = seq.Date(from = min(date),to = max(date),by = "month") ) %>%
          #now fill in the missing 'mrr'
          fill(mrr, line_item)


          # A tibble: 11 x 5
          # Groups: user [2]
          user date line_item mrr key
          <fct> <date> <fct> <dbl> <chr>
          1 1001 2015-01-01 abc123 2.22 date_from
          2 1001 2015-02-01 abc123 2.22 <NA>
          3 1001 2015-03-01 abc123 2.22 <NA>
          4 1001 2015-04-01 abc123 2.22 date_thru
          5 1002 2014-12-01 def456 -4.44 date_thru
          6 1002 2015-01-01 def456 -4.44 <NA>
          7 1002 2015-02-01 def456 -4.44 <NA>
          8 1002 2015-03-01 def456 -4.44 <NA>
          9 1002 2015-04-01 def456 -4.44 <NA>
          10 1002 2015-05-01 def456 -4.44 <NA>
          11 1002 2015-06-01 def456 -4.44 date_from


          If you want to remove the keyvariable add %>% select(-key) following after the end of the fill although you might want to keep it for some reason...



          P.S. I really liked the clever trick in the function creation by @G. Grothendieck with the sign to handle date_from > date_to






          share|improve this answer


























            0












            0








            0







            Just to uncover the special powers of tidyr with the complete and fill functions



            library(tidyverse)

            df %>%
            # turn into a long format to handle the dates as one variable
            gather(key, date, date_from:date_thru) %>%
            # group the dataframe to apply operations separately on each user
            group_by(user) %>%
            # now complete the dataset with the implicit dates within the range of dates
            # Note the consideration of: date_from > date_to with 'min' and 'max'
            complete(date = seq.Date(from = min(date),to = max(date),by = "month") ) %>%
            #now fill in the missing 'mrr'
            fill(mrr, line_item)


            # A tibble: 11 x 5
            # Groups: user [2]
            user date line_item mrr key
            <fct> <date> <fct> <dbl> <chr>
            1 1001 2015-01-01 abc123 2.22 date_from
            2 1001 2015-02-01 abc123 2.22 <NA>
            3 1001 2015-03-01 abc123 2.22 <NA>
            4 1001 2015-04-01 abc123 2.22 date_thru
            5 1002 2014-12-01 def456 -4.44 date_thru
            6 1002 2015-01-01 def456 -4.44 <NA>
            7 1002 2015-02-01 def456 -4.44 <NA>
            8 1002 2015-03-01 def456 -4.44 <NA>
            9 1002 2015-04-01 def456 -4.44 <NA>
            10 1002 2015-05-01 def456 -4.44 <NA>
            11 1002 2015-06-01 def456 -4.44 date_from


            If you want to remove the keyvariable add %>% select(-key) following after the end of the fill although you might want to keep it for some reason...



            P.S. I really liked the clever trick in the function creation by @G. Grothendieck with the sign to handle date_from > date_to






            share|improve this answer













            Just to uncover the special powers of tidyr with the complete and fill functions



            library(tidyverse)

            df %>%
            # turn into a long format to handle the dates as one variable
            gather(key, date, date_from:date_thru) %>%
            # group the dataframe to apply operations separately on each user
            group_by(user) %>%
            # now complete the dataset with the implicit dates within the range of dates
            # Note the consideration of: date_from > date_to with 'min' and 'max'
            complete(date = seq.Date(from = min(date),to = max(date),by = "month") ) %>%
            #now fill in the missing 'mrr'
            fill(mrr, line_item)


            # A tibble: 11 x 5
            # Groups: user [2]
            user date line_item mrr key
            <fct> <date> <fct> <dbl> <chr>
            1 1001 2015-01-01 abc123 2.22 date_from
            2 1001 2015-02-01 abc123 2.22 <NA>
            3 1001 2015-03-01 abc123 2.22 <NA>
            4 1001 2015-04-01 abc123 2.22 date_thru
            5 1002 2014-12-01 def456 -4.44 date_thru
            6 1002 2015-01-01 def456 -4.44 <NA>
            7 1002 2015-02-01 def456 -4.44 <NA>
            8 1002 2015-03-01 def456 -4.44 <NA>
            9 1002 2015-04-01 def456 -4.44 <NA>
            10 1002 2015-05-01 def456 -4.44 <NA>
            11 1002 2015-06-01 def456 -4.44 date_from


            If you want to remove the keyvariable add %>% select(-key) following after the end of the fill although you might want to keep it for some reason...



            P.S. I really liked the clever trick in the function creation by @G. Grothendieck with the sign to handle date_from > date_to







            share|improve this answer












            share|improve this answer



            share|improve this answer










            answered Nov 14 '18 at 14:03









            Lefkios PaikousisLefkios Paikousis

            837




            837

























                2














                First define a month_seq function which produces the desired date sequence given the from and to dates. Then Map it to the date_from and date_thru columns producing a column of type list whose components are the date sequences. Then unnest the month and select the desired columns.



                library(dplyr)
                library(tidyr)

                month_seq <- function(from, to) seq(from, to, paste(sign(to - from), "month"))

                df %>%
                mutate(month = Map(month_seq, date_from, date_thru)) %>%
                unnest %>%
                select(user, line_item, month, mrr)


                giving:



                   user line_item      month   mrr
                1 1001 abc123 2015-01-01 2.22
                2 1001 abc123 2015-02-01 2.22
                3 1001 abc123 2015-03-01 2.22
                4 1001 abc123 2015-04-01 2.22
                5 1002 def456 2015-06-01 -4.44
                6 1002 def456 2015-05-01 -4.44
                7 1002 def456 2015-04-01 -4.44
                8 1002 def456 2015-03-01 -4.44
                9 1002 def456 2015-02-01 -4.44
                10 1002 def456 2015-01-01 -4.44
                11 1002 def456 2014-12-01 -4.44





                share|improve this answer






























                  2














                  First define a month_seq function which produces the desired date sequence given the from and to dates. Then Map it to the date_from and date_thru columns producing a column of type list whose components are the date sequences. Then unnest the month and select the desired columns.



                  library(dplyr)
                  library(tidyr)

                  month_seq <- function(from, to) seq(from, to, paste(sign(to - from), "month"))

                  df %>%
                  mutate(month = Map(month_seq, date_from, date_thru)) %>%
                  unnest %>%
                  select(user, line_item, month, mrr)


                  giving:



                     user line_item      month   mrr
                  1 1001 abc123 2015-01-01 2.22
                  2 1001 abc123 2015-02-01 2.22
                  3 1001 abc123 2015-03-01 2.22
                  4 1001 abc123 2015-04-01 2.22
                  5 1002 def456 2015-06-01 -4.44
                  6 1002 def456 2015-05-01 -4.44
                  7 1002 def456 2015-04-01 -4.44
                  8 1002 def456 2015-03-01 -4.44
                  9 1002 def456 2015-02-01 -4.44
                  10 1002 def456 2015-01-01 -4.44
                  11 1002 def456 2014-12-01 -4.44





                  share|improve this answer




























                    2












                    2








                    2







                    First define a month_seq function which produces the desired date sequence given the from and to dates. Then Map it to the date_from and date_thru columns producing a column of type list whose components are the date sequences. Then unnest the month and select the desired columns.



                    library(dplyr)
                    library(tidyr)

                    month_seq <- function(from, to) seq(from, to, paste(sign(to - from), "month"))

                    df %>%
                    mutate(month = Map(month_seq, date_from, date_thru)) %>%
                    unnest %>%
                    select(user, line_item, month, mrr)


                    giving:



                       user line_item      month   mrr
                    1 1001 abc123 2015-01-01 2.22
                    2 1001 abc123 2015-02-01 2.22
                    3 1001 abc123 2015-03-01 2.22
                    4 1001 abc123 2015-04-01 2.22
                    5 1002 def456 2015-06-01 -4.44
                    6 1002 def456 2015-05-01 -4.44
                    7 1002 def456 2015-04-01 -4.44
                    8 1002 def456 2015-03-01 -4.44
                    9 1002 def456 2015-02-01 -4.44
                    10 1002 def456 2015-01-01 -4.44
                    11 1002 def456 2014-12-01 -4.44





                    share|improve this answer















                    First define a month_seq function which produces the desired date sequence given the from and to dates. Then Map it to the date_from and date_thru columns producing a column of type list whose components are the date sequences. Then unnest the month and select the desired columns.



                    library(dplyr)
                    library(tidyr)

                    month_seq <- function(from, to) seq(from, to, paste(sign(to - from), "month"))

                    df %>%
                    mutate(month = Map(month_seq, date_from, date_thru)) %>%
                    unnest %>%
                    select(user, line_item, month, mrr)


                    giving:



                       user line_item      month   mrr
                    1 1001 abc123 2015-01-01 2.22
                    2 1001 abc123 2015-02-01 2.22
                    3 1001 abc123 2015-03-01 2.22
                    4 1001 abc123 2015-04-01 2.22
                    5 1002 def456 2015-06-01 -4.44
                    6 1002 def456 2015-05-01 -4.44
                    7 1002 def456 2015-04-01 -4.44
                    8 1002 def456 2015-03-01 -4.44
                    9 1002 def456 2015-02-01 -4.44
                    10 1002 def456 2015-01-01 -4.44
                    11 1002 def456 2014-12-01 -4.44






                    share|improve this answer














                    share|improve this answer



                    share|improve this answer








                    edited Nov 14 '18 at 12:50

























                    answered Nov 14 '18 at 3:59









                    G. GrothendieckG. Grothendieck

                    147k9130235




                    147k9130235























                        0














                        Here's a solution using dplyr, tidyr, and lubridate. I found the elapsed_months function on this SO post Number of months between two dates. I have modified it a bit for your case.



                        Also, I am assuming negative mrr indicates that date_from > date_thru.



                        elapsed_months <- function(end_date, start_date) {
                        ed <- as.POSIXlt(end_date)
                        sd <- as.POSIXlt(start_date)
                        12 * (ed$year - sd$year) + (ed$mon - sd$mon) + ifelse(ed >= sd, 1, -1)
                        }

                        df %>%
                        uncount(weights = abs(elapsed_months(date_thru, date_from)), .id = "Months") %>%
                        mutate(
                        Month = date_from %m+% months(sign(mrr)*(Months - 1))
                        ) %>%
                        select(user, line_item, Month, mrr)

                        user line_item Month mrr
                        1 1001 abc123 2015-01-01 2.22
                        2 1001 abc123 2015-02-01 2.22
                        3 1001 abc123 2015-03-01 2.22
                        4 1001 abc123 2015-04-01 2.22
                        5 1002 def456 2015-06-01 -4.44
                        6 1002 def456 2015-05-01 -4.44
                        7 1002 def456 2015-04-01 -4.44
                        8 1002 def456 2015-03-01 -4.44
                        9 1002 def456 2015-02-01 -4.44
                        10 1002 def456 2015-01-01 -4.44
                        11 1002 def456 2014-12-01 -4.44





                        share|improve this answer






























                          0














                          Here's a solution using dplyr, tidyr, and lubridate. I found the elapsed_months function on this SO post Number of months between two dates. I have modified it a bit for your case.



                          Also, I am assuming negative mrr indicates that date_from > date_thru.



                          elapsed_months <- function(end_date, start_date) {
                          ed <- as.POSIXlt(end_date)
                          sd <- as.POSIXlt(start_date)
                          12 * (ed$year - sd$year) + (ed$mon - sd$mon) + ifelse(ed >= sd, 1, -1)
                          }

                          df %>%
                          uncount(weights = abs(elapsed_months(date_thru, date_from)), .id = "Months") %>%
                          mutate(
                          Month = date_from %m+% months(sign(mrr)*(Months - 1))
                          ) %>%
                          select(user, line_item, Month, mrr)

                          user line_item Month mrr
                          1 1001 abc123 2015-01-01 2.22
                          2 1001 abc123 2015-02-01 2.22
                          3 1001 abc123 2015-03-01 2.22
                          4 1001 abc123 2015-04-01 2.22
                          5 1002 def456 2015-06-01 -4.44
                          6 1002 def456 2015-05-01 -4.44
                          7 1002 def456 2015-04-01 -4.44
                          8 1002 def456 2015-03-01 -4.44
                          9 1002 def456 2015-02-01 -4.44
                          10 1002 def456 2015-01-01 -4.44
                          11 1002 def456 2014-12-01 -4.44





                          share|improve this answer




























                            0












                            0








                            0







                            Here's a solution using dplyr, tidyr, and lubridate. I found the elapsed_months function on this SO post Number of months between two dates. I have modified it a bit for your case.



                            Also, I am assuming negative mrr indicates that date_from > date_thru.



                            elapsed_months <- function(end_date, start_date) {
                            ed <- as.POSIXlt(end_date)
                            sd <- as.POSIXlt(start_date)
                            12 * (ed$year - sd$year) + (ed$mon - sd$mon) + ifelse(ed >= sd, 1, -1)
                            }

                            df %>%
                            uncount(weights = abs(elapsed_months(date_thru, date_from)), .id = "Months") %>%
                            mutate(
                            Month = date_from %m+% months(sign(mrr)*(Months - 1))
                            ) %>%
                            select(user, line_item, Month, mrr)

                            user line_item Month mrr
                            1 1001 abc123 2015-01-01 2.22
                            2 1001 abc123 2015-02-01 2.22
                            3 1001 abc123 2015-03-01 2.22
                            4 1001 abc123 2015-04-01 2.22
                            5 1002 def456 2015-06-01 -4.44
                            6 1002 def456 2015-05-01 -4.44
                            7 1002 def456 2015-04-01 -4.44
                            8 1002 def456 2015-03-01 -4.44
                            9 1002 def456 2015-02-01 -4.44
                            10 1002 def456 2015-01-01 -4.44
                            11 1002 def456 2014-12-01 -4.44





                            share|improve this answer















                            Here's a solution using dplyr, tidyr, and lubridate. I found the elapsed_months function on this SO post Number of months between two dates. I have modified it a bit for your case.



                            Also, I am assuming negative mrr indicates that date_from > date_thru.



                            elapsed_months <- function(end_date, start_date) {
                            ed <- as.POSIXlt(end_date)
                            sd <- as.POSIXlt(start_date)
                            12 * (ed$year - sd$year) + (ed$mon - sd$mon) + ifelse(ed >= sd, 1, -1)
                            }

                            df %>%
                            uncount(weights = abs(elapsed_months(date_thru, date_from)), .id = "Months") %>%
                            mutate(
                            Month = date_from %m+% months(sign(mrr)*(Months - 1))
                            ) %>%
                            select(user, line_item, Month, mrr)

                            user line_item Month mrr
                            1 1001 abc123 2015-01-01 2.22
                            2 1001 abc123 2015-02-01 2.22
                            3 1001 abc123 2015-03-01 2.22
                            4 1001 abc123 2015-04-01 2.22
                            5 1002 def456 2015-06-01 -4.44
                            6 1002 def456 2015-05-01 -4.44
                            7 1002 def456 2015-04-01 -4.44
                            8 1002 def456 2015-03-01 -4.44
                            9 1002 def456 2015-02-01 -4.44
                            10 1002 def456 2015-01-01 -4.44
                            11 1002 def456 2014-12-01 -4.44






                            share|improve this answer














                            share|improve this answer



                            share|improve this answer








                            edited Nov 14 '18 at 3:01

























                            answered Nov 14 '18 at 2:56









                            ShreeShree

                            3,3811323




                            3,3811323






























                                draft saved

                                draft discarded




















































                                Thanks for contributing an answer to Stack Overflow!


                                • Please be sure to answer the question. Provide details and share your research!

                                But avoid



                                • Asking for help, clarification, or responding to other answers.

                                • Making statements based on opinion; back them up with references or personal experience.


                                To learn more, see our tips on writing great answers.




                                draft saved


                                draft discarded














                                StackExchange.ready(
                                function () {
                                StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53292044%2fin-r-perform-a-function-over-rows-in-a-data-frame-and-join-the-result-together%23new-answer', 'question_page');
                                }
                                );

                                Post as a guest















                                Required, but never shown





















































                                Required, but never shown














                                Required, but never shown












                                Required, but never shown







                                Required, but never shown

































                                Required, but never shown














                                Required, but never shown












                                Required, but never shown







                                Required, but never shown







                                Popular posts from this blog

                                Xamarin.iOS Cant Deploy on Iphone

                                Glorious Revolution

                                Dulmage-Mendelsohn matrix decomposition in Python