keep a column when expanding dataframe to fill using tidyr::complete












1















Regarding how to fill missing rows in a data frame i used this example



df <- read.table(textConnection("car,year,month,country,amount
Mazda,2012,02,JP,2344
Ford,2012,04,US,235234
Mazda,2012,03,JP,3455
Mazda,2012,04,JP,43554
Mazda,2012,05,JP,9854
Mazda,2012,06,JP,32556
Ford, 2013,01,US,345"), sep = ",", header = TRUE)

> df
car year month country amount
1 Mazda 2012 2 JP 2344
2 Ford 2012 4 US 235234
3 Mazda 2012 3 JP 3455
4 Mazda 2012 4 JP 43554
5 Mazda 2012 5 JP 9854
6 Mazda 2012 6 JP 32556
7 Ford 2013 1 US 345


I use tidyr::complete to fill missing rows for month and year this way:



tidyr::complete(df, car = unique(car), year = 2012:2014, month=1:12, fill=list(amount=0))


but country is lost. i've read tidyr documentation but it's really short and could'nt find any other SO answer on this.



# A tibble: 108 x 5
car year month country amount
<fct> <int> <int> <fct> <dbl>
1 " Ford" 2012 1 NA 0
2 " Ford" 2012 2 NA 0
3 " Ford" 2012 3 NA 0
4 " Ford" 2012 4 US 235234
5 " Ford" 2012 5 NA 0
6 " Ford" 2012 6 NA 0
7 " Ford" 2012 7 NA 0
8 " Ford" 2012 8 NA 0
9 " Ford" 2012 9 NA 0
10 " Ford" 2012 10 NA 0
# ... with 98 more rows


How to preserve it?










share|improve this question























  • Is this the output to the same data? I am not getting the same number of rows by applying your example

    – akrun
    Nov 15 '18 at 17:55











  • yes, double checked. This is my output

    – Forge
    Nov 15 '18 at 17:59











  • For me, the same code is giving 72 x 5. tidyr::complete(df, car = unique(car), year = 2012:2014, month=1:12, fill=list(amount=0)) # A tibble: 72 x 5

    – akrun
    Nov 15 '18 at 18:00


















1















Regarding how to fill missing rows in a data frame i used this example



df <- read.table(textConnection("car,year,month,country,amount
Mazda,2012,02,JP,2344
Ford,2012,04,US,235234
Mazda,2012,03,JP,3455
Mazda,2012,04,JP,43554
Mazda,2012,05,JP,9854
Mazda,2012,06,JP,32556
Ford, 2013,01,US,345"), sep = ",", header = TRUE)

> df
car year month country amount
1 Mazda 2012 2 JP 2344
2 Ford 2012 4 US 235234
3 Mazda 2012 3 JP 3455
4 Mazda 2012 4 JP 43554
5 Mazda 2012 5 JP 9854
6 Mazda 2012 6 JP 32556
7 Ford 2013 1 US 345


I use tidyr::complete to fill missing rows for month and year this way:



tidyr::complete(df, car = unique(car), year = 2012:2014, month=1:12, fill=list(amount=0))


but country is lost. i've read tidyr documentation but it's really short and could'nt find any other SO answer on this.



# A tibble: 108 x 5
car year month country amount
<fct> <int> <int> <fct> <dbl>
1 " Ford" 2012 1 NA 0
2 " Ford" 2012 2 NA 0
3 " Ford" 2012 3 NA 0
4 " Ford" 2012 4 US 235234
5 " Ford" 2012 5 NA 0
6 " Ford" 2012 6 NA 0
7 " Ford" 2012 7 NA 0
8 " Ford" 2012 8 NA 0
9 " Ford" 2012 9 NA 0
10 " Ford" 2012 10 NA 0
# ... with 98 more rows


How to preserve it?










share|improve this question























  • Is this the output to the same data? I am not getting the same number of rows by applying your example

    – akrun
    Nov 15 '18 at 17:55











  • yes, double checked. This is my output

    – Forge
    Nov 15 '18 at 17:59











  • For me, the same code is giving 72 x 5. tidyr::complete(df, car = unique(car), year = 2012:2014, month=1:12, fill=list(amount=0)) # A tibble: 72 x 5

    – akrun
    Nov 15 '18 at 18:00
















1












1








1








Regarding how to fill missing rows in a data frame i used this example



df <- read.table(textConnection("car,year,month,country,amount
Mazda,2012,02,JP,2344
Ford,2012,04,US,235234
Mazda,2012,03,JP,3455
Mazda,2012,04,JP,43554
Mazda,2012,05,JP,9854
Mazda,2012,06,JP,32556
Ford, 2013,01,US,345"), sep = ",", header = TRUE)

> df
car year month country amount
1 Mazda 2012 2 JP 2344
2 Ford 2012 4 US 235234
3 Mazda 2012 3 JP 3455
4 Mazda 2012 4 JP 43554
5 Mazda 2012 5 JP 9854
6 Mazda 2012 6 JP 32556
7 Ford 2013 1 US 345


I use tidyr::complete to fill missing rows for month and year this way:



tidyr::complete(df, car = unique(car), year = 2012:2014, month=1:12, fill=list(amount=0))


but country is lost. i've read tidyr documentation but it's really short and could'nt find any other SO answer on this.



# A tibble: 108 x 5
car year month country amount
<fct> <int> <int> <fct> <dbl>
1 " Ford" 2012 1 NA 0
2 " Ford" 2012 2 NA 0
3 " Ford" 2012 3 NA 0
4 " Ford" 2012 4 US 235234
5 " Ford" 2012 5 NA 0
6 " Ford" 2012 6 NA 0
7 " Ford" 2012 7 NA 0
8 " Ford" 2012 8 NA 0
9 " Ford" 2012 9 NA 0
10 " Ford" 2012 10 NA 0
# ... with 98 more rows


How to preserve it?










share|improve this question














Regarding how to fill missing rows in a data frame i used this example



df <- read.table(textConnection("car,year,month,country,amount
Mazda,2012,02,JP,2344
Ford,2012,04,US,235234
Mazda,2012,03,JP,3455
Mazda,2012,04,JP,43554
Mazda,2012,05,JP,9854
Mazda,2012,06,JP,32556
Ford, 2013,01,US,345"), sep = ",", header = TRUE)

> df
car year month country amount
1 Mazda 2012 2 JP 2344
2 Ford 2012 4 US 235234
3 Mazda 2012 3 JP 3455
4 Mazda 2012 4 JP 43554
5 Mazda 2012 5 JP 9854
6 Mazda 2012 6 JP 32556
7 Ford 2013 1 US 345


I use tidyr::complete to fill missing rows for month and year this way:



tidyr::complete(df, car = unique(car), year = 2012:2014, month=1:12, fill=list(amount=0))


but country is lost. i've read tidyr documentation but it's really short and could'nt find any other SO answer on this.



# A tibble: 108 x 5
car year month country amount
<fct> <int> <int> <fct> <dbl>
1 " Ford" 2012 1 NA 0
2 " Ford" 2012 2 NA 0
3 " Ford" 2012 3 NA 0
4 " Ford" 2012 4 US 235234
5 " Ford" 2012 5 NA 0
6 " Ford" 2012 6 NA 0
7 " Ford" 2012 7 NA 0
8 " Ford" 2012 8 NA 0
9 " Ford" 2012 9 NA 0
10 " Ford" 2012 10 NA 0
# ... with 98 more rows


How to preserve it?







r tidyr






share|improve this question













share|improve this question











share|improve this question




share|improve this question










asked Nov 15 '18 at 17:47









ForgeForge

386316




386316













  • Is this the output to the same data? I am not getting the same number of rows by applying your example

    – akrun
    Nov 15 '18 at 17:55











  • yes, double checked. This is my output

    – Forge
    Nov 15 '18 at 17:59











  • For me, the same code is giving 72 x 5. tidyr::complete(df, car = unique(car), year = 2012:2014, month=1:12, fill=list(amount=0)) # A tibble: 72 x 5

    – akrun
    Nov 15 '18 at 18:00





















  • Is this the output to the same data? I am not getting the same number of rows by applying your example

    – akrun
    Nov 15 '18 at 17:55











  • yes, double checked. This is my output

    – Forge
    Nov 15 '18 at 17:59











  • For me, the same code is giving 72 x 5. tidyr::complete(df, car = unique(car), year = 2012:2014, month=1:12, fill=list(amount=0)) # A tibble: 72 x 5

    – akrun
    Nov 15 '18 at 18:00



















Is this the output to the same data? I am not getting the same number of rows by applying your example

– akrun
Nov 15 '18 at 17:55





Is this the output to the same data? I am not getting the same number of rows by applying your example

– akrun
Nov 15 '18 at 17:55













yes, double checked. This is my output

– Forge
Nov 15 '18 at 17:59





yes, double checked. This is my output

– Forge
Nov 15 '18 at 17:59













For me, the same code is giving 72 x 5. tidyr::complete(df, car = unique(car), year = 2012:2014, month=1:12, fill=list(amount=0)) # A tibble: 72 x 5

– akrun
Nov 15 '18 at 18:00







For me, the same code is giving 72 x 5. tidyr::complete(df, car = unique(car), year = 2012:2014, month=1:12, fill=list(amount=0)) # A tibble: 72 x 5

– akrun
Nov 15 '18 at 18:00














2 Answers
2






active

oldest

votes


















1














Since you neglected to note that you opened a new question in the second-ask on the original, just maintain a metadata data frame:



read.table(textConnection("car,year,month,amount
Mazda,2012,02,2344
Ford,2012,04,235234
Mazda,2012,03,3455
Mazda,2012,04,43554
Mazda,2012,05,9854
Mazda,2012,06,32556
Ford,2013,01,2345"),
sep = ",", header = TRUE, stringsAsFactors = FALSE) -> xdf


merge(
expand.grid(car = unique(xdf$car), year =2012:2014, month=1:12),
xdf, by = c("car", "year", "month"), all.x = TRUE
) -> x2
x2$amount <- ifelse(is.na(x2$amount), 0, x2$amount)

data.frame(
car = c("Mazda", "Ford"),
country = c("JP", "US"),
stringsAsFactors = FALSE
) -> car2country_df

merge(x2, car2country_df)


or via tidyverse:



tidyr::complete(
xdf, car = unique(car), year = 2012:2014, month=1:12, fill=list(amount=0)
) %>%
dplyr::left_join(car2country_df)





share|improve this answer
























  • my bad, I'm instructed to do separate questions on separate topics even if it's a follow-on previous question. I forgot to warn you. apologies

    – Forge
    Nov 16 '18 at 7:21





















1














We can place it in nesting



library(tidyverse)
df %>%
complete(car = unique(car), year = 2012:2014, month = 1:12,
nesting(country), fill = list(amount = 0))





share|improve this answer
























  • Works for more than one column?

    – Forge
    Nov 15 '18 at 18:03











  • @Forge Yes, you can place the colums inside nesting i.e. nesting(col1, col2),

    – akrun
    Nov 15 '18 at 18:04











  • Not valid. Nesting does not preserve country. it returned Ford JP and Mazda USA combinations @akrun

    – Forge
    Nov 16 '18 at 7:58













Your Answer






StackExchange.ifUsing("editor", function () {
StackExchange.using("externalEditor", function () {
StackExchange.using("snippets", function () {
StackExchange.snippets.init();
});
});
}, "code-snippets");

StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "1"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);

StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});

function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});


}
});














draft saved

draft discarded


















StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53325216%2fkeep-a-column-when-expanding-dataframe-to-fill-using-tidyrcomplete%23new-answer', 'question_page');
}
);

Post as a guest















Required, but never shown

























2 Answers
2






active

oldest

votes








2 Answers
2






active

oldest

votes









active

oldest

votes






active

oldest

votes









1














Since you neglected to note that you opened a new question in the second-ask on the original, just maintain a metadata data frame:



read.table(textConnection("car,year,month,amount
Mazda,2012,02,2344
Ford,2012,04,235234
Mazda,2012,03,3455
Mazda,2012,04,43554
Mazda,2012,05,9854
Mazda,2012,06,32556
Ford,2013,01,2345"),
sep = ",", header = TRUE, stringsAsFactors = FALSE) -> xdf


merge(
expand.grid(car = unique(xdf$car), year =2012:2014, month=1:12),
xdf, by = c("car", "year", "month"), all.x = TRUE
) -> x2
x2$amount <- ifelse(is.na(x2$amount), 0, x2$amount)

data.frame(
car = c("Mazda", "Ford"),
country = c("JP", "US"),
stringsAsFactors = FALSE
) -> car2country_df

merge(x2, car2country_df)


or via tidyverse:



tidyr::complete(
xdf, car = unique(car), year = 2012:2014, month=1:12, fill=list(amount=0)
) %>%
dplyr::left_join(car2country_df)





share|improve this answer
























  • my bad, I'm instructed to do separate questions on separate topics even if it's a follow-on previous question. I forgot to warn you. apologies

    – Forge
    Nov 16 '18 at 7:21


















1














Since you neglected to note that you opened a new question in the second-ask on the original, just maintain a metadata data frame:



read.table(textConnection("car,year,month,amount
Mazda,2012,02,2344
Ford,2012,04,235234
Mazda,2012,03,3455
Mazda,2012,04,43554
Mazda,2012,05,9854
Mazda,2012,06,32556
Ford,2013,01,2345"),
sep = ",", header = TRUE, stringsAsFactors = FALSE) -> xdf


merge(
expand.grid(car = unique(xdf$car), year =2012:2014, month=1:12),
xdf, by = c("car", "year", "month"), all.x = TRUE
) -> x2
x2$amount <- ifelse(is.na(x2$amount), 0, x2$amount)

data.frame(
car = c("Mazda", "Ford"),
country = c("JP", "US"),
stringsAsFactors = FALSE
) -> car2country_df

merge(x2, car2country_df)


or via tidyverse:



tidyr::complete(
xdf, car = unique(car), year = 2012:2014, month=1:12, fill=list(amount=0)
) %>%
dplyr::left_join(car2country_df)





share|improve this answer
























  • my bad, I'm instructed to do separate questions on separate topics even if it's a follow-on previous question. I forgot to warn you. apologies

    – Forge
    Nov 16 '18 at 7:21
















1












1








1







Since you neglected to note that you opened a new question in the second-ask on the original, just maintain a metadata data frame:



read.table(textConnection("car,year,month,amount
Mazda,2012,02,2344
Ford,2012,04,235234
Mazda,2012,03,3455
Mazda,2012,04,43554
Mazda,2012,05,9854
Mazda,2012,06,32556
Ford,2013,01,2345"),
sep = ",", header = TRUE, stringsAsFactors = FALSE) -> xdf


merge(
expand.grid(car = unique(xdf$car), year =2012:2014, month=1:12),
xdf, by = c("car", "year", "month"), all.x = TRUE
) -> x2
x2$amount <- ifelse(is.na(x2$amount), 0, x2$amount)

data.frame(
car = c("Mazda", "Ford"),
country = c("JP", "US"),
stringsAsFactors = FALSE
) -> car2country_df

merge(x2, car2country_df)


or via tidyverse:



tidyr::complete(
xdf, car = unique(car), year = 2012:2014, month=1:12, fill=list(amount=0)
) %>%
dplyr::left_join(car2country_df)





share|improve this answer













Since you neglected to note that you opened a new question in the second-ask on the original, just maintain a metadata data frame:



read.table(textConnection("car,year,month,amount
Mazda,2012,02,2344
Ford,2012,04,235234
Mazda,2012,03,3455
Mazda,2012,04,43554
Mazda,2012,05,9854
Mazda,2012,06,32556
Ford,2013,01,2345"),
sep = ",", header = TRUE, stringsAsFactors = FALSE) -> xdf


merge(
expand.grid(car = unique(xdf$car), year =2012:2014, month=1:12),
xdf, by = c("car", "year", "month"), all.x = TRUE
) -> x2
x2$amount <- ifelse(is.na(x2$amount), 0, x2$amount)

data.frame(
car = c("Mazda", "Ford"),
country = c("JP", "US"),
stringsAsFactors = FALSE
) -> car2country_df

merge(x2, car2country_df)


or via tidyverse:



tidyr::complete(
xdf, car = unique(car), year = 2012:2014, month=1:12, fill=list(amount=0)
) %>%
dplyr::left_join(car2country_df)






share|improve this answer












share|improve this answer



share|improve this answer










answered Nov 15 '18 at 18:23









hrbrmstrhrbrmstr

61.6k691153




61.6k691153













  • my bad, I'm instructed to do separate questions on separate topics even if it's a follow-on previous question. I forgot to warn you. apologies

    – Forge
    Nov 16 '18 at 7:21





















  • my bad, I'm instructed to do separate questions on separate topics even if it's a follow-on previous question. I forgot to warn you. apologies

    – Forge
    Nov 16 '18 at 7:21



















my bad, I'm instructed to do separate questions on separate topics even if it's a follow-on previous question. I forgot to warn you. apologies

– Forge
Nov 16 '18 at 7:21







my bad, I'm instructed to do separate questions on separate topics even if it's a follow-on previous question. I forgot to warn you. apologies

– Forge
Nov 16 '18 at 7:21















1














We can place it in nesting



library(tidyverse)
df %>%
complete(car = unique(car), year = 2012:2014, month = 1:12,
nesting(country), fill = list(amount = 0))





share|improve this answer
























  • Works for more than one column?

    – Forge
    Nov 15 '18 at 18:03











  • @Forge Yes, you can place the colums inside nesting i.e. nesting(col1, col2),

    – akrun
    Nov 15 '18 at 18:04











  • Not valid. Nesting does not preserve country. it returned Ford JP and Mazda USA combinations @akrun

    – Forge
    Nov 16 '18 at 7:58


















1














We can place it in nesting



library(tidyverse)
df %>%
complete(car = unique(car), year = 2012:2014, month = 1:12,
nesting(country), fill = list(amount = 0))





share|improve this answer
























  • Works for more than one column?

    – Forge
    Nov 15 '18 at 18:03











  • @Forge Yes, you can place the colums inside nesting i.e. nesting(col1, col2),

    – akrun
    Nov 15 '18 at 18:04











  • Not valid. Nesting does not preserve country. it returned Ford JP and Mazda USA combinations @akrun

    – Forge
    Nov 16 '18 at 7:58
















1












1








1







We can place it in nesting



library(tidyverse)
df %>%
complete(car = unique(car), year = 2012:2014, month = 1:12,
nesting(country), fill = list(amount = 0))





share|improve this answer













We can place it in nesting



library(tidyverse)
df %>%
complete(car = unique(car), year = 2012:2014, month = 1:12,
nesting(country), fill = list(amount = 0))






share|improve this answer












share|improve this answer



share|improve this answer










answered Nov 15 '18 at 17:49









akrunakrun

414k13201275




414k13201275













  • Works for more than one column?

    – Forge
    Nov 15 '18 at 18:03











  • @Forge Yes, you can place the colums inside nesting i.e. nesting(col1, col2),

    – akrun
    Nov 15 '18 at 18:04











  • Not valid. Nesting does not preserve country. it returned Ford JP and Mazda USA combinations @akrun

    – Forge
    Nov 16 '18 at 7:58





















  • Works for more than one column?

    – Forge
    Nov 15 '18 at 18:03











  • @Forge Yes, you can place the colums inside nesting i.e. nesting(col1, col2),

    – akrun
    Nov 15 '18 at 18:04











  • Not valid. Nesting does not preserve country. it returned Ford JP and Mazda USA combinations @akrun

    – Forge
    Nov 16 '18 at 7:58



















Works for more than one column?

– Forge
Nov 15 '18 at 18:03





Works for more than one column?

– Forge
Nov 15 '18 at 18:03













@Forge Yes, you can place the colums inside nesting i.e. nesting(col1, col2),

– akrun
Nov 15 '18 at 18:04





@Forge Yes, you can place the colums inside nesting i.e. nesting(col1, col2),

– akrun
Nov 15 '18 at 18:04













Not valid. Nesting does not preserve country. it returned Ford JP and Mazda USA combinations @akrun

– Forge
Nov 16 '18 at 7:58







Not valid. Nesting does not preserve country. it returned Ford JP and Mazda USA combinations @akrun

– Forge
Nov 16 '18 at 7:58




















draft saved

draft discarded




















































Thanks for contributing an answer to Stack Overflow!


  • Please be sure to answer the question. Provide details and share your research!

But avoid



  • Asking for help, clarification, or responding to other answers.

  • Making statements based on opinion; back them up with references or personal experience.


To learn more, see our tips on writing great answers.




draft saved


draft discarded














StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53325216%2fkeep-a-column-when-expanding-dataframe-to-fill-using-tidyrcomplete%23new-answer', 'question_page');
}
);

Post as a guest















Required, but never shown





















































Required, but never shown














Required, but never shown












Required, but never shown







Required, but never shown

































Required, but never shown














Required, but never shown












Required, but never shown







Required, but never shown







Popular posts from this blog

Xamarin.iOS Cant Deploy on Iphone

Glorious Revolution

Dulmage-Mendelsohn matrix decomposition in Python