Creating a new data frame based on different data in rows
I'm moving an operation from Excel Power Query to R, which is much faster. The result is a data frame with thousands of rows. However, I'd like to create a sample data frame that includes one row for every different option (factor level) in columns 5:10 of its 15 columns, so people can manually test every option (like a truth table?).
I could do this manually, but I wondered if it could be done automatically. For example, this data frame:
col1   col2     col3
name   option1  option2
name2  option1  option2
name3  option1  option2
name4  option2  option1
would be converted into a data frame like this:
col1   col2     col3
name   option1  option2
name4  option2  option1
Any help would be greatly appreciated.
Chris
r dataframe rstudio tidyverse
asked Nov 14 '18 at 13:42
Chris
See ?duplicated
– Bastien
Nov 14 '18 at 13:46
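As a minimal sketch of what this comment points at, base R's duplicated() can be applied to just the option columns and then negated to keep the first row of each distinct combination. The data frame below is rebuilt from the example in the question; the names `d` and `sample_rows` are assumptions, not something the question specifies.

```r
# Toy data frame rebuilt from the example in the question (the name `d` is assumed)
d <- data.frame(
  col1 = c("name", "name2", "name3", "name4"),
  col2 = c("option1", "option1", "option1", "option2"),
  col3 = c("option2", "option2", "option2", "option1"),
  stringsAsFactors = FALSE
)

# duplicated() flags every row whose (col2, col3) pair already appeared earlier,
# so negating it keeps the first row for each distinct pair
sample_rows <- d[!duplicated(d[, c("col2", "col3")]), ]
print(sample_rows)
```

Because the row index is applied to the full data frame, col1 is carried along without taking part in the comparison.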
1 Answer
With dplyr:
library(dplyr)
d %>% distinct(col2, col3, .keep_all = TRUE)
#    col1    col2    col3
# 1  name option1 option2
# 2 name4 option2 option1
If you want to apply distinct() to only a subset of columns, you can first select them by matching a regex against the column names:
d %>%
  select(matches("^col([5-9]|10|1)$")) %>% # keeps only the columns named col1 and col5 to col10
  distinct()
This keeps your first column, "col1", and the columns "col5" to "col10". Note that distinct() then compares col1 as well, so rows that differ only in col1 are all kept.
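When the columns are easier to address by position than by name, the same deduplication can be sketched in base R, which avoids depending on a particular dplyr version. The 15-column data frame below is synthetic: the names col1 to col15, the values "a" and "b", and the row count are all made up for illustration.

```r
# Synthetic 15-column data frame (names and values are assumptions)
set.seed(1)
n <- 200
d <- data.frame(
  matrix(sample(c("a", "b"), n * 15, replace = TRUE), nrow = n),
  stringsAsFactors = FALSE
)
names(d) <- paste0("col", 1:15)
d$col1 <- paste0("name", seq_len(n))   # unique per row, like an identifier

# Deduplicate on columns 5 to 10 by position, but keep all 15 columns
by_pos <- d[!duplicated(d[, 5:10]), ]

# Including the unique col1 in the comparison removes nothing
with_id <- d[!duplicated(d[, c(1, 5:10)]), ]

print(nrow(by_pos))    # number of distinct combinations in columns 5 to 10
print(nrow(with_id))   # every row survives because col1 is unique
```

The comparison of the two row counts shows why an identifier column should be carried along rather than included in the set of columns being compared.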
edited Nov 14 '18 at 14:05
answered Nov 14 '18 at 13:45
RLave
Thanks for the quick reply. In my data frame I still end up with 19 thousand rows (from 39,000). I'm after a table with about 100-200 lines, but perhaps my maths is way off. Rather than a new row for every distinct combination of options, I'd like rows that can each take care of many options, so reducing the number of lines... does that make any sense?
– Chris
Nov 14 '18 at 14:09
Mm, no, I'm sorry, but you need to be clearer. Try updating your question with a reproducible example that shows what you expect.
– RLave
Nov 14 '18 at 14:15
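One possible reading of the follow-up comment above is that every individual option value should appear in at least one sample row, rather than every distinct combination of values. Under that assumption, a greedy sketch in base R could look like the following. The helper name `cover_levels` is hypothetical, and the greedy choice is a heuristic that does not guarantee the smallest possible set of rows.

```r
# Hypothetical helper (not from the answer): greedily pick rows until every
# value of every column in `cols` has appeared in at least one picked row.
cover_levels <- function(d, cols) {
  if (nrow(d) == 0) return(d)
  # Label each cell as "column=value" so equal values in different columns stay distinct
  tokens <- do.call(cbind, lapply(cols, function(j) paste(j, d[[j]], sep = "=")))
  uncovered <- unique(as.vector(tokens))
  picked <- integer(0)
  while (length(uncovered) > 0) {
    # For each row, count how many not-yet-covered values it would add
    gain <- rowSums(matrix(tokens %in% uncovered, nrow = nrow(tokens)))
    best <- which.max(gain)
    picked <- c(picked, best)
    uncovered <- setdiff(uncovered, tokens[best, ])
  }
  d[sort(picked), , drop = FALSE]
}

# Made-up data: all four combinations of two options in two columns
d <- data.frame(
  col1 = paste0("name", 1:4),
  col2 = c("a", "a", "b", "b"),
  col3 = c("x", "y", "x", "y"),
  stringsAsFactors = FALSE
)

covered <- cover_levels(d, c("col2", "col3"))
print(covered)
```

On this made-up data, deduplicating on col2 and col3 would keep all four rows, while the greedy pass needs only two rows to show every value once. `cols` can also be given as positions, for example 5:10.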