Successive sum of column only if another column has the good value in R

I currently have a dataframe looking like that (with time in seconds and Zone1 a boolean):

I'd like to have the sum of values for successive criteria so I would get something like this:

Time Zone1 TimeInZone

   1     0         NA

   3     0         NA

   4     1          2

   5     1          2

   6     1          2

   7     0         NA

   9     1          1

   10    1          1

So like this

I can't find what to do, how can I deal with that?
Thanks.

EDITED: More accurate dataframe

edited Nov 14 '18 at 13:37

asked Nov 14 '18 at 12:52

Clément POIRET

277

So you want each row where Zone1 is true to have the length of the previous Zone1 run?

– tstenner
Nov 14 '18 at 20:15

Yes it's that, but your code is working, I'll mark it as a solution :)

– Clément POIRET
Nov 15 '18 at 14:57

add a comment |

I currently have a dataframe looking like that (with time in seconds and Zone1 a boolean):

I'd like to have the sum of values for successive criteria so I would get something like this:

Time Zone1 TimeInZone

   1     0         NA

   3     0         NA

   4     1          2

   5     1          2

   6     1          2

   7     0         NA

   9     1          1

   10    1          1

So like this

I can't find what to do, how can I deal with that?
Thanks.

EDITED: More accurate dataframe

edited Nov 14 '18 at 13:37

asked Nov 14 '18 at 12:52

Clément POIRET

277

So you want each row where Zone1 is true to have the length of the previous Zone1 run?

– tstenner
Nov 14 '18 at 20:15

Yes it's that, but your code is working, I'll mark it as a solution :)

– Clément POIRET
Nov 15 '18 at 14:57

add a comment |

I currently have a dataframe looking like that (with time in seconds and Zone1 a boolean):

I'd like to have the sum of values for successive criteria so I would get something like this:

Time Zone1 TimeInZone

   1     0         NA

   3     0         NA

   4     1          2

   5     1          2

   6     1          2

   7     0         NA

   9     1          1

   10    1          1

So like this

I can't find what to do, how can I deal with that?
Thanks.

EDITED: More accurate dataframe

edited Nov 14 '18 at 13:37

asked Nov 14 '18 at 12:52

Clément POIRET

277

I currently have a dataframe looking like that (with time in seconds and Zone1 a boolean):

I'd like to have the sum of values for successive criteria so I would get something like this:

Time Zone1 TimeInZone

   1     0         NA

   3     0         NA

   4     1          2

   5     1          2

   6     1          2

   7     0         NA

   9     1          1

   10    1          1

So like this

I can't find what to do, how can I deal with that?
Thanks.

EDITED: More accurate dataframe

r sum

edited Nov 14 '18 at 13:37

asked Nov 14 '18 at 12:52

Clément POIRET

277

edited Nov 14 '18 at 13:37

asked Nov 14 '18 at 12:52

Clément POIRET

277

edited Nov 14 '18 at 13:37

asked Nov 14 '18 at 12:52

Clément POIRET

277

asked Nov 14 '18 at 12:52

Clément POIRET

277

asked Nov 14 '18 at 12:52

Clément POIRET

277

So you want each row where Zone1 is true to have the length of the previous Zone1 run?

– tstenner
Nov 14 '18 at 20:15

Yes it's that, but your code is working, I'll mark it as a solution :)

– Clément POIRET
Nov 15 '18 at 14:57

add a comment |

So you want each row where Zone1 is true to have the length of the previous Zone1 run?

– tstenner
Nov 14 '18 at 20:15

Yes it's that, but your code is working, I'll mark it as a solution :)

– Clément POIRET
Nov 15 '18 at 14:57

So you want each row where Zone1 is true to have the length of the previous Zone1 run?

– tstenner
Nov 14 '18 at 20:15

Yes it's that, but your code is working, I'll mark it as a solution :)

– Clément POIRET
Nov 15 '18 at 14:57

add a comment |

1 Answer
1

active

oldest

votes

I'm not entirely sure, where the last two rows came from, but here's my take on it:

library(data.table)

df <- data.table(Value=c(3,4,1,1,2), Criteria=c(1,1,2,1,3))

# First, generate a logical vector that indicates if the criterium changed:

df[, changed:=c(TRUE, Criteria[-1] != Criteria[-length(Criteria)])]

# Then, calculate the cumulative sum to get an index:

df[, index:=cumsum(changed)]

# Calculate the sum for each level of index:

df[, Sum:=sum(Value), by=index]

# print everything:

print(df)

Result:

   Value Criteria changed index Sum

1:     3        1    TRUE     1   7

2:     4        1   FALSE     1   7

3:     1        2    TRUE     2   1

4:     1        1    TRUE     3   1

5:     2        3    TRUE     4   2

To have the sum of the last block, use some data.table magic:

setkey(df, index)

nextblocksums <- df[index!=max(index), .(index=index+1,nextBlockSum=Sum)]

df[ nextblocksums , LastBlocksSum:=i.nextBlockSum]

edited Nov 16 '18 at 9:34

answered Nov 14 '18 at 13:07

tstenner

6,32964277

Thanks, it's working but it appears that my dataframe was a bit more complicated than just sum, in fact I need a time so I updated my first post. The time is the difference between last and first value in the range. Do you have an idea?

– Clément POIRET
Nov 14 '18 at 13:45

I managed to get it work by replacing sum by doing the difference between last and first line of a df[df$index == i]$Time in a For loop! Thanks!

– Clément POIRET
Nov 15 '18 at 14:58

That works, but it's slow for anything but tiny amounts of data. See my edit for a faster solution.

– tstenner
Nov 15 '18 at 16:04

Hmm pretty complicated ahah, I've troubles understanding this and I'm getting NA in LastBlocksSum

– Clément POIRET
Nov 15 '18 at 18:39

Of course there's a NA in the first LastBlocksSum, because there's no previous block for that one :-) I've clarified the last step somewhat, just remember that with data.tables the first part (df[i, j]) can also be a join, try df[ nextblocksums, ] to see it in action.

– tstenner
Nov 16 '18 at 9:36

add a comment |

Your Answer

StackExchange.ifUsing("editor", function () {
StackExchange.using("externalEditor", function () {
StackExchange.using("snippets", function () {
StackExchange.snippets.init();
});
});
}, "code-snippets");

StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "1"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);

StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});

function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});

}
});

draft saved

draft discarded

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53300700%2fsuccessive-sum-of-column-only-if-another-column-has-the-good-value-in-r%23new-answer', 'question_page');
}
);

Post as a guest

Name

Required, but never shown

1 Answer
1

active

oldest

votes

1 Answer
1

active

oldest

votes

I'm not entirely sure, where the last two rows came from, but here's my take on it:

library(data.table)

df <- data.table(Value=c(3,4,1,1,2), Criteria=c(1,1,2,1,3))

# First, generate a logical vector that indicates if the criterium changed:

df[, changed:=c(TRUE, Criteria[-1] != Criteria[-length(Criteria)])]

# Then, calculate the cumulative sum to get an index:

df[, index:=cumsum(changed)]

# Calculate the sum for each level of index:

df[, Sum:=sum(Value), by=index]

# print everything:

print(df)

Result:

   Value Criteria changed index Sum

1:     3        1    TRUE     1   7

2:     4        1   FALSE     1   7

3:     1        2    TRUE     2   1

4:     1        1    TRUE     3   1

5:     2        3    TRUE     4   2

To have the sum of the last block, use some data.table magic:

setkey(df, index)

nextblocksums <- df[index!=max(index), .(index=index+1,nextBlockSum=Sum)]

df[ nextblocksums , LastBlocksSum:=i.nextBlockSum]

edited Nov 16 '18 at 9:34

answered Nov 14 '18 at 13:07

tstenner

6,32964277

Thanks, it's working but it appears that my dataframe was a bit more complicated than just sum, in fact I need a time so I updated my first post. The time is the difference between last and first value in the range. Do you have an idea?

– Clément POIRET
Nov 14 '18 at 13:45

I managed to get it work by replacing sum by doing the difference between last and first line of a df[df$index == i]$Time in a For loop! Thanks!

– Clément POIRET
Nov 15 '18 at 14:58

That works, but it's slow for anything but tiny amounts of data. See my edit for a faster solution.

– tstenner
Nov 15 '18 at 16:04

Hmm pretty complicated ahah, I've troubles understanding this and I'm getting NA in LastBlocksSum

– Clément POIRET
Nov 15 '18 at 18:39

Of course there's a NA in the first LastBlocksSum, because there's no previous block for that one :-) I've clarified the last step somewhat, just remember that with data.tables the first part (df[i, j]) can also be a join, try df[ nextblocksums, ] to see it in action.

– tstenner
Nov 16 '18 at 9:36

add a comment |

I'm not entirely sure, where the last two rows came from, but here's my take on it:

library(data.table)

df <- data.table(Value=c(3,4,1,1,2), Criteria=c(1,1,2,1,3))

# First, generate a logical vector that indicates if the criterium changed:

df[, changed:=c(TRUE, Criteria[-1] != Criteria[-length(Criteria)])]

# Then, calculate the cumulative sum to get an index:

df[, index:=cumsum(changed)]

# Calculate the sum for each level of index:

df[, Sum:=sum(Value), by=index]

# print everything:

print(df)

Result:

   Value Criteria changed index Sum

1:     3        1    TRUE     1   7

2:     4        1   FALSE     1   7

3:     1        2    TRUE     2   1

4:     1        1    TRUE     3   1

5:     2        3    TRUE     4   2

To have the sum of the last block, use some data.table magic:

setkey(df, index)

nextblocksums <- df[index!=max(index), .(index=index+1,nextBlockSum=Sum)]

df[ nextblocksums , LastBlocksSum:=i.nextBlockSum]

edited Nov 16 '18 at 9:34

answered Nov 14 '18 at 13:07

tstenner

6,32964277

Thanks, it's working but it appears that my dataframe was a bit more complicated than just sum, in fact I need a time so I updated my first post. The time is the difference between last and first value in the range. Do you have an idea?

– Clément POIRET
Nov 14 '18 at 13:45

I managed to get it work by replacing sum by doing the difference between last and first line of a df[df$index == i]$Time in a For loop! Thanks!

– Clément POIRET
Nov 15 '18 at 14:58

That works, but it's slow for anything but tiny amounts of data. See my edit for a faster solution.

– tstenner
Nov 15 '18 at 16:04

Hmm pretty complicated ahah, I've troubles understanding this and I'm getting NA in LastBlocksSum

– Clément POIRET
Nov 15 '18 at 18:39

Of course there's a NA in the first LastBlocksSum, because there's no previous block for that one :-) I've clarified the last step somewhat, just remember that with data.tables the first part (df[i, j]) can also be a join, try df[ nextblocksums, ] to see it in action.

– tstenner
Nov 16 '18 at 9:36

add a comment |

I'm not entirely sure, where the last two rows came from, but here's my take on it:

library(data.table)

df <- data.table(Value=c(3,4,1,1,2), Criteria=c(1,1,2,1,3))

# First, generate a logical vector that indicates if the criterium changed:

df[, changed:=c(TRUE, Criteria[-1] != Criteria[-length(Criteria)])]

# Then, calculate the cumulative sum to get an index:

df[, index:=cumsum(changed)]

# Calculate the sum for each level of index:

df[, Sum:=sum(Value), by=index]

# print everything:

print(df)

Result:

   Value Criteria changed index Sum

1:     3        1    TRUE     1   7

2:     4        1   FALSE     1   7

3:     1        2    TRUE     2   1

4:     1        1    TRUE     3   1

5:     2        3    TRUE     4   2

To have the sum of the last block, use some data.table magic:

setkey(df, index)

nextblocksums <- df[index!=max(index), .(index=index+1,nextBlockSum=Sum)]

df[ nextblocksums , LastBlocksSum:=i.nextBlockSum]

edited Nov 16 '18 at 9:34

answered Nov 14 '18 at 13:07

tstenner

6,32964277

I'm not entirely sure, where the last two rows came from, but here's my take on it:

library(data.table)

df <- data.table(Value=c(3,4,1,1,2), Criteria=c(1,1,2,1,3))

# First, generate a logical vector that indicates if the criterium changed:

df[, changed:=c(TRUE, Criteria[-1] != Criteria[-length(Criteria)])]

# Then, calculate the cumulative sum to get an index:

df[, index:=cumsum(changed)]

# Calculate the sum for each level of index:

df[, Sum:=sum(Value), by=index]

# print everything:

print(df)

Result:

   Value Criteria changed index Sum

1:     3        1    TRUE     1   7

2:     4        1   FALSE     1   7

3:     1        2    TRUE     2   1

4:     1        1    TRUE     3   1

5:     2        3    TRUE     4   2

To have the sum of the last block, use some data.table magic:

setkey(df, index)

nextblocksums <- df[index!=max(index), .(index=index+1,nextBlockSum=Sum)]

df[ nextblocksums , LastBlocksSum:=i.nextBlockSum]

edited Nov 16 '18 at 9:34

answered Nov 14 '18 at 13:07

tstenner

6,32964277

edited Nov 16 '18 at 9:34

answered Nov 14 '18 at 13:07

tstenner

6,32964277

answered Nov 14 '18 at 13:07

tstenner

6,32964277

answered Nov 14 '18 at 13:07

tstenner

6,32964277

Thanks, it's working but it appears that my dataframe was a bit more complicated than just sum, in fact I need a time so I updated my first post. The time is the difference between last and first value in the range. Do you have an idea?

– Clément POIRET
Nov 14 '18 at 13:45

I managed to get it work by replacing sum by doing the difference between last and first line of a df[df$index == i]$Time in a For loop! Thanks!

– Clément POIRET
Nov 15 '18 at 14:58

That works, but it's slow for anything but tiny amounts of data. See my edit for a faster solution.

– tstenner
Nov 15 '18 at 16:04

Hmm pretty complicated ahah, I've troubles understanding this and I'm getting NA in LastBlocksSum

– Clément POIRET
Nov 15 '18 at 18:39

Of course there's a NA in the first LastBlocksSum, because there's no previous block for that one :-) I've clarified the last step somewhat, just remember that with data.tables the first part (df[i, j]) can also be a join, try df[ nextblocksums, ] to see it in action.

– tstenner
Nov 16 '18 at 9:36

add a comment |

Thanks, it's working but it appears that my dataframe was a bit more complicated than just sum, in fact I need a time so I updated my first post. The time is the difference between last and first value in the range. Do you have an idea?

– Clément POIRET
Nov 14 '18 at 13:45

I managed to get it work by replacing sum by doing the difference between last and first line of a df[df$index == i]$Time in a For loop! Thanks!

– Clément POIRET
Nov 15 '18 at 14:58

That works, but it's slow for anything but tiny amounts of data. See my edit for a faster solution.

– tstenner
Nov 15 '18 at 16:04

Hmm pretty complicated ahah, I've troubles understanding this and I'm getting NA in LastBlocksSum

– Clément POIRET
Nov 15 '18 at 18:39

Of course there's a NA in the first LastBlocksSum, because there's no previous block for that one :-) I've clarified the last step somewhat, just remember that with data.tables the first part (df[i, j]) can also be a join, try df[ nextblocksums, ] to see it in action.

– tstenner
Nov 16 '18 at 9:36

Thanks, it's working but it appears that my dataframe was a bit more complicated than just sum, in fact I need a time so I updated my first post. The time is the difference between last and first value in the range. Do you have an idea?

– Clément POIRET
Nov 14 '18 at 13:45

I managed to get it work by replacing sum by doing the difference between last and first line of a df[df$index == i]$Time in a For loop! Thanks!

– Clément POIRET
Nov 15 '18 at 14:58

That works, but it's slow for anything but tiny amounts of data. See my edit for a faster solution.

– tstenner
Nov 15 '18 at 16:04

Hmm pretty complicated ahah, I've troubles understanding this and I'm getting NA in LastBlocksSum

– Clément POIRET
Nov 15 '18 at 18:39

Of course there's a NA in the first LastBlocksSum, because there's no previous block for that one :-) I've clarified the last step somewhat, just remember that with data.tables the first part (df[i, j]) can also be a join, try df[ nextblocksums, ] to see it in action.

– tstenner
Nov 16 '18 at 9:36

add a comment |

draft saved

draft discarded

Thanks for contributing an answer to Stack Overflow!

Please be sure to answer the question. Provide details and share your research!

But avoid …

Asking for help, clarification, or responding to other answers.

Making statements based on opinion; back them up with references or personal experience.

To learn more, see our tips on writing great answers.

draft saved

draft discarded

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Post as a guest

Name

Required, but never shown

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Name

Required, but never shown

Name

Required, but never shown

This page is only for reference, If you need detailed information, please check here

搜尋此網誌

Vfrdtyky