Converting dataframe with multiple values for one date into a ts object in R
I have a large dataset with multiple values for specific days. There are missing values in the dataset as it's for a long period of time. Here's a small example:
set.seed(1)
data <- data.frame(
Date = sample(c("1993-07-09", "1993-07-09", "1993-07-10", "1993-08-11", "1993-08-11", "1993-08-11")),
Oxygen = sample(c(0.2, 0.4, 0.4, 0.2, 0.4, 0.5))
)
data$Date <- as.Date(data$Date)
I want to convert this data frame into a ts object so that I can forecast, fit ARIMA models, and eventually find outliers.
It specifically needs to be a ts object and not an xts object.
The problems I'm facing are:
1) I don't know how to convert a data frame into a ts object.
2) I don't know how to create a ts object that allows multiple values for a single day.
Any help would be greatly appreciated. Thank you!
r dataframe type-conversion time-series
You're going to end up with a lot of NA values to represent this as a ts/mts class object, since you don't have evenly spaced data. Is that okay?
– thelatemail
Nov 15 '18 at 22:09
@thelatemail Is there an alternative that would not end up with NA values? If not, I'll try my chances with the version with the NAs.
– SecretBeach
Nov 15 '18 at 22:19
Does your data indicate the Oxygen values occurred at different times on the same day (long data) or that they represent different measurements/columns for the same date (wide data)? Could you provide an example of the structure of the output you need?
– dmca
Nov 15 '18 at 22:58
@dmca They represent different measurements/columns for the same date. The output I need is simply for my data to be identifiable as a ts object so that when I use an outlier detection package (tsoutliers), it will be able to run the object. The package only recognizes time series and not data frames.
– SecretBeach
Nov 15 '18 at 23:07
If you want to detect outliers in Oxygen, then each measurement of Oxygen needs to have occurred at a different point in time. Because your data is keyed by date and not datetime, there is no way to distinguish between measurements on the same day. You either need to pick one measurement per day, aggregate them somehow (as GG suggested), or have multiple time series of Oxygen with different sets of outliers (some of which will have NAs).
– dmca
Nov 15 '18 at 23:20
asked Nov 15 '18 at 21:56 by SecretBeach; edited Nov 15 '18 at 22:05 by markus
1 Answer
(1) mts: ts objects must be regularly spaced (i.e. the same amount of time between each successive point) and can't represent dates directly, although numbers can stand in for them. We therefore assume that the August dates were meant to be July, so that the dates are consecutive, and use the number of days since the Epoch (January 1, 1970) as the time.
Add a sequence number to distinguish equal dates and split the series into multiple columns:
library(zoo)
data3 <- transform(data2, seq = ave(1:nrow(data2), Date, FUN = seq_along))
z <- read.zoo(data3, index = "Date", split = "seq")
as.ts(z)
giving:
Time Series:
Start = 8590
End = 8592
Frequency = 1
       1   2   3
8590 0.5 0.4  NA
8591 0.4  NA  NA
8592 0.2 0.2 0.4
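If the dates are left as they are, with gaps between them, one way to get a regularly spaced series is to pad the missing days with NA. The following is only a sketch in base R under stated assumptions: the data frame `daily` and its values are made up for illustration, and it assumes one value per day (for example after picking or aggregating).

```r
# Hypothetical example data: one value per day, with a two-day gap.
daily <- data.frame(
  Date   = as.Date(c("1993-07-09", "1993-07-10", "1993-07-13")),
  Oxygen = c(0.3, 0.4, 0.5)
)

# Build the complete daily grid and left-join the observations onto it,
# so that days without a measurement become NA.
grid   <- data.frame(Date = seq(min(daily$Date), max(daily$Date), by = "day"))
padded <- merge(grid, daily, by = "Date", all.x = TRUE)

# Use days since 1970-01-01 as the numeric time index, as in the output above.
x <- ts(padded$Oxygen, start = as.numeric(padded$Date[1]))
x
```

Over 20 years of sparse measurements this padding produces many NA values, which is the trade-off raised in the comments on the question.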
(2) mean: Alternatively, average the values on equal dates:
z2 <- read.zoo(data2, index = "Date", aggregate = mean)
as.ts(z2)
giving:
Time Series:
Start = 8590
End = 8592
Frequency = 1
[1] 0.4500000 0.4000000 0.2666667
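For readers who prefer to avoid the zoo dependency, roughly the same averaging can be sketched with base R's aggregate(). This is an illustration, not the answer's method: the data frame is written out unshuffled (an assumption, so that no random seed is involved), with values chosen to match the output shown above, and the dates are assumed to be consecutive.

```r
# Example data written out explicitly (assumed values, consecutive dates).
data2 <- data.frame(
  Date   = as.Date(c("1993-07-09", "1993-07-09", "1993-07-10",
                     "1993-07-11", "1993-07-11", "1993-07-11")),
  Oxygen = c(0.5, 0.4, 0.4, 0.2, 0.2, 0.4)
)

# One mean per date; the result is sorted by Date.
daily_mean <- aggregate(Oxygen ~ Date, data = data2, FUN = mean)

# Days since 1970-01-01 as the time index; valid only because the
# dates here have no gaps.
x <- ts(daily_mean$Oxygen, start = as.numeric(daily_mean$Date[1]))
x
```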
(3) Ignore Date: We could ignore the Date column (as the poster suggested), in which case we just use 1, 2, 3, ... as the time index:
ts(data$Oxygen)
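Because the example rows are shuffled, it may be worth sorting by date before dropping the Date column, so that the 1, 2, 3, ... index at least follows chronological order. A minimal sketch with made-up values (the data frame below is an assumption for illustration):

```r
# Hypothetical unsorted data with two measurements on the same day.
data <- data.frame(
  Date   = as.Date(c("1993-08-11", "1993-07-09", "1993-07-10", "1993-07-09")),
  Oxygen = c(0.5, 0.2, 0.4, 0.4)
)

# order() leaves ties in their original order, so same-day rows
# keep the sequence in which they appear in the data frame.
ord <- order(data$Date)

# Observation number (1, 2, 3, ...) becomes the time index.
x <- ts(data$Oxygen[ord])
x
```

Every point is kept, which matches the poster's wish not to average, but the spacing between index values no longer corresponds to equal time intervals.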
(4) 1st point each month: Since the poster indicated in a comment that there is a lot of data (20 years), we could take the first point in each month, forming a monthly series:
as.ts(read.zoo(data, index = "Date", FUN = as.yearmon, aggregate = function(x) x[1]))
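A base R sketch of the same idea, for comparison. The example data and the variable names (`month_id`, `first`, `all_months`) are assumptions for illustration; months without any measurement are filled with NA so that the monthly series stays regular.

```r
# Hypothetical data: two points in July, one in August, none in
# September, one in October.
data <- data.frame(
  Date   = as.Date(c("1993-07-09", "1993-07-10", "1993-08-11", "1993-10-02")),
  Oxygen = c(0.2, 0.4, 0.5, 0.3)
)

# Sort chronologically, then keep the first row of each year-month.
data     <- data[order(data$Date), ]
month_id <- format(data$Date, "%Y-%m")
first    <- data[!duplicated(month_id), ]

# Complete grid of months from the first to the last observed month.
all_months <- seq(as.Date(paste0(month_id[1], "-01")),
                  as.Date(paste0(month_id[nrow(data)], "-01")),
                  by = "month")

# Look up each month's first value; months with no data become NA.
values <- first$Oxygen[match(format(all_months, "%Y-%m"),
                             format(first$Date, "%Y-%m"))]

# Monthly ts starting at the first observed year and month.
start_ym <- as.integer(c(format(all_months[1], "%Y"),
                         format(all_months[1], "%m")))
x <- ts(values, start = start_ym, frequency = 12)
x
```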
Note: the August dates have been changed to July to form data2, which is used above:
set.seed(1)
data2 <- data.frame(
Date = sample(c("1993-07-09", "1993-07-09", "1993-07-10",
"1993-07-11", "1993-07-11", "1993-07-11")),
Oxygen = sample(c(0.2, 0.4, 0.4, 0.2, 0.4, 0.5))
)
data2$Date <- as.Date(data2$Date)
Since my data isn't evenly spaced for each day, is it possible to just use the Oxygen column, create a ts object out of that, and make up days for it? This might solve the spacing problem, right?
– SecretBeach
Nov 15 '18 at 22:49
My dataset spans more than 20 years and has inconsistencies all through it. I want each point to be taken into consideration when finding outliers, so I can't average values.
– SecretBeach
Nov 15 '18 at 22:52
This works! My data isn't elegant or conducive to ts objects, but I want to thank you for your time!
– SecretBeach
Nov 15 '18 at 23:36
Have moved comments to answer.
– G. Grothendieck
Nov 15 '18 at 23:49
answered Nov 15 '18 at 22:27 by G. Grothendieck; edited Nov 19 '18 at 13:22