Return Calculations Incorrect in Panel Data
I'm currently working with panel data in Stata, and run the following commands to define the panel:
encode ticker, generate(ticker_n)
xtset ticker_n time
Where the ticker is a string (ticker of a listed company on a stock exchange), and time is an integer going from 930 (opening of the market) to 1559 (closing of the market). Thus, time here indicates the minutes the stock exchange is opened. For each minute the stock market is opened we have all close prices of the tickers listed at the stock exchange. A sample of the data looks as such:
date time open high low close volume ticker ticker_n
09/15/2008 930 33.31 33.31 33.31 33.31 2135 zeus zeus
09/15/2008 931 32.94 32.94 32.94 32.94 100 zeus zeus
09/15/2008 930 10.21 10.21 10.21 10.21 4270 bx bx
09/15/2008 931 10.46 10.5 10.42 10.44 5700 bx bx
Then, in an attempt to calculate returns (using the close price) I run the following command:
gen return = (close - l.close) / l.close
However, this leads to a weird error where every whole hour (time = 1100, 1200, 1300, etc.) the returns are not calculated at all and Stata just reports a "-" for the returns.
Now I assume something went wrong in defining the panel data, such that Stata does not recognize that the observation before 1500 should be 1459 (it looks for 1499 I assume?).
Hence, my question is, how do I correctly define my panel data such that Stata recognizes that my time axis is in minutes? I did not find anything in the official Stata documentation that helped me out here.
stata panel-data
add a comment |
I'm currently working with panel data in Stata, and run the following commands to define the panel:
encode ticker, generate(ticker_n)
xtset ticker_n time
Where the ticker is a string (ticker of a listed company on a stock exchange), and time is an integer going from 930 (opening of the market) to 1559 (closing of the market). Thus, time here indicates the minutes the stock exchange is opened. For each minute the stock market is opened we have all close prices of the tickers listed at the stock exchange. A sample of the data looks as such:
date time open high low close volume ticker ticker_n
09/15/2008 930 33.31 33.31 33.31 33.31 2135 zeus zeus
09/15/2008 931 32.94 32.94 32.94 32.94 100 zeus zeus
09/15/2008 930 10.21 10.21 10.21 10.21 4270 bx bx
09/15/2008 931 10.46 10.5 10.42 10.44 5700 bx bx
Then, in an attempt to calculate returns (using the close price) I run the following command:
gen return = (close - l.close) / l.close
However, this leads to a weird error where every whole hour (time = 1100, 1200, 1300, etc.) the returns are not calculated at all and Stata just reports a "-" for the returns.
Now I assume something went wrong in defining the panel data, such that Stata does not recognize that the observation before 1500 should be 1459 (it looks for 1499 I assume?).
Hence, my question is, how do I correctly define my panel data such that Stata recognizes that my time axis is in minutes? I did not find anything in the official Stata documentation that helped me out here.
stata panel-data
add a comment |
I'm currently working with panel data in Stata, and run the following commands to define the panel:
encode ticker, generate(ticker_n)
xtset ticker_n time
Where the ticker is a string (ticker of a listed company on a stock exchange), and time is an integer going from 930 (opening of the market) to 1559 (closing of the market). Thus, time here indicates the minutes the stock exchange is opened. For each minute the stock market is opened we have all close prices of the tickers listed at the stock exchange. A sample of the data looks as such:
date time open high low close volume ticker ticker_n
09/15/2008 930 33.31 33.31 33.31 33.31 2135 zeus zeus
09/15/2008 931 32.94 32.94 32.94 32.94 100 zeus zeus
09/15/2008 930 10.21 10.21 10.21 10.21 4270 bx bx
09/15/2008 931 10.46 10.5 10.42 10.44 5700 bx bx
Then, in an attempt to calculate returns (using the close price) I run the following command:
gen return = (close - l.close) / l.close
However, this leads to a weird error where every whole hour (time = 1100, 1200, 1300, etc.) the returns are not calculated at all and Stata just reports a "-" for the returns.
Now I assume something went wrong in defining the panel data, such that Stata does not recognize that the observation before 1500 should be 1459 (it looks for 1499 I assume?).
Hence, my question is, how do I correctly define my panel data such that Stata recognizes that my time axis is in minutes? I did not find anything in the official Stata documentation that helped me out here.
stata panel-data
I'm currently working with panel data in Stata, and run the following commands to define the panel:
encode ticker, generate(ticker_n)
xtset ticker_n time
Where the ticker is a string (ticker of a listed company on a stock exchange), and time is an integer going from 930 (opening of the market) to 1559 (closing of the market). Thus, time here indicates the minutes the stock exchange is opened. For each minute the stock market is opened we have all close prices of the tickers listed at the stock exchange. A sample of the data looks as such:
date time open high low close volume ticker ticker_n
09/15/2008 930 33.31 33.31 33.31 33.31 2135 zeus zeus
09/15/2008 931 32.94 32.94 32.94 32.94 100 zeus zeus
09/15/2008 930 10.21 10.21 10.21 10.21 4270 bx bx
09/15/2008 931 10.46 10.5 10.42 10.44 5700 bx bx
Then, in an attempt to calculate returns (using the close price) I run the following command:
gen return = (close - l.close) / l.close
However, this leads to a weird error where every whole hour (time = 1100, 1200, 1300, etc.) the returns are not calculated at all and Stata just reports a "-" for the returns.
Now I assume something went wrong in defining the panel data, such that Stata does not recognize that the observation before 1500 should be 1459 (it looks for 1499 I assume?).
Hence, my question is, how do I correctly define my panel data such that Stata recognizes that my time axis is in minutes? I did not find anything in the official Stata documentation that helped me out here.
stata panel-data
stata panel-data
edited Nov 14 '18 at 18:16
Nick Cox
25.1k42038
25.1k42038
asked Nov 14 '18 at 12:59
Menno Van DijkMenno Van Dijk
14911
14911
add a comment |
add a comment |
1 Answer
1
active
oldest
votes
Indeed: your time
variable is messing you up mightily. If time
is going from 1059 to 1100, or from 1159 to 1200, each of those is a jump of 41 to Stata. The value for the time previous to 1100 would have been at time 1099, which won't be in your data; hence previous values for 1100, etc., will all be missing. There is no sense whatsoever in which Stata will look at 1100 and say "Oh! that's a time and so the previous time would have been 1059 and I should use the value for 1059". Applying a time display format wouldn't change that failure to see the times as you understand them.
You don't explain how daily dates are supposed to enter your analysis. Here is some technique for times in hours and minutes alone.
clear
input time
930
931
959
1000
1001
1059
1100
end
gen double mytime = dhms(0, floor(time/100), mod(time, 100), 0)
format mytime %tcHH:MM
gen id = 1
xtset id mytime, delta(60000)
list mytime L.mytime, sep(0)
+-----------------+
| L.|
| mytime mytime |
|-----------------|
1. | 09:30 . |
2. | 09:31 09:30 |
3. | 09:59 . |
4. | 10:00 09:59 |
5. | 10:01 10:00 |
6. | 10:59 . |
7. | 11:00 10:59 |
+-----------------+
add a comment |
Your Answer
StackExchange.ifUsing("editor", function () {
StackExchange.using("externalEditor", function () {
StackExchange.using("snippets", function () {
StackExchange.snippets.init();
});
});
}, "code-snippets");
StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "1"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);
StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});
function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});
}
});
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53300837%2freturn-calculations-incorrect-in-panel-data%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
1 Answer
1
active
oldest
votes
1 Answer
1
active
oldest
votes
active
oldest
votes
active
oldest
votes
Indeed: your time
variable is messing you up mightily. If time
is going from 1059 to 1100, or from 1159 to 1200, each of those is a jump of 41 to Stata. The value for the time previous to 1100 would have been at time 1099, which won't be in your data; hence previous values for 1100, etc., will all be missing. There is no sense whatsoever in which Stata will look at 1100 and say "Oh! that's a time and so the previous time would have been 1059 and I should use the value for 1059". Applying a time display format wouldn't change that failure to see the times as you understand them.
You don't explain how daily dates are supposed to enter your analysis. Here is some technique for times in hours and minutes alone.
clear
input time
930
931
959
1000
1001
1059
1100
end
gen double mytime = dhms(0, floor(time/100), mod(time, 100), 0)
format mytime %tcHH:MM
gen id = 1
xtset id mytime, delta(60000)
list mytime L.mytime, sep(0)
+-----------------+
| L.|
| mytime mytime |
|-----------------|
1. | 09:30 . |
2. | 09:31 09:30 |
3. | 09:59 . |
4. | 10:00 09:59 |
5. | 10:01 10:00 |
6. | 10:59 . |
7. | 11:00 10:59 |
+-----------------+
add a comment |
Indeed: your time
variable is messing you up mightily. If time
is going from 1059 to 1100, or from 1159 to 1200, each of those is a jump of 41 to Stata. The value for the time previous to 1100 would have been at time 1099, which won't be in your data; hence previous values for 1100, etc., will all be missing. There is no sense whatsoever in which Stata will look at 1100 and say "Oh! that's a time and so the previous time would have been 1059 and I should use the value for 1059". Applying a time display format wouldn't change that failure to see the times as you understand them.
You don't explain how daily dates are supposed to enter your analysis. Here is some technique for times in hours and minutes alone.
clear
input time
930
931
959
1000
1001
1059
1100
end
gen double mytime = dhms(0, floor(time/100), mod(time, 100), 0)
format mytime %tcHH:MM
gen id = 1
xtset id mytime, delta(60000)
list mytime L.mytime, sep(0)
+-----------------+
| L.|
| mytime mytime |
|-----------------|
1. | 09:30 . |
2. | 09:31 09:30 |
3. | 09:59 . |
4. | 10:00 09:59 |
5. | 10:01 10:00 |
6. | 10:59 . |
7. | 11:00 10:59 |
+-----------------+
add a comment |
Indeed: your time
variable is messing you up mightily. If time
is going from 1059 to 1100, or from 1159 to 1200, each of those is a jump of 41 to Stata. The value for the time previous to 1100 would have been at time 1099, which won't be in your data; hence previous values for 1100, etc., will all be missing. There is no sense whatsoever in which Stata will look at 1100 and say "Oh! that's a time and so the previous time would have been 1059 and I should use the value for 1059". Applying a time display format wouldn't change that failure to see the times as you understand them.
You don't explain how daily dates are supposed to enter your analysis. Here is some technique for times in hours and minutes alone.
clear
input time
930
931
959
1000
1001
1059
1100
end
gen double mytime = dhms(0, floor(time/100), mod(time, 100), 0)
format mytime %tcHH:MM
gen id = 1
xtset id mytime, delta(60000)
list mytime L.mytime, sep(0)
+-----------------+
| L.|
| mytime mytime |
|-----------------|
1. | 09:30 . |
2. | 09:31 09:30 |
3. | 09:59 . |
4. | 10:00 09:59 |
5. | 10:01 10:00 |
6. | 10:59 . |
7. | 11:00 10:59 |
+-----------------+
Indeed: your time
variable is messing you up mightily. If time
is going from 1059 to 1100, or from 1159 to 1200, each of those is a jump of 41 to Stata. The value for the time previous to 1100 would have been at time 1099, which won't be in your data; hence previous values for 1100, etc., will all be missing. There is no sense whatsoever in which Stata will look at 1100 and say "Oh! that's a time and so the previous time would have been 1059 and I should use the value for 1059". Applying a time display format wouldn't change that failure to see the times as you understand them.
You don't explain how daily dates are supposed to enter your analysis. Here is some technique for times in hours and minutes alone.
clear
input time
930
931
959
1000
1001
1059
1100
end
gen double mytime = dhms(0, floor(time/100), mod(time, 100), 0)
format mytime %tcHH:MM
gen id = 1
xtset id mytime, delta(60000)
list mytime L.mytime, sep(0)
+-----------------+
| L.|
| mytime mytime |
|-----------------|
1. | 09:30 . |
2. | 09:31 09:30 |
3. | 09:59 . |
4. | 10:00 09:59 |
5. | 10:01 10:00 |
6. | 10:59 . |
7. | 11:00 10:59 |
+-----------------+
answered Nov 14 '18 at 17:44
Nick CoxNick Cox
25.1k42038
25.1k42038
add a comment |
add a comment |
Thanks for contributing an answer to Stack Overflow!
- Please be sure to answer the question. Provide details and share your research!
But avoid …
- Asking for help, clarification, or responding to other answers.
- Making statements based on opinion; back them up with references or personal experience.
To learn more, see our tips on writing great answers.
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53300837%2freturn-calculations-incorrect-in-panel-data%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown