Return Calculations Incorrect in Panel Data












0















I'm currently working with panel data in Stata, and run the following commands to define the panel:



encode ticker, generate(ticker_n)
xtset ticker_n time


Where the ticker is a string (ticker of a listed company on a stock exchange), and time is an integer going from 930 (opening of the market) to 1559 (closing of the market). Thus, time here indicates the minutes the stock exchange is opened. For each minute the stock market is opened we have all close prices of the tickers listed at the stock exchange. A sample of the data looks as such:



date        time    open    high    low     close   volume  ticker  ticker_n
09/15/2008 930 33.31 33.31 33.31 33.31 2135 zeus zeus
09/15/2008 931 32.94 32.94 32.94 32.94 100 zeus zeus
09/15/2008 930 10.21 10.21 10.21 10.21 4270 bx bx
09/15/2008 931 10.46 10.5 10.42 10.44 5700 bx bx


Then, in an attempt to calculate returns (using the close price) I run the following command:



gen return = (close - l.close) / l.close


However, this leads to a weird error where every whole hour (time = 1100, 1200, 1300, etc.) the returns are not calculated at all and Stata just reports a "-" for the returns.



Now I assume something went wrong in defining the panel data, such that Stata does not recognize that the observation before 1500 should be 1459 (it looks for 1499 I assume?).



Hence, my question is, how do I correctly define my panel data such that Stata recognizes that my time axis is in minutes? I did not find anything in the official Stata documentation that helped me out here.










share|improve this question





























    0















    I'm currently working with panel data in Stata, and run the following commands to define the panel:



    encode ticker, generate(ticker_n)
    xtset ticker_n time


    Where the ticker is a string (ticker of a listed company on a stock exchange), and time is an integer going from 930 (opening of the market) to 1559 (closing of the market). Thus, time here indicates the minutes the stock exchange is opened. For each minute the stock market is opened we have all close prices of the tickers listed at the stock exchange. A sample of the data looks as such:



    date        time    open    high    low     close   volume  ticker  ticker_n
    09/15/2008 930 33.31 33.31 33.31 33.31 2135 zeus zeus
    09/15/2008 931 32.94 32.94 32.94 32.94 100 zeus zeus
    09/15/2008 930 10.21 10.21 10.21 10.21 4270 bx bx
    09/15/2008 931 10.46 10.5 10.42 10.44 5700 bx bx


    Then, in an attempt to calculate returns (using the close price) I run the following command:



    gen return = (close - l.close) / l.close


    However, this leads to a weird error where every whole hour (time = 1100, 1200, 1300, etc.) the returns are not calculated at all and Stata just reports a "-" for the returns.



    Now I assume something went wrong in defining the panel data, such that Stata does not recognize that the observation before 1500 should be 1459 (it looks for 1499 I assume?).



    Hence, my question is, how do I correctly define my panel data such that Stata recognizes that my time axis is in minutes? I did not find anything in the official Stata documentation that helped me out here.










    share|improve this question



























      0












      0








      0








      I'm currently working with panel data in Stata, and run the following commands to define the panel:



      encode ticker, generate(ticker_n)
      xtset ticker_n time


      Where the ticker is a string (ticker of a listed company on a stock exchange), and time is an integer going from 930 (opening of the market) to 1559 (closing of the market). Thus, time here indicates the minutes the stock exchange is opened. For each minute the stock market is opened we have all close prices of the tickers listed at the stock exchange. A sample of the data looks as such:



      date        time    open    high    low     close   volume  ticker  ticker_n
      09/15/2008 930 33.31 33.31 33.31 33.31 2135 zeus zeus
      09/15/2008 931 32.94 32.94 32.94 32.94 100 zeus zeus
      09/15/2008 930 10.21 10.21 10.21 10.21 4270 bx bx
      09/15/2008 931 10.46 10.5 10.42 10.44 5700 bx bx


      Then, in an attempt to calculate returns (using the close price) I run the following command:



      gen return = (close - l.close) / l.close


      However, this leads to a weird error where every whole hour (time = 1100, 1200, 1300, etc.) the returns are not calculated at all and Stata just reports a "-" for the returns.



      Now I assume something went wrong in defining the panel data, such that Stata does not recognize that the observation before 1500 should be 1459 (it looks for 1499 I assume?).



      Hence, my question is, how do I correctly define my panel data such that Stata recognizes that my time axis is in minutes? I did not find anything in the official Stata documentation that helped me out here.










      share|improve this question
















      I'm currently working with panel data in Stata, and run the following commands to define the panel:



      encode ticker, generate(ticker_n)
      xtset ticker_n time


      Where the ticker is a string (ticker of a listed company on a stock exchange), and time is an integer going from 930 (opening of the market) to 1559 (closing of the market). Thus, time here indicates the minutes the stock exchange is opened. For each minute the stock market is opened we have all close prices of the tickers listed at the stock exchange. A sample of the data looks as such:



      date        time    open    high    low     close   volume  ticker  ticker_n
      09/15/2008 930 33.31 33.31 33.31 33.31 2135 zeus zeus
      09/15/2008 931 32.94 32.94 32.94 32.94 100 zeus zeus
      09/15/2008 930 10.21 10.21 10.21 10.21 4270 bx bx
      09/15/2008 931 10.46 10.5 10.42 10.44 5700 bx bx


      Then, in an attempt to calculate returns (using the close price) I run the following command:



      gen return = (close - l.close) / l.close


      However, this leads to a weird error where every whole hour (time = 1100, 1200, 1300, etc.) the returns are not calculated at all and Stata just reports a "-" for the returns.



      Now I assume something went wrong in defining the panel data, such that Stata does not recognize that the observation before 1500 should be 1459 (it looks for 1499 I assume?).



      Hence, my question is, how do I correctly define my panel data such that Stata recognizes that my time axis is in minutes? I did not find anything in the official Stata documentation that helped me out here.







      stata panel-data






      share|improve this question















      share|improve this question













      share|improve this question




      share|improve this question








      edited Nov 14 '18 at 18:16









      Nick Cox

      25.1k42038




      25.1k42038










      asked Nov 14 '18 at 12:59









      Menno Van DijkMenno Van Dijk

      14911




      14911
























          1 Answer
          1






          active

          oldest

          votes


















          0














          Indeed: your time variable is messing you up mightily. If time is going from 1059 to 1100, or from 1159 to 1200, each of those is a jump of 41 to Stata. The value for the time previous to 1100 would have been at time 1099, which won't be in your data; hence previous values for 1100, etc., will all be missing. There is no sense whatsoever in which Stata will look at 1100 and say "Oh! that's a time and so the previous time would have been 1059 and I should use the value for 1059". Applying a time display format wouldn't change that failure to see the times as you understand them.



          You don't explain how daily dates are supposed to enter your analysis. Here is some technique for times in hours and minutes alone.



          clear 
          input time
          930
          931
          959
          1000
          1001
          1059
          1100
          end

          gen double mytime = dhms(0, floor(time/100), mod(time, 100), 0)

          format mytime %tcHH:MM

          gen id = 1
          xtset id mytime, delta(60000)

          list mytime L.mytime, sep(0)

          +-----------------+
          | L.|
          | mytime mytime |
          |-----------------|
          1. | 09:30 . |
          2. | 09:31 09:30 |
          3. | 09:59 . |
          4. | 10:00 09:59 |
          5. | 10:01 10:00 |
          6. | 10:59 . |
          7. | 11:00 10:59 |
          +-----------------+





          share|improve this answer























            Your Answer






            StackExchange.ifUsing("editor", function () {
            StackExchange.using("externalEditor", function () {
            StackExchange.using("snippets", function () {
            StackExchange.snippets.init();
            });
            });
            }, "code-snippets");

            StackExchange.ready(function() {
            var channelOptions = {
            tags: "".split(" "),
            id: "1"
            };
            initTagRenderer("".split(" "), "".split(" "), channelOptions);

            StackExchange.using("externalEditor", function() {
            // Have to fire editor after snippets, if snippets enabled
            if (StackExchange.settings.snippets.snippetsEnabled) {
            StackExchange.using("snippets", function() {
            createEditor();
            });
            }
            else {
            createEditor();
            }
            });

            function createEditor() {
            StackExchange.prepareEditor({
            heartbeatType: 'answer',
            autoActivateHeartbeat: false,
            convertImagesToLinks: true,
            noModals: true,
            showLowRepImageUploadWarning: true,
            reputationToPostImages: 10,
            bindNavPrevention: true,
            postfix: "",
            imageUploader: {
            brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
            contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
            allowUrls: true
            },
            onDemand: true,
            discardSelector: ".discard-answer"
            ,immediatelyShowMarkdownHelp:true
            });


            }
            });














            draft saved

            draft discarded


















            StackExchange.ready(
            function () {
            StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53300837%2freturn-calculations-incorrect-in-panel-data%23new-answer', 'question_page');
            }
            );

            Post as a guest















            Required, but never shown

























            1 Answer
            1






            active

            oldest

            votes








            1 Answer
            1






            active

            oldest

            votes









            active

            oldest

            votes






            active

            oldest

            votes









            0














            Indeed: your time variable is messing you up mightily. If time is going from 1059 to 1100, or from 1159 to 1200, each of those is a jump of 41 to Stata. The value for the time previous to 1100 would have been at time 1099, which won't be in your data; hence previous values for 1100, etc., will all be missing. There is no sense whatsoever in which Stata will look at 1100 and say "Oh! that's a time and so the previous time would have been 1059 and I should use the value for 1059". Applying a time display format wouldn't change that failure to see the times as you understand them.



            You don't explain how daily dates are supposed to enter your analysis. Here is some technique for times in hours and minutes alone.



            clear 
            input time
            930
            931
            959
            1000
            1001
            1059
            1100
            end

            gen double mytime = dhms(0, floor(time/100), mod(time, 100), 0)

            format mytime %tcHH:MM

            gen id = 1
            xtset id mytime, delta(60000)

            list mytime L.mytime, sep(0)

            +-----------------+
            | L.|
            | mytime mytime |
            |-----------------|
            1. | 09:30 . |
            2. | 09:31 09:30 |
            3. | 09:59 . |
            4. | 10:00 09:59 |
            5. | 10:01 10:00 |
            6. | 10:59 . |
            7. | 11:00 10:59 |
            +-----------------+





            share|improve this answer




























              0














              Indeed: your time variable is messing you up mightily. If time is going from 1059 to 1100, or from 1159 to 1200, each of those is a jump of 41 to Stata. The value for the time previous to 1100 would have been at time 1099, which won't be in your data; hence previous values for 1100, etc., will all be missing. There is no sense whatsoever in which Stata will look at 1100 and say "Oh! that's a time and so the previous time would have been 1059 and I should use the value for 1059". Applying a time display format wouldn't change that failure to see the times as you understand them.



              You don't explain how daily dates are supposed to enter your analysis. Here is some technique for times in hours and minutes alone.



              clear 
              input time
              930
              931
              959
              1000
              1001
              1059
              1100
              end

              gen double mytime = dhms(0, floor(time/100), mod(time, 100), 0)

              format mytime %tcHH:MM

              gen id = 1
              xtset id mytime, delta(60000)

              list mytime L.mytime, sep(0)

              +-----------------+
              | L.|
              | mytime mytime |
              |-----------------|
              1. | 09:30 . |
              2. | 09:31 09:30 |
              3. | 09:59 . |
              4. | 10:00 09:59 |
              5. | 10:01 10:00 |
              6. | 10:59 . |
              7. | 11:00 10:59 |
              +-----------------+





              share|improve this answer


























                0












                0








                0







                Indeed: your time variable is messing you up mightily. If time is going from 1059 to 1100, or from 1159 to 1200, each of those is a jump of 41 to Stata. The value for the time previous to 1100 would have been at time 1099, which won't be in your data; hence previous values for 1100, etc., will all be missing. There is no sense whatsoever in which Stata will look at 1100 and say "Oh! that's a time and so the previous time would have been 1059 and I should use the value for 1059". Applying a time display format wouldn't change that failure to see the times as you understand them.



                You don't explain how daily dates are supposed to enter your analysis. Here is some technique for times in hours and minutes alone.



                clear 
                input time
                930
                931
                959
                1000
                1001
                1059
                1100
                end

                gen double mytime = dhms(0, floor(time/100), mod(time, 100), 0)

                format mytime %tcHH:MM

                gen id = 1
                xtset id mytime, delta(60000)

                list mytime L.mytime, sep(0)

                +-----------------+
                | L.|
                | mytime mytime |
                |-----------------|
                1. | 09:30 . |
                2. | 09:31 09:30 |
                3. | 09:59 . |
                4. | 10:00 09:59 |
                5. | 10:01 10:00 |
                6. | 10:59 . |
                7. | 11:00 10:59 |
                +-----------------+





                share|improve this answer













                Indeed: your time variable is messing you up mightily. If time is going from 1059 to 1100, or from 1159 to 1200, each of those is a jump of 41 to Stata. The value for the time previous to 1100 would have been at time 1099, which won't be in your data; hence previous values for 1100, etc., will all be missing. There is no sense whatsoever in which Stata will look at 1100 and say "Oh! that's a time and so the previous time would have been 1059 and I should use the value for 1059". Applying a time display format wouldn't change that failure to see the times as you understand them.



                You don't explain how daily dates are supposed to enter your analysis. Here is some technique for times in hours and minutes alone.



                clear 
                input time
                930
                931
                959
                1000
                1001
                1059
                1100
                end

                gen double mytime = dhms(0, floor(time/100), mod(time, 100), 0)

                format mytime %tcHH:MM

                gen id = 1
                xtset id mytime, delta(60000)

                list mytime L.mytime, sep(0)

                +-----------------+
                | L.|
                | mytime mytime |
                |-----------------|
                1. | 09:30 . |
                2. | 09:31 09:30 |
                3. | 09:59 . |
                4. | 10:00 09:59 |
                5. | 10:01 10:00 |
                6. | 10:59 . |
                7. | 11:00 10:59 |
                +-----------------+






                share|improve this answer












                share|improve this answer



                share|improve this answer










                answered Nov 14 '18 at 17:44









                Nick CoxNick Cox

                25.1k42038




                25.1k42038
































                    draft saved

                    draft discarded




















































                    Thanks for contributing an answer to Stack Overflow!


                    • Please be sure to answer the question. Provide details and share your research!

                    But avoid



                    • Asking for help, clarification, or responding to other answers.

                    • Making statements based on opinion; back them up with references or personal experience.


                    To learn more, see our tips on writing great answers.




                    draft saved


                    draft discarded














                    StackExchange.ready(
                    function () {
                    StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53300837%2freturn-calculations-incorrect-in-panel-data%23new-answer', 'question_page');
                    }
                    );

                    Post as a guest















                    Required, but never shown





















































                    Required, but never shown














                    Required, but never shown












                    Required, but never shown







                    Required, but never shown

































                    Required, but never shown














                    Required, but never shown












                    Required, but never shown







                    Required, but never shown







                    Popular posts from this blog

                    Xamarin.iOS Cant Deploy on Iphone

                    Glorious Revolution

                    Dulmage-Mendelsohn matrix decomposition in Python