Calculation of a moving average using mysql leads to problems if there are gaps in the datasets












-1














My problem is that I try to calculate a moving average over some values from my table (one avg value for each row). It actually works but if it comes to gaps such as id[20,18,17] or date[2018-05-11,2018-05-9,2018-05-8] the calculation becomes wrong. I´m looking for a way to use a specific number of next rows to prevent this to happen.



The table contains id (auto_increment), date and close (Float).



This is my code:



CREATE DEFINER=`root`@`localhost` PROCEDURE `moving_avg`(IN periode INT)
NO SQL
BEGIN
select hist_ask.id, hist_ask.date, hist_ask.close, round(avg(past.close),2) as mavg
from hist_ask
join hist_ask as past
on past.id between hist_ask.id - (periode-1) and hist_ask.id
group by hist_ask.id, hist_ask.close
ORDER BY hist_ask.id DESC
LIMIT 10;
END


The table I use looks like this



id , date       , close
20 , 2018-10-13 , 12086.5
19 , 2018-10-12 , 12002.2
17 , 2018-10-11 , 12007.0
and so on


The output looks like this:



The output I get from the query



Thanks in advance!










share|improve this question
























  • See: Why should I provide an MCVE for what seems to me to be a very simple SQL query?
    – Strawberry
    Nov 12 at 12:43










  • Please add some sample data to your question, specifically showing the cases where your current query is failing. Also, add the current query's output and expected output
    – Madhur Bhaiya
    Nov 12 at 12:44










  • the problem is it is actualy working but if there is a id missing it calculates wrong bacause it cannot find the right id. I couldn´t find any propper solution to just use for ech row the next 3 rows for the average instead of using a specific id.
    – rodarmy
    Nov 12 at 16:12
















-1














My problem is that I try to calculate a moving average over some values from my table (one avg value for each row). It actually works but if it comes to gaps such as id[20,18,17] or date[2018-05-11,2018-05-9,2018-05-8] the calculation becomes wrong. I´m looking for a way to use a specific number of next rows to prevent this to happen.



The table contains id (auto_increment), date and close (Float).



This is my code:



CREATE DEFINER=`root`@`localhost` PROCEDURE `moving_avg`(IN periode INT)
NO SQL
BEGIN
select hist_ask.id, hist_ask.date, hist_ask.close, round(avg(past.close),2) as mavg
from hist_ask
join hist_ask as past
on past.id between hist_ask.id - (periode-1) and hist_ask.id
group by hist_ask.id, hist_ask.close
ORDER BY hist_ask.id DESC
LIMIT 10;
END


The table I use looks like this



id , date       , close
20 , 2018-10-13 , 12086.5
19 , 2018-10-12 , 12002.2
17 , 2018-10-11 , 12007.0
and so on


The output looks like this:



The output I get from the query



Thanks in advance!










share|improve this question
























  • See: Why should I provide an MCVE for what seems to me to be a very simple SQL query?
    – Strawberry
    Nov 12 at 12:43










  • Please add some sample data to your question, specifically showing the cases where your current query is failing. Also, add the current query's output and expected output
    – Madhur Bhaiya
    Nov 12 at 12:44










  • the problem is it is actualy working but if there is a id missing it calculates wrong bacause it cannot find the right id. I couldn´t find any propper solution to just use for ech row the next 3 rows for the average instead of using a specific id.
    – rodarmy
    Nov 12 at 16:12














-1












-1








-1







My problem is that I try to calculate a moving average over some values from my table (one avg value for each row). It actually works but if it comes to gaps such as id[20,18,17] or date[2018-05-11,2018-05-9,2018-05-8] the calculation becomes wrong. I´m looking for a way to use a specific number of next rows to prevent this to happen.



The table contains id (auto_increment), date and close (Float).



This is my code:



CREATE DEFINER=`root`@`localhost` PROCEDURE `moving_avg`(IN periode INT)
NO SQL
BEGIN
select hist_ask.id, hist_ask.date, hist_ask.close, round(avg(past.close),2) as mavg
from hist_ask
join hist_ask as past
on past.id between hist_ask.id - (periode-1) and hist_ask.id
group by hist_ask.id, hist_ask.close
ORDER BY hist_ask.id DESC
LIMIT 10;
END


The table I use looks like this



id , date       , close
20 , 2018-10-13 , 12086.5
19 , 2018-10-12 , 12002.2
17 , 2018-10-11 , 12007.0
and so on


The output looks like this:



The output I get from the query



Thanks in advance!










share|improve this question















My problem is that I try to calculate a moving average over some values from my table (one avg value for each row). It actually works but if it comes to gaps such as id[20,18,17] or date[2018-05-11,2018-05-9,2018-05-8] the calculation becomes wrong. I´m looking for a way to use a specific number of next rows to prevent this to happen.



The table contains id (auto_increment), date and close (Float).



This is my code:



CREATE DEFINER=`root`@`localhost` PROCEDURE `moving_avg`(IN periode INT)
NO SQL
BEGIN
select hist_ask.id, hist_ask.date, hist_ask.close, round(avg(past.close),2) as mavg
from hist_ask
join hist_ask as past
on past.id between hist_ask.id - (periode-1) and hist_ask.id
group by hist_ask.id, hist_ask.close
ORDER BY hist_ask.id DESC
LIMIT 10;
END


The table I use looks like this



id , date       , close
20 , 2018-10-13 , 12086.5
19 , 2018-10-12 , 12002.2
17 , 2018-10-11 , 12007.0
and so on


The output looks like this:



The output I get from the query



Thanks in advance!







mysql xampp moving-average






share|improve this question















share|improve this question













share|improve this question




share|improve this question








edited Nov 12 at 19:42









marc_s

569k12811001250




569k12811001250










asked Nov 12 at 12:40









rodarmy

12




12












  • See: Why should I provide an MCVE for what seems to me to be a very simple SQL query?
    – Strawberry
    Nov 12 at 12:43










  • Please add some sample data to your question, specifically showing the cases where your current query is failing. Also, add the current query's output and expected output
    – Madhur Bhaiya
    Nov 12 at 12:44










  • the problem is it is actualy working but if there is a id missing it calculates wrong bacause it cannot find the right id. I couldn´t find any propper solution to just use for ech row the next 3 rows for the average instead of using a specific id.
    – rodarmy
    Nov 12 at 16:12


















  • See: Why should I provide an MCVE for what seems to me to be a very simple SQL query?
    – Strawberry
    Nov 12 at 12:43










  • Please add some sample data to your question, specifically showing the cases where your current query is failing. Also, add the current query's output and expected output
    – Madhur Bhaiya
    Nov 12 at 12:44










  • the problem is it is actualy working but if there is a id missing it calculates wrong bacause it cannot find the right id. I couldn´t find any propper solution to just use for ech row the next 3 rows for the average instead of using a specific id.
    – rodarmy
    Nov 12 at 16:12
















See: Why should I provide an MCVE for what seems to me to be a very simple SQL query?
– Strawberry
Nov 12 at 12:43




See: Why should I provide an MCVE for what seems to me to be a very simple SQL query?
– Strawberry
Nov 12 at 12:43












Please add some sample data to your question, specifically showing the cases where your current query is failing. Also, add the current query's output and expected output
– Madhur Bhaiya
Nov 12 at 12:44




Please add some sample data to your question, specifically showing the cases where your current query is failing. Also, add the current query's output and expected output
– Madhur Bhaiya
Nov 12 at 12:44












the problem is it is actualy working but if there is a id missing it calculates wrong bacause it cannot find the right id. I couldn´t find any propper solution to just use for ech row the next 3 rows for the average instead of using a specific id.
– rodarmy
Nov 12 at 16:12




the problem is it is actualy working but if there is a id missing it calculates wrong bacause it cannot find the right id. I couldn´t find any propper solution to just use for ech row the next 3 rows for the average instead of using a specific id.
– rodarmy
Nov 12 at 16:12












1 Answer
1






active

oldest

votes


















0














I finaly make it work using a temporary table.
I can now give two parameters to the procedure:




  1. periode: the periode the moving average is calculated with

  2. _limit: limits the result set


Important for performance is the



ALTER TABLE temp
ENGINE=MyISAM;


statement because it reduces the execution time significantly. For example when proccessing 2000 rows it needs about 0.5 seconds, before adding it it needed about 6 seconds



Thats the code:



CREATE DEFINER=`root`@`localhost` PROCEDURE `moving_avg`(IN periode INT, IN _limit INT)
NO SQL
BEGIN

DECLARE a FLOAT DEFAULT 0;
DECLARE i INT DEFAULT 0;
DECLARE count_limit INT DEFAULT 0;

SET @rn=0;
CREATE TEMPORARY TABLE IF NOT EXISTS temp (
SELECT
@rn:=@rn+1 AS pri_id,
date,
close , a AS
mavg
FROM hist_ask);

ALTER TABLE temp
ENGINE=MyISAM;

SET i=(SELECT pri_id FROM temp ORDER by pri_id DESC LIMIT 1);
SET count_limit= (i-_limit)-periode;


WHILE i>count_limit DO
SET a= (SELECT avg(close) FROM temp WHERE pri_id BETWEEN i-(periode-1) AND i);
UPDATE temp SET mavg=a WHERE pri_id=i;
SET i=i-1;
END WHILE;

SELECT pri_id,date,close,round(mavg,2) AS mavg FROM temp ORDER BY pri_id DESC LIMIT _limit;


DROP TABLE temp;

END


The result looks like that:



CALL `moving_avg`(3,5)



  • pri_id, date, close, mavg

  • 1999 2018-09-13 12086.6 12032.03

  • 1998 2018-09-11 12002.2 11983.47

  • 1997 2018-09-10 12007.3 11976.53

  • 1996 2018-09-07 11940.9 11993.80

  • 1995 2018-09-06 11981.4 12089.23


5 row(s) returned 0.047 sec / 0.000 sec






share|improve this answer























    Your Answer






    StackExchange.ifUsing("editor", function () {
    StackExchange.using("externalEditor", function () {
    StackExchange.using("snippets", function () {
    StackExchange.snippets.init();
    });
    });
    }, "code-snippets");

    StackExchange.ready(function() {
    var channelOptions = {
    tags: "".split(" "),
    id: "1"
    };
    initTagRenderer("".split(" "), "".split(" "), channelOptions);

    StackExchange.using("externalEditor", function() {
    // Have to fire editor after snippets, if snippets enabled
    if (StackExchange.settings.snippets.snippetsEnabled) {
    StackExchange.using("snippets", function() {
    createEditor();
    });
    }
    else {
    createEditor();
    }
    });

    function createEditor() {
    StackExchange.prepareEditor({
    heartbeatType: 'answer',
    autoActivateHeartbeat: false,
    convertImagesToLinks: true,
    noModals: true,
    showLowRepImageUploadWarning: true,
    reputationToPostImages: 10,
    bindNavPrevention: true,
    postfix: "",
    imageUploader: {
    brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
    contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
    allowUrls: true
    },
    onDemand: true,
    discardSelector: ".discard-answer"
    ,immediatelyShowMarkdownHelp:true
    });


    }
    });














    draft saved

    draft discarded


















    StackExchange.ready(
    function () {
    StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53262401%2fcalculation-of-a-moving-average-using-mysql-leads-to-problems-if-there-are-gaps%23new-answer', 'question_page');
    }
    );

    Post as a guest















    Required, but never shown

























    1 Answer
    1






    active

    oldest

    votes








    1 Answer
    1






    active

    oldest

    votes









    active

    oldest

    votes






    active

    oldest

    votes









    0














    I finaly make it work using a temporary table.
    I can now give two parameters to the procedure:




    1. periode: the periode the moving average is calculated with

    2. _limit: limits the result set


    Important for performance is the



    ALTER TABLE temp
    ENGINE=MyISAM;


    statement because it reduces the execution time significantly. For example when proccessing 2000 rows it needs about 0.5 seconds, before adding it it needed about 6 seconds



    Thats the code:



    CREATE DEFINER=`root`@`localhost` PROCEDURE `moving_avg`(IN periode INT, IN _limit INT)
    NO SQL
    BEGIN

    DECLARE a FLOAT DEFAULT 0;
    DECLARE i INT DEFAULT 0;
    DECLARE count_limit INT DEFAULT 0;

    SET @rn=0;
    CREATE TEMPORARY TABLE IF NOT EXISTS temp (
    SELECT
    @rn:=@rn+1 AS pri_id,
    date,
    close , a AS
    mavg
    FROM hist_ask);

    ALTER TABLE temp
    ENGINE=MyISAM;

    SET i=(SELECT pri_id FROM temp ORDER by pri_id DESC LIMIT 1);
    SET count_limit= (i-_limit)-periode;


    WHILE i>count_limit DO
    SET a= (SELECT avg(close) FROM temp WHERE pri_id BETWEEN i-(periode-1) AND i);
    UPDATE temp SET mavg=a WHERE pri_id=i;
    SET i=i-1;
    END WHILE;

    SELECT pri_id,date,close,round(mavg,2) AS mavg FROM temp ORDER BY pri_id DESC LIMIT _limit;


    DROP TABLE temp;

    END


    The result looks like that:



    CALL `moving_avg`(3,5)



    • pri_id, date, close, mavg

    • 1999 2018-09-13 12086.6 12032.03

    • 1998 2018-09-11 12002.2 11983.47

    • 1997 2018-09-10 12007.3 11976.53

    • 1996 2018-09-07 11940.9 11993.80

    • 1995 2018-09-06 11981.4 12089.23


    5 row(s) returned 0.047 sec / 0.000 sec






    share|improve this answer




























      0














      I finaly make it work using a temporary table.
      I can now give two parameters to the procedure:




      1. periode: the periode the moving average is calculated with

      2. _limit: limits the result set


      Important for performance is the



      ALTER TABLE temp
      ENGINE=MyISAM;


      statement because it reduces the execution time significantly. For example when proccessing 2000 rows it needs about 0.5 seconds, before adding it it needed about 6 seconds



      Thats the code:



      CREATE DEFINER=`root`@`localhost` PROCEDURE `moving_avg`(IN periode INT, IN _limit INT)
      NO SQL
      BEGIN

      DECLARE a FLOAT DEFAULT 0;
      DECLARE i INT DEFAULT 0;
      DECLARE count_limit INT DEFAULT 0;

      SET @rn=0;
      CREATE TEMPORARY TABLE IF NOT EXISTS temp (
      SELECT
      @rn:=@rn+1 AS pri_id,
      date,
      close , a AS
      mavg
      FROM hist_ask);

      ALTER TABLE temp
      ENGINE=MyISAM;

      SET i=(SELECT pri_id FROM temp ORDER by pri_id DESC LIMIT 1);
      SET count_limit= (i-_limit)-periode;


      WHILE i>count_limit DO
      SET a= (SELECT avg(close) FROM temp WHERE pri_id BETWEEN i-(periode-1) AND i);
      UPDATE temp SET mavg=a WHERE pri_id=i;
      SET i=i-1;
      END WHILE;

      SELECT pri_id,date,close,round(mavg,2) AS mavg FROM temp ORDER BY pri_id DESC LIMIT _limit;


      DROP TABLE temp;

      END


      The result looks like that:



      CALL `moving_avg`(3,5)



      • pri_id, date, close, mavg

      • 1999 2018-09-13 12086.6 12032.03

      • 1998 2018-09-11 12002.2 11983.47

      • 1997 2018-09-10 12007.3 11976.53

      • 1996 2018-09-07 11940.9 11993.80

      • 1995 2018-09-06 11981.4 12089.23


      5 row(s) returned 0.047 sec / 0.000 sec






      share|improve this answer


























        0












        0








        0






        I finaly make it work using a temporary table.
        I can now give two parameters to the procedure:




        1. periode: the periode the moving average is calculated with

        2. _limit: limits the result set


        Important for performance is the



        ALTER TABLE temp
        ENGINE=MyISAM;


        statement because it reduces the execution time significantly. For example when proccessing 2000 rows it needs about 0.5 seconds, before adding it it needed about 6 seconds



        Thats the code:



        CREATE DEFINER=`root`@`localhost` PROCEDURE `moving_avg`(IN periode INT, IN _limit INT)
        NO SQL
        BEGIN

        DECLARE a FLOAT DEFAULT 0;
        DECLARE i INT DEFAULT 0;
        DECLARE count_limit INT DEFAULT 0;

        SET @rn=0;
        CREATE TEMPORARY TABLE IF NOT EXISTS temp (
        SELECT
        @rn:=@rn+1 AS pri_id,
        date,
        close , a AS
        mavg
        FROM hist_ask);

        ALTER TABLE temp
        ENGINE=MyISAM;

        SET i=(SELECT pri_id FROM temp ORDER by pri_id DESC LIMIT 1);
        SET count_limit= (i-_limit)-periode;


        WHILE i>count_limit DO
        SET a= (SELECT avg(close) FROM temp WHERE pri_id BETWEEN i-(periode-1) AND i);
        UPDATE temp SET mavg=a WHERE pri_id=i;
        SET i=i-1;
        END WHILE;

        SELECT pri_id,date,close,round(mavg,2) AS mavg FROM temp ORDER BY pri_id DESC LIMIT _limit;


        DROP TABLE temp;

        END


        The result looks like that:



        CALL `moving_avg`(3,5)



        • pri_id, date, close, mavg

        • 1999 2018-09-13 12086.6 12032.03

        • 1998 2018-09-11 12002.2 11983.47

        • 1997 2018-09-10 12007.3 11976.53

        • 1996 2018-09-07 11940.9 11993.80

        • 1995 2018-09-06 11981.4 12089.23


        5 row(s) returned 0.047 sec / 0.000 sec






        share|improve this answer














        I finaly make it work using a temporary table.
        I can now give two parameters to the procedure:




        1. periode: the periode the moving average is calculated with

        2. _limit: limits the result set


        Important for performance is the



        ALTER TABLE temp
        ENGINE=MyISAM;


        statement because it reduces the execution time significantly. For example when proccessing 2000 rows it needs about 0.5 seconds, before adding it it needed about 6 seconds



        Thats the code:



        CREATE DEFINER=`root`@`localhost` PROCEDURE `moving_avg`(IN periode INT, IN _limit INT)
        NO SQL
        BEGIN

        DECLARE a FLOAT DEFAULT 0;
        DECLARE i INT DEFAULT 0;
        DECLARE count_limit INT DEFAULT 0;

        SET @rn=0;
        CREATE TEMPORARY TABLE IF NOT EXISTS temp (
        SELECT
        @rn:=@rn+1 AS pri_id,
        date,
        close , a AS
        mavg
        FROM hist_ask);

        ALTER TABLE temp
        ENGINE=MyISAM;

        SET i=(SELECT pri_id FROM temp ORDER by pri_id DESC LIMIT 1);
        SET count_limit= (i-_limit)-periode;


        WHILE i>count_limit DO
        SET a= (SELECT avg(close) FROM temp WHERE pri_id BETWEEN i-(periode-1) AND i);
        UPDATE temp SET mavg=a WHERE pri_id=i;
        SET i=i-1;
        END WHILE;

        SELECT pri_id,date,close,round(mavg,2) AS mavg FROM temp ORDER BY pri_id DESC LIMIT _limit;


        DROP TABLE temp;

        END


        The result looks like that:



        CALL `moving_avg`(3,5)



        • pri_id, date, close, mavg

        • 1999 2018-09-13 12086.6 12032.03

        • 1998 2018-09-11 12002.2 11983.47

        • 1997 2018-09-10 12007.3 11976.53

        • 1996 2018-09-07 11940.9 11993.80

        • 1995 2018-09-06 11981.4 12089.23


        5 row(s) returned 0.047 sec / 0.000 sec







        share|improve this answer














        share|improve this answer



        share|improve this answer








        edited Nov 21 at 6:07

























        answered Nov 21 at 3:58









        rodarmy

        12




        12






























            draft saved

            draft discarded




















































            Thanks for contributing an answer to Stack Overflow!


            • Please be sure to answer the question. Provide details and share your research!

            But avoid



            • Asking for help, clarification, or responding to other answers.

            • Making statements based on opinion; back them up with references or personal experience.


            To learn more, see our tips on writing great answers.





            Some of your past answers have not been well-received, and you're in danger of being blocked from answering.


            Please pay close attention to the following guidance:


            • Please be sure to answer the question. Provide details and share your research!

            But avoid



            • Asking for help, clarification, or responding to other answers.

            • Making statements based on opinion; back them up with references or personal experience.


            To learn more, see our tips on writing great answers.




            draft saved


            draft discarded














            StackExchange.ready(
            function () {
            StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53262401%2fcalculation-of-a-moving-average-using-mysql-leads-to-problems-if-there-are-gaps%23new-answer', 'question_page');
            }
            );

            Post as a guest















            Required, but never shown





















































            Required, but never shown














            Required, but never shown












            Required, but never shown







            Required, but never shown

































            Required, but never shown














            Required, but never shown












            Required, but never shown







            Required, but never shown







            Popular posts from this blog

            Xamarin.iOS Cant Deploy on Iphone

            Glorious Revolution

            Dulmage-Mendelsohn matrix decomposition in Python