Find difference between max and second max, grouped by columns in MySql (8.0.13), without using the order by...




















I have a table with the columns ID, ad_id, and amount_time, which hold the ID of the user, the ad they saw, and how long they watched it. Here is an example of the data:



table name: ads

ID | ad_id | amount_time
1  | 2     | 600
1  | 3     | 300
3  | 1     | 400
1  | 3     | 100
1  | 1     | 700


We want the result to show the difference between the max and second max of amount_time, grouped by ID and ad_id.

So the result is (only the group ID = 1, ad_id = 3 has two rows, and 300 - 100 = 200):

ID | ad_id | diff_amount_time
1  | 3     | 200


I can get the max and second max of the whole table by executing:

SELECT
  (SELECT MAX(amount_time) FROM ads) AS maxtime,
  (SELECT MAX(amount_time) FROM ads
   WHERE amount_time NOT IN (SELECT MAX(amount_time) FROM ads)) AS second_max_time


However, I am having trouble incorporating the GROUP BY clause to get my result. I know there is a way to use ORDER BY and LIMIT 2 to get the max and second max, but I expect this to be computationally expensive, and I want to know if there is another solution that does not order the amount_time column.






























  • 1





    What is your MySQL server version?

    – Chowkidar Madhur Bhaiya
    Nov 16 '18 at 19:24











  • Just edited it in the title. Thanks for the reminder.

    – Kevin Sun
    Nov 16 '18 at 19:26











  • How do you get 200?

    – Eric
    Nov 16 '18 at 19:29






  • 1





    @KevinSun The MAX() function will also do a full table scan if there is no index, and the same applies to ORDER BY. MySQL has significantly optimized ORDER BY. I would recommend reading this article: dev.mysql.com/doc/refman/8.0/en/order-by-optimization.html

    – Chowkidar Madhur Bhaiya
    Nov 16 '18 at 19:52






  • 1





    In fact, the problem with your approach is its use of multiple subqueries and multiple MAX() calls, which will be much more inefficient. You can check the EXPLAIN output to confirm this.

    – Chowkidar Madhur Bhaiya
    Nov 16 '18 at 19:53
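
For comparison, the subquery idea from the question can be extended to groups without sorting amount_time. The sketch below is not from the original thread. It uses Python's built-in sqlite3 module as a stand-in for MySQL (an assumption, made only so the example is self-contained), and the helper name grouped_diff is hypothetical. It treats "second max" as the largest value strictly below the group's max, which matches the NOT IN query in the question. The SQL uses only MAX(), GROUP BY, and a join, so it should carry over to MySQL, though its cost there would need checking with EXPLAIN.

```python
import sqlite3

# Sample rows from the question: (ID, ad_id, amount_time)
ROWS = [(1, 2, 600), (1, 3, 300), (3, 1, 400), (1, 3, 100), (1, 1, 700)]

# Step 1 (derived table m): the max per (ID, ad_id) group.
# Step 2 (join): keep only rows strictly below that max, then take
# their MAX() as the second max. Groups with no such row drop out
# because the join is an inner join.
GROUPED_DIFF_SQL = """
SELECT m.ID, m.ad_id, m.max_time - MAX(a.amount_time) AS diff_amount_time
FROM (
    SELECT ID, ad_id, MAX(amount_time) AS max_time
    FROM ads
    GROUP BY ID, ad_id
) AS m
JOIN ads AS a
  ON a.ID = m.ID
 AND a.ad_id = m.ad_id
 AND a.amount_time < m.max_time
GROUP BY m.ID, m.ad_id, m.max_time
"""


def grouped_diff(rows):
    """Load rows into an in-memory table and run the grouped query."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE ads (ID INTEGER, ad_id INTEGER, amount_time INTEGER)"
    )
    conn.executemany("INSERT INTO ads VALUES (?, ?, ?)", rows)
    result = conn.execute(GROUPED_DIFF_SQL).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    print(grouped_diff(ROWS))
```

On the sample data only the group (1, 3) has a value below its max, so it is the only group returned.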


















mysql group-by max














edited Nov 16 '18 at 19:26







Kevin Sun

















asked Nov 16 '18 at 18:39









Kevin Sun




















1 Answer














In MySQL 8.0.2+, the simplest and possibly most performant way is to use window functions.

We will use the ROW_NUMBER() function to number the rows within each combination of ID and ad_id. The numbering follows descending amount_time, so the highest amount_time in a group gets row number 1 and the second highest gets row number 2.

We then use this result set as a derived table and aggregate (GROUP BY) on ID and ad_id. Conditional CASE .. WHEN expressions pick out the highest and second-highest values, so we can take their difference within every group.



SELECT
  dt.ID,
  dt.ad_id,
  (MAX(CASE WHEN dt.row_no = 1 THEN dt.amount_time END) -
   MAX(CASE WHEN dt.row_no = 2 THEN dt.amount_time END)) AS diff_amount_time
FROM
(
  SELECT
    ID,
    ad_id,
    amount_time,
    -- number the rows within each (ID, ad_id) group, highest amount_time first
    ROW_NUMBER() OVER (PARTITION BY ID, ad_id
                       ORDER BY amount_time DESC) AS row_no
  FROM ads
) AS dt
GROUP BY dt.ID, dt.ad_id
-- remove groups that have no second-highest value:
-- the second MAX() is then NULL, and a value minus NULL is NULL
HAVING diff_amount_time IS NOT NULL
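
The window-function approach can be tried out on the sample data from the question. This is a minimal sketch, not part of the original answer. It assumes Python's built-in sqlite3 module as a stand-in for MySQL 8 and an SQLite build of version 3.25 or later, which is needed for window functions. The helper name window_diff and its rank_fn parameter are hypothetical. The DENSE_RANK variant is an addition that illustrates one way to handle ties: with ROW_NUMBER, two rows sharing the highest value yield a difference of 0, whereas DENSE_RANK compares the two highest distinct values.

```python
import sqlite3

# Sample rows from the question: (ID, ad_id, amount_time)
ROWS = [(1, 2, 600), (1, 3, 300), (3, 1, 400), (1, 3, 100), (1, 1, 700)]

# {rank_fn} is filled in with ROW_NUMBER or DENSE_RANK (see window_diff).
WINDOW_DIFF_SQL = """
SELECT
  dt.ID,
  dt.ad_id,
  MAX(CASE WHEN dt.row_no = 1 THEN dt.amount_time END)
    - MAX(CASE WHEN dt.row_no = 2 THEN dt.amount_time END) AS diff_amount_time
FROM (
  SELECT
    ID,
    ad_id,
    amount_time,
    {rank_fn}() OVER (PARTITION BY ID, ad_id
                      ORDER BY amount_time DESC) AS row_no
  FROM ads
) AS dt
GROUP BY dt.ID, dt.ad_id
HAVING diff_amount_time IS NOT NULL
"""


def window_diff(rows, rank_fn="ROW_NUMBER"):
    """Run the window-function query on an in-memory copy of the rows."""
    # Only allow known ranking functions, since the name is spliced into SQL.
    if rank_fn not in ("ROW_NUMBER", "DENSE_RANK"):
        raise ValueError("unsupported ranking function: " + rank_fn)
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE ads (ID INTEGER, ad_id INTEGER, amount_time INTEGER)"
    )
    conn.executemany("INSERT INTO ads VALUES (?, ?, ?)", rows)
    result = conn.execute(WINDOW_DIFF_SQL.format(rank_fn=rank_fn)).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    print(window_diff(ROWS))
    # A group where the highest value appears twice
    tied = [(1, 1, 5), (1, 1, 5), (1, 1, 3)]
    print(window_diff(tied), window_diff(tied, "DENSE_RANK"))
```

Which tie behavior is wanted depends on whether "second max" means the second row or the second distinct value. The question's NOT IN query implies the latter.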













        edited Nov 16 '18 at 19:49

























        answered Nov 16 '18 at 19:37









        Chowkidar Madhur Bhaiya































