PostgreSQL 10 - Parallel configuration


























There are 4 configuration settings for enabling and tuning parallelism, but the PostgreSQL documentation doesn't say anything about what values to use or how to calculate them. My questions are:




1. How do I calculate the values of max_parallel_workers,
max_parallel_workers_per_gather and max_worker_processes?



2. work_mem can be calculated from the number of connections and the
amount of memory (RAM), but does work_mem need to change at all if I
enable parallelism?




My supposition is: if the machine has 8 cores, then max_parallel_workers is 8, and the values for worker processes and per gather are 32 (8 * 4); the factor of 4 I took from the original configuration, which is 4 gathers per parallel worker.
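For reference, the settings mentioned above are shown here with the stock PostgreSQL 10 defaults; these are just the shipped defaults, not a recommendation for any particular machine:

    # postgresql.conf -- PostgreSQL 10 defaults (for reference only)
    max_worker_processes = 8               # size of the background worker pool; requires a server restart to change
    max_parallel_workers = 8               # workers available to parallel queries; effectively limited by max_worker_processes
    max_parallel_workers_per_gather = 2    # workers a single Gather / Gather Merge node may use
    work_mem = 4MB                         # per sort/hash operation, per process, including each parallel worker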







































Tags: postgresql, postgresql-10






asked Oct 30 at 8:14 by HudsonPH; edited Oct 31 at 8:24
























2 Answers






































The question is quite obvious, but the answer is not.



I'll try to describe it a bit more broadly, so if something sounds obvious to you, just skip it.



First, how parallel query works is described in the PostgreSQL documentation's chapter on parallel query, and what these parameters are for is described in the resource-configuration section. In other words, PG has a pool of processes that can do work in the background, and their maximum number is limited by max_worker_processes. A table scan can take a long, long time, so it is wise to have more processes fetching data. That can be done in the background, by... background workers. The query plan nodes they can execute are Gather and Gather Merge.
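As an illustration of where those Gather nodes show up, here is a sketch of an EXPLAIN ANALYZE for a parallel aggregate; the table name and the plan details are hypothetical, only the shape of the plan matters:

    -- sketch only: 'big_table' is a hypothetical table
    EXPLAIN (ANALYZE) SELECT count(*) FROM big_table;

    -- abridged, illustrative plan shape:
    -- Finalize Aggregate
    --   ->  Gather
    --         Workers Planned: 2
    --         Workers Launched: 2
    --         ->  Partial Aggregate
    --               ->  Parallel Seq Scan on big_table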



Each background worker has its own memory for sorting and other parts of execution. That memory is in use for as long as the workers run, so it is better to keep it in mind and make sure the system is not pushed into swap...
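A rough back-of-the-envelope check of that memory use, with all counts below being assumptions rather than measurements: each sort or hash step may use up to work_mem in every process of the query, i.e. the leader plus each parallel worker.

    # worst-case sketch (illustrative numbers)
    # work_mem = 4MB (PostgreSQL 10 default)
    # processes per parallel query = 1 leader + 2 workers = 3
    # sort/hash nodes in the plan  = 2 (assumed)
    # peak for one such query     ~= 4MB * 3 * 2 = 24MB
    # multiply by the number of such queries running concurrently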



Besides that, try to find the best number of workers per query; by default that is 2, so if everything works as expected, two background workers are used to gather data (see the sketch below). The next question is how many such queries run at the same time, meaning the heavy queries that actually need parallel processing. With those two numbers, say 4 workers per query and 10 concurrent queries, you need 40 workers for that alone. You can calculate whether that is acceptable, or experiment with it. Either way, there is one more parameter, max_worker_processes: on top of those 40 workers for parallel processing, you need more workers for other tasks, such as replication.
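A minimal sketch of how that budget could be written down, assuming the 4-workers-per-query and 10-concurrent-queries figures above; the headroom of 8 extra worker processes is an assumption, not a rule:

    # postgresql.conf -- sketch following the example above
    max_parallel_workers_per_gather = 4    # workers one Gather node may request
    max_parallel_workers = 40              # 4 workers * 10 concurrent parallel queries
    max_worker_processes = 48              # 40 + assumed headroom for replication, extensions, etc.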



Does 40 sound reasonable? There are two counterpoints here. By default PG is tuned as an OLTP database, so the stock settings are prepared for a different kind of workload, and a change like this can bring good results. On the other hand, there is only one bgwriter, so in the end a single process deals with I/O. It relies on the OS, but still, it is one process.



So the answer is far from perfect: you need to experiment, collect your own statistics, and decide.






answered Nov 8 at 13:41 by Michał Zaborowski





















• It is not close to the answer I wanted, but thanks.
  – HudsonPH
  Nov 12 at 11:31










• It gets really tricky. For instance, I do not see where your answer is any better... For standby replicas you need max_worker_processes higher than the parallel workers if replication is to keep working. I know the idea of matching the number of active processes to the number of cores, but I would check whether that still holds under your particular I/O conditions. That is one of the reasons the defaults here are conservative. Long tradition...
  – Michał Zaborowski
  Nov 12 at 22:47

































After some searching I found some answers; this can help anyone who wants to enable parallelism with a baseline configuration. Say the machine has 4 cores (CPUs):



max_worker_processes should be the number of cores, and max_parallel_workers should be set to the same amount:



          max_worker_processes = 4
          max_parallel_workers = 4


The per-gather setting is more complex, because this value should be adjusted to your workload and resources, and you need to test to find the best value; as a starting point you can use cores / 2:



          max_parallel_workers_per_gather = 2


This is not a final answer; there are some missing points... I am still searching and will keep updating this answer, or wait for a better one.
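As a quick way to check that the settings take effect, a sketch along these lines can be used ('some_big_table' is a hypothetical table name); max_parallel_workers_per_gather only needs a configuration reload, while max_worker_processes requires a server restart:

    -- sketch: apply and verify (table name is hypothetical)
    ALTER SYSTEM SET max_parallel_workers_per_gather = 2;
    SELECT pg_reload_conf();
    SHOW max_parallel_workers_per_gather;
    EXPLAIN (ANALYZE) SELECT count(*) FROM some_big_table;
    -- expect a Gather node reporting "Workers Planned: 2" / "Workers Launched: 2"
    -- if the planner considers the parallel plan cheapest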






answered Nov 12 at 14:12 by HudsonPH




















