PostgreSQL 10 - Parallel configuration
There are four configuration parameters to enable parallelism and tune it, but the PostgreSQL documentation doesn't say anything about concrete values or how to calculate them. My questions are:

1- How do I calculate the values of max_parallel_workers, max_parallel_workers_per_gather and max_worker_processes?

2- work_mem can be calculated based on the number of connections and the available memory (RAM), but does work_mem need to change if I enable parallelism?

My supposition is: if the machine has 8 cores, then max_parallel_workers is 8 and the worker-process and per-gather values are 32 (8*4); I took the number 4 from the original configuration, which is 4 gathers per parallel worker.
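To make the supposition concrete, this is roughly what it would look like in postgresql.conf for an 8-core machine. The numbers follow the 8*4 guess above; they are illustrative, not a tested recommendation:

```
# postgresql.conf - hypothetical values for an 8-core machine,
# following the 8 * 4 supposition above (not a recommendation)
max_worker_processes = 32            # total background-worker pool
max_parallel_workers = 8             # one per core, per the supposition
max_parallel_workers_per_gather = 2  # PostgreSQL 10 default is 2
work_mem = 16MB                      # note: each parallel worker can use
                                     # up to work_mem per sort/hash node
```

Note that work_mem is per sort/hash operation and per process, so enabling parallelism multiplies the potential memory use of a single query by the number of workers it gets.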
Tags: postgresql, postgresql-10
edited Oct 31 at 8:24
asked Oct 30 at 8:14
HudsonPH
2 Answers
The question is straightforward, but the answer is not. I'll describe it a bit more broadly, so if something sounds obvious to you, just skip it.

First, how parallel query works, and what these parameters are for, is described in the PostgreSQL documentation. In short: PostgreSQL has a pool of processes that can do work in the background, and their maximum number is limited by max_worker_processes. A table scan can take a long, long time, so it makes sense to have more processes fetching data. That work can be done in the background, by background workers. The query plan nodes they can execute are Gather and Gather Merge.

Each background worker has its own memory for sorting and other execution-related work. That memory is in use the whole time the workers run, so keep it in mind to be sure the system doesn't start swapping.

Beyond that, try to find the best number of workers per query; by default that is 2, so if everything works as expected, two background workers are used to gather data. The next question is how many such queries run concurrently, meaning the heavy queries that actually need parallel processing. With those two numbers, say 4 workers per query and 10 concurrent queries, you need 40 workers for that alone. You can calculate whether that is acceptable, or experiment with it. Either way, there is one more parameter, max_worker_processes: on top of those 40 workers for parallel processing, you need more workers for other tasks, such as replication.

Does 40 sound reasonable? There are two counterpoints. By default PostgreSQL is set up as an OLTP database, so the system is tuned for a different kind of workload, and this kind of change can bring good results. On the other hand, there is a single bgwriter, so in the end one process deals with the I/O. It relies on the OS, but still, it is one process.

So the answer is far from perfect: you need to experiment, collect your own statistics, and decide.
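The arithmetic above can be sketched as a configuration fragment. The 4-workers-per-query and 10-concurrent-queries figures are the example numbers from this answer, and the headroom for non-query workers is a hypothetical choice:

```
# Example from the answer: 4 workers/query * 10 concurrent queries = 40
max_parallel_workers_per_gather = 4   # workers one Gather node may request
max_parallel_workers = 40             # cap across all parallel queries
max_worker_processes = 48             # 40 + assumed headroom for replication,
                                      # logical decoding, extensions, etc.
```

The key relationship is that max_worker_processes bounds the whole pool, so it must exceed max_parallel_workers by enough slack for every non-query background worker you rely on.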
answered Nov 8 at 13:41
Michał Zaborowski
That is not close to the answer I wanted, but thanks.
– HudsonPH
Nov 12 at 11:31
This gets really tricky. For instance, I don't see where your answer is any better. For standby servers you need more max worker processes than parallel workers if replication is to keep working. I know the idea of matching the number of active processes to the number of cores, but I would check whether that still holds under your particular I/O conditions. That is one of the reasons they decided on safe default settings here. Long tradition...
– Michał Zaborowski
Nov 12 at 22:47
After some searching I found some answers. This can help anyone who wants to enable parallelism with a baseline configuration. Say you have 4 CPU cores:

max_worker_processes should be the number of cores, and max_parallel_workers should be the same amount:

max_worker_processes = 4
max_parallel_workers = 4

The per-gather setting is more complex, because this value can be tuned based on your needs and resources; testing is necessary to find the best value, but as a starting point you can use cores/2:

max_parallel_workers_per_gather = 2

This is not a final answer; some points are still missing. I am still searching and will update this answer, or wait for a better one.
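One way to check that settings like these actually take effect is to ask the planner for a parallel plan. This assumes a table large enough that the planner considers a parallel scan worthwhile; big_table is a placeholder name:

```sql
-- Confirm the current setting
SHOW max_parallel_workers_per_gather;

-- "Workers Planned: 2" (and "Workers Launched: 2") in the output
-- indicates a Gather node using the per-gather setting above
EXPLAIN (ANALYZE)
SELECT count(*) FROM big_table;
```

If no Gather node appears, the table may be below min_parallel_table_scan_size, or the per-gather setting may be 0, which disables parallel query entirely.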
answered Nov 12 at 14:12
HudsonPH