Overhead of multiple synchronized on the same object
Consider this code:
    void A() {
        synchronized (obj) {
            for (int i = 0; i < 1000; i++) {
                B();
            }
        }
    }

    void B() {
        synchronized (obj) {
            // Do something
        }
    }
How much overhead does "synchronized" add when calling A? Will it be close to the overhead of a single "synchronized"?
java synchronized
asked Nov 15 '18 at 16:23 by Shayan
edited Nov 15 '18 at 16:31
What do you mean with "overhead"? Memory consumption? Runtime? Something else?
– Korashen
Nov 15 '18 at 16:26
I mean Runtime!
– Shayan
Nov 15 '18 at 16:27
etutorials.org/Programming/Java+performance+tuning/…
– fantaghirocco
Nov 15 '18 at 16:27
It will be more, as synchronized is getting called 1000 times more than in the single call use case. So the VM has to put 1000 additional lock tokens on obj.
– Korashen
Nov 15 '18 at 16:28
@Korashen Won't it understand that it has already acquired the lock?
– Shayan
Nov 15 '18 at 16:29
2 Answers
The answer to this (legitimate) question depends on the OS, hardware and specific VM implementation.
Setting aside the cost of the method call itself, re-acquiring the lock may cost next to nothing on one platform (a modern processor, OS and VM) and much more on another (purely software processor emulation, for example). On a VM with a single green thread it may be close to zero, apart from the call overhead. The cost will differ even between ARM and Intel processors of comparable power.
synchronized is usually implemented inside a VM on top of OS synchronization primitives, with some heuristics to speed up common cases. The OS, in turn, uses hardware instructions and heuristics of its own to perform this task. Re-acquiring a synchronization primitive that the thread already holds is usually exceptionally efficient in an OS and very efficient in a typical production-grade VM.
With a modern VM on Windows or Linux and an Intel/AMD processor, it usually costs only a few CPU cycles (assuming an otherwise idle machine) and is in the low nanoseconds range.
Note that, in general, this is a very complex topic. Multiple layers of software and hardware are involved, as is the effect of other tasks running on the same hardware resources. Rigorous research into even a small sub-topic here could fill multiple Ph.D. theses.
In practice, though, my advice is to assume that the cost of the second (nested) synchronized in small loops is zero unless you encounter a particular bottleneck (which is quite unlikely).
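If a bottleneck is suspected, a rough measurement can give a first impression before reaching for a proper benchmark harness. The following is only a sketch under stated assumptions: the class name `NestedSyncTiming`, the warm-up count, and the shared `counter` are made up for illustration, and a naive `System.nanoTime()` comparison like this is easily distorted by JIT compilation and by lock optimizations the VM may apply, so the printed numbers should be treated as indicative at best.

```java
class NestedSyncTiming {
    private static final Object obj = new Object();
    private static long counter; // hypothetical workload: a shared counter

    // Mirrors A() from the question: outer lock, nested lock on every iteration.
    static void nested(int iterations) {
        synchronized (obj) {
            for (int i = 0; i < iterations; i++) {
                inner();
            }
        }
    }

    // Mirrors B() from the question: re-acquires the lock already held by the caller.
    static void inner() {
        synchronized (obj) {
            counter++;
        }
    }

    // Baseline: the same work under a single synchronized block.
    static void single(int iterations) {
        synchronized (obj) {
            for (int i = 0; i < iterations; i++) {
                counter++;
            }
        }
    }

    static long timeNanos(Runnable task) {
        long start = System.nanoTime();
        task.run();
        return System.nanoTime() - start;
    }

    static long counter() {
        return counter;
    }

    public static void main(String[] args) {
        int iterations = 1_000;
        // Warm-up so that both variants are likely to be JIT-compiled before timing.
        for (int i = 0; i < 10_000; i++) {
            nested(iterations);
            single(iterations);
        }
        long nestedNs = timeNanos(() -> nested(iterations));
        long singleNs = timeNanos(() -> single(iterations));
        System.out.println("nested: " + nestedNs + " ns, single: " + singleNs + " ns");
    }
}
```

For numbers worth acting on, a dedicated benchmarking tool such as JMH is the safer choice; this sketch only shows what is being compared.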
If there is a large number of iterations, the nested synchronized will definitely cost more than a single synchronized, and the overall effect depends on what you are doing inside the loop. Usually, each iteration does some work that makes the relative overhead negligible. In some cases, though, it may prevent loop optimization and add substantial overhead (substantial compared with a single synchronized, not in absolute terms). However, in common practical cases with huge loops, one should consider a different design that avoids holding the outer synchronized for the whole loop, to reduce lock contention.
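One way to read the design advice above is to drop the outer lock and let each step lock on its own. The sketch below is an assumption about what such a design could look like, not something taken from the answer: the class `FineGrainedLocking`, its methods, and the counter workload are all hypothetical. Note that this changes semantics: without the outer lock, other threads can interleave between iterations, so it only fits when the loop as a whole does not need to be atomic.

```java
class FineGrainedLocking {
    private final Object obj = new Object();
    private long total;

    // Variant from the question: the lock is held for the whole loop,
    // so other threads wait until all iterations are done.
    void coarse(int iterations) {
        synchronized (obj) {
            for (int i = 0; i < iterations; i++) {
                step();
            }
        }
    }

    // Alternative: no outer lock; each step acquires and releases the lock,
    // giving other threads a chance to run between iterations.
    void fine(int iterations) {
        for (int i = 0; i < iterations; i++) {
            step();
        }
    }

    void step() {
        synchronized (obj) {
            total++;
        }
    }

    long total() {
        synchronized (obj) {
            return total;
        }
    }

    // Runs fine() on several threads against one shared instance and returns the total.
    static long runConcurrently(int threads, int iterations) {
        FineGrainedLocking shared = new FineGrainedLocking();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> shared.fine(iterations));
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return shared.total();
    }

    public static void main(String[] args) {
        System.out.println(runConcurrently(2, 1_000)); // prints 2000
    }
}
```

Every increment still happens under the lock, so the total stays correct; what changes is how long any one thread keeps the others waiting.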
To get a sense of how a VM implements this, you may look, for example, at the Synchronization section of this paper. It is a bit outdated but straightforward to understand.
answered Nov 15 '18 at 17:31 by Fedor Losev
edited Nov 15 '18 at 18:00
synchronized locks are re-entrant, so acquiring a lock that the thread already holds costs a) the time to check that it already holds the lock, and b) the time to increment a counter and later decrement it.
The first step takes the longest and adds about 10-50 ns each time.
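The re-entrant behaviour described above can be observed directly. The following is a minimal sketch, with the class name `ReentrancyDemo` and its methods invented for illustration. Intrinsic locks do not expose their counter, so the second method uses `java.util.concurrent.locks.ReentrantLock` as an analogy: its `getHoldCount()` makes the increment and decrement visible. It is not a claim about how any particular VM implements `synchronized` internally.

```java
import java.util.concurrent.locks.ReentrantLock;

class ReentrancyDemo {
    private static final Object obj = new Object();
    private static final ReentrantLock lock = new ReentrantLock();

    // Intrinsic lock: the nested block is entered without blocking
    // because the current thread already owns the monitor of obj.
    static boolean nestedSynchronizedHoldsLock() {
        synchronized (obj) {
            synchronized (obj) {
                return Thread.holdsLock(obj);
            }
        }
    }

    // ReentrantLock analogy: each lock() by the owning thread increments
    // the hold count, each unlock() decrements it.
    static int nestedHoldCount() {
        lock.lock();
        try {
            lock.lock();
            try {
                return lock.getHoldCount(); // read while the lock is held twice
            } finally {
                lock.unlock();
            }
        } finally {
            lock.unlock();
        }
    }

    public static void main(String[] args) {
        System.out.println(nestedSynchronizedHoldsLock()); // prints true
        System.out.println(nestedHoldCount());             // prints 2
        System.out.println(lock.getHoldCount());           // prints 0
    }
}
```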
answered Nov 15 '18 at 16:34 by Peter Lawrey
edited Nov 15 '18 at 16:52