Understanding Dask's Task Stream
I'm running dask locally using the distributed scheduler on my machine with 8 cores. On initialization I see:
Which looks correct, but I'm confused by the task stream in the diagnostics (shown below):
I was expecting 8 rows corresponding to the 8 workers/cores, is that incorrect?
Thanks
AJ
I've added the code I'm running:
import dask.dataframe as dd
from dask.distributed import Client, progress
client = Client()
progress(client)
# load datasets
trd = (dd.read_csv('trade_201811*.csv', compression='gzip',
                   blocksize=None, dtype={'Notional': 'float64'})
       .assign(timestamp=lambda x: dd.to_datetime(x.timestamp.str.replace('D', 'T')))
       .set_index('timestamp', sorted=True))
python-3.x dask
edited Nov 15 '18 at 16:15
Andy Johnson
asked Nov 14 '18 at 13:33
1 Answer
Each line corresponds to a single thread. Some more sophisticated Dask operations will start up additional threads. This happens particularly when tasks launch other tasks, which is especially common in machine learning workloads.
My guess is that you're using one of the following approaches:
- dask.distributed.get_client or dask.distributed.worker_client
- Scikit-Learn's Joblib
- Dask-ML
If so, the behavior that you're seeing is normal. The task stream plot will look a little odd, yes, but hopefully it is still interpretable.
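To illustrate the "tasks launching other tasks" case, here is a minimal sketch using dask.distributed.worker_client. The function names (inc, parent, run_demo) and the small in-process cluster settings are assumptions made for the example and are not taken from the question. While a task is inside the worker_client() block it secedes from the worker's thread pool, so the worker may start another thread to keep its slots full, and that extra thread is the kind of thing that can show up as an additional row in the task stream.

```python
from dask.distributed import Client, worker_client


def inc(x):
    # Trivial child task (hypothetical, for illustration only).
    return x + 1


def parent(n):
    # worker_client() secedes this task from the worker's thread pool while
    # the block is active, which frees a slot for the child tasks below.
    with worker_client() as client:
        futures = client.map(inc, range(n))
        return sum(client.gather(futures))


def run_demo():
    # Assumed setup: one in-process worker with two threads, dashboard off.
    with Client(processes=False, n_workers=1, threads_per_worker=2,
                dashboard_address=None) as client:
        return client.submit(parent, 4).result()


if __name__ == "__main__":
    print(run_demo())
```

If your own code never calls get_client or worker_client directly, a library it uses (such as Joblib with the Dask backend, or Dask-ML) may be doing the equivalent on your behalf.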
OK, that makes sense, although since I'm running on an 8-core machine with 8 workers, I assumed any subtasks/procs/threads would appear on one of the existing 8 rows.
– Andy Johnson
Nov 15 '18 at 16:25
answered Nov 14 '18 at 16:59
MRocklin