-
-
Notifications
You must be signed in to change notification settings - Fork 727
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add host and disk aliases to respective Worker dicts #2670
Conversation
Travis errors are not related to this PR. |
distributed/worker.py
Outdated
Dictionary mapping keys to actual values stored on disk, always empty | ||
if memory_limit is not defined and neither memory_target_fraction | ||
nor memory_spill_fraction are defined, implying spilling host memory | ||
to disk is disabled. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I recommend that we still keep these on the worker.data
object, as we do in dask-cuda.
So
worker.memory -> worker.data.memory
worker.disk -> worker.data.disk
Thoughts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, my bad. I'll fix it.
distributed/worker.py
Outdated
@@ -488,6 +498,9 @@ def __init__( | |||
else: | |||
self.data = dict() | |||
|
|||
self.host = self.data.fast.d if hasattr(self.data, "fast") else self.data | |||
self.disk = self.data.slow.d if hasattr(self.data, "slow") else dict() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can only do this if we're using the zict/buffer solution (see the conditional just above)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wanted to make sure that they're both defined no matter which conditional is true. Since there are several elif
s, I would need to define host
in all of them, which isn't very nice.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I realize what's your suggestion now. Well, for both this and the previous one, we need to define self.data
as a custom object, maybe HostFile
(analogous to DeviceHostFile
). Is there a simpler way I'm not thinking of?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I see now the else
.
I think that we should only have memory
and disk
attributes if they're using zict. Otherwise I'm not too concerned. Only expert users will turn off zict. I'm still in favor of adding self.data.memory
and self.data.disk
only if we're in the branch above where we're creating the Buffer. Otherwise I wouldn't bother.
My aversion to adding .memory
and .disk
to the top level Worker
namespace is mostly that it's already crowded and that these are useful, but probably not that useful. It's also less clear what they mean in that context. It's also not something that we can guarantee if the user provides a different mapping type.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that we should only have
memory
anddisk
attributes if they're using zict. Otherwise I'm not too concerned. Only expert users will turn off zict. I'm still in favor of addingself.data.memory
andself.data.disk
only if we're in the branch above where we're creating the Buffer. Otherwise I wouldn't bother.
That's fine by me. But my previous comment is still valid, since self.data
is a zict.Buffer
, we can't simply add attribute host
and disk
, I think we would need to add a new object, something like HostFile
. Or is there a simpler way of doing that you're thinking of?
Actually, due to the permissivity of Python, we can
|
Ah right, I was on the wrong conda environment at first, that's why it didn't seem to work when I first attempted. Done in the latest commit. I've kept the |
I would prefer memory rather than host. The "host" nomenclature is only known to GPU users. This will confuse the vast majority of Dask users today. I don't think that we should optimize for CUDA users yet. |
I understand your point, and personally, I'd agree with that if dask-cuda didn't exist yet. However, I do think that we should try to be future-proof to some extent, and |
I disagree. The term host is not known to non-GPU users. |
I stand by my point, documentation should always be there to clarify. But then what about having |
The term
Stepping into my core Dask maintainer role, I'm not interested in tailoring the Dask experience to Dask-cuda, except in making things more general. If the Dask-Cuda project wants to monkey-patch things, that's fine, but we're not going to affect the majority of users with CUDA terminology. |
No further comments from me. |
Sorry to force things
…On Wed, May 8, 2019 at 1:23 PM Peter Andreas Entschev < ***@***.***> wrote:
No further comments from me.
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#2670 (comment)>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/AACKZTFPTY5WC7OBF3QDE6TPUMLA3ANCNFSM4HLP6IDA>
.
|
Thanks @pentschev ! This is in. |
* upstream/master: Add WeakSet _instances attributes to all classes (dask#2673) Cap worker's memory limit by the hard limit of the maximum resident memory (dask#2665) Switch from (ip, port) to address in tests (dask#2684) Catch RuntimeError to avoid serialization fail when using pytorch (dask#2619) Add CONTRIBUTING.md (dask#2680) Consolidate logic around services (dask#2679) Fix uri_from_host_port import in dask-mpi (dask#2683) Move dashboard_address logic into Scheduler/Worker (dask#2678) Fix pytest.config deprecation warning (dask#2677) Use config accessor method for "scheduler-address" (dask#2676) Add memory and disk aliases to Worker.data (dask#2670) Move interface/host/port handling from CLI to classes (dask#2667) Remove AioClient (dask#2668) Add release procedure doc (dask#2672) bump version to 1.28.0
@mrocklin as per our discussion in rapidsai/dask-cuda#35