When launching a cluster with a cluster manager like `SSHCluster` there are four different places where worker info exists.

- On the worker system that I am SSHing into there is an instance of `distributed.worker.Worker`.
- On my local system there is an instance of `distributed.deploy.ssh.Worker` which manages the SSH subprocess. This is a subclass of `ProcessInterface` which is being discussed in this PR.
- On the scheduler system there is an instance of `distributed.scheduler.Scheduler` which manages a dictionary of `scheduler_info` that is kept in sync by the worker heartbeat.
- On my local system both the `SSHCluster` and `Client` objects have a copy of the `scheduler_info` dictionary from the `Scheduler` via the RPC.

Today the HTML repr which shows scheduler and worker info is on that `scheduler_info` object. This is because `scheduler_info` is the simplest way of getting and showing this information to the user.
There are some things to think about here:

- Do the `Worker` and `ProcessInterface` objects have all the same information about the workers that `scheduler_info` does? (I think no)
- Should `distributed.worker.Worker` and `distributed.deploy.ssh.Worker` have reprs that look like the worker dropdowns in `scheduler_info`?
- Should `Cluster` and `Client` reuse the reprs from the `Worker` objects instead of creating their own representation for them?
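To make the last two questions concrete, here is a minimal sketch of what reusing a worker repr could look like. Everything in it is an assumption for illustration: `SCHEDULER_INFO` is hand-written data only loosely shaped like the real `scheduler_info` dictionary, and `render_worker`, `WorkerHandle` and `render_scheduler_info` are hypothetical names, not part of `distributed`.

```python
from html import escape

# Hypothetical data, loosely shaped like scheduler_info["workers"].
# The real dictionary carries many more fields.
SCHEDULER_INFO = {
    "workers": {
        "tcp://10.0.0.2:40001": {"name": "worker-0", "nthreads": 4},
        "tcp://10.0.0.3:40001": {"name": "worker-1", "nthreads": 8},
    }
}


def render_worker(address, info):
    """Render one worker as a collapsible HTML dropdown."""
    rows = "".join(
        f"<tr><th>{escape(str(key))}</th><td>{escape(str(value))}</td></tr>"
        for key, value in info.items()
    )
    return (
        f"<details><summary>Worker: {escape(address)}</summary>"
        f"<table>{rows}</table></details>"
    )


class WorkerHandle:
    """Hypothetical stand-in for a local worker object (not a real distributed class)."""

    def __init__(self, address, info):
        self.address = address
        self.info = info

    def _repr_html_(self):
        # The worker object owns its repr.
        return render_worker(self.address, self.info)


def render_scheduler_info(scheduler_info):
    """Build the combined repr by reusing the per-worker renderer."""
    return "".join(
        render_worker(address, info)
        for address, info in scheduler_info["workers"].items()
    )
```

With this layout the dropdown markup lives in one place, so a change to the worker repr would show up in the combined view as well. It only works if the local worker object has access to the same fields as `scheduler_info`, which is exactly what the first question asks.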
> Today the HTML repr which shows scheduler and worker info is on that `scheduler_info` object. This is because `scheduler_info` is the simplest way of getting and showing this information to the user.

I imagine there are a lot of people for whom `scheduler_info` is the only thing they look at regularly for information about the scheduler/workers. I think it's good to have this kind of "one stop shop", but it's worth remembering that lots of people won't go digging around in other places.

> Should `Cluster` and `Client` reuse the reprs from the `Worker` objects instead of creating their own representation for them?

Given you think we should do both, this seems like a good opportunity to make use of jinja2 includes once dask/dask#8019 lands. That way we can create a worker template which is used by the `Worker` repr and included in the `scheduler_info` repr.
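As a rough sketch of that idea, the snippet below uses jinja2's `{% include %}` so that one worker template backs both the standalone worker repr and the combined view. The template names, their contents and the two helper functions are assumptions made up for this example; they are not the templates from dask/dask#8019, and the real reprs would carry far more fields.

```python
from jinja2 import DictLoader, Environment

# Hypothetical in-memory templates; a real project would likely load files.
TEMPLATES = {
    "worker.html.j2": (
        "<details><summary>Worker: {{ worker.address }}</summary>"
        "<p>Threads: {{ worker.nthreads }}</p></details>"
    ),
    "scheduler_info.html.j2": (
        "<h3>Scheduler: {{ address }}</h3>"
        "{% for worker in workers %}{% include 'worker.html.j2' %}{% endfor %}"
    ),
}

env = Environment(loader=DictLoader(TEMPLATES), autoescape=True)


def worker_repr(worker):
    """Render a single worker using the shared worker template."""
    return env.get_template("worker.html.j2").render(worker=worker)


def scheduler_info_repr(address, workers):
    """Render the combined view; each worker comes from the included template."""
    template = env.get_template("scheduler_info.html.j2")
    return template.render(address=address, workers=workers)
```

Because the included template sees the loop variable `worker` from the enclosing context, the combined view contains exactly the markup that `worker_repr` produces for each worker.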
Originally posted by @jacobtomlinson in #5181 (comment)