Cluster component repr hierarchy #5216

Open
jacobtomlinson opened this issue Aug 16, 2021 · 2 comments

Comments

@jacobtomlinson
Member

When launching a cluster with a cluster manager like SSHCluster, there are four different places where worker info exists.

  • On the worker system that I am SSHing into there is an instance of distributed.worker.Worker.
  • On my local system there is an instance of distributed.deploy.ssh.Worker which manages the SSH subprocess. This is a subclass of ProcessInterface which is being discussed in this PR.
  • On the scheduler system there is an instance of distributed.scheduler.Scheduler which manages a dictionary of scheduler_info that is kept in sync by the worker heartbeat.
  • On my local system both the SSHCluster and Client objects have a copy of the scheduler_info dictionary from the Scheduler via the RPC.

Today the HTML repr which shows scheduler and worker info is on that scheduler_info object. This is because the scheduler_info is the simplest way of getting and showing this information to the user.

There are some things to think about here:

  • Do the Worker and ProcessInterface objects have all the same information about the workers that scheduler_info does? (I think not.)
  • Should distributed.worker.Worker and distributed.deploy.ssh.Worker have reprs that look like the worker dropdowns in scheduler_info?
  • Should Cluster and Client reuse the reprs from the Worker objects instead of creating their own representations of them?
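
To make the last question concrete, here is a minimal sketch of what "reusing the worker repr" could look like. It does not use any distributed classes: `WorkerSummary` and `ClusterSummary` are hypothetical stand-ins, and the fields shown are only an assumed subset of what scheduler_info holds. The only real convention relied on is the `_repr_html_` method that Jupyter calls for rich display.

```python
from dataclasses import dataclass, field
from html import escape


@dataclass
class WorkerSummary:
    """Hypothetical stand-in for one worker entry (not a distributed class)."""

    address: str
    nthreads: int
    memory_limit: int  # bytes

    def _repr_html_(self) -> str:
        # One HTML fragment per worker; a container can reuse it as-is.
        return (
            "<details>"
            f"<summary>Worker: {escape(self.address)}</summary>"
            f"<p>Threads: {self.nthreads}, memory limit: {self.memory_limit} B</p>"
            "</details>"
        )


@dataclass
class ClusterSummary:
    """Hypothetical container playing the role of a Cluster or Client repr."""

    scheduler_address: str
    workers: list = field(default_factory=list)

    def _repr_html_(self) -> str:
        # Delegate to each worker's own repr instead of re-rendering its fields.
        worker_html = "".join(w._repr_html_() for w in self.workers)
        return (
            f"<div><h3>Scheduler: {escape(self.scheduler_address)}</h3>"
            f"{worker_html}</div>"
        )


worker = WorkerSummary("tcp://10.0.0.2:40000", nthreads=4, memory_limit=8_000_000_000)
cluster = ClusterSummary("tcp://10.0.0.1:8786", workers=[worker])
print(cluster._repr_html_())
```

With this shape, the worker dropdown has a single definition, so the standalone worker repr and the one embedded in the cluster repr cannot drift apart.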

Originally posted by @jacobtomlinson in #5181 (comment)

@GenevieveBuckley
Contributor

Thanks @jacobtomlinson for starting this discussion.

> Today the HTML repr which shows scheduler and worker info is on that scheduler_info object. This is because the scheduler_info is the simplest way of getting and showing this information to the user.

I imagine there are a lot of people for whom scheduler_info is the only thing they look at regularly for information about the scheduler/workers. I think it's good to have this kind of "one stop shop", but it's worthwhile remembering that lots of people won't go digging around in other places.

> Should Cluster and Client reuse the reprs from the Worker objects instead of creating their own representations of them?

I'd say yes, this is probably ideal.

@jacobtomlinson
Member Author

Thanks @GenevieveBuckley.

Given that you think we should do both, this seems like a good opportunity to make use of jinja2 includes once dask/dask#8019 lands. That way we can create a worker template that is used by the Worker repr and included in the scheduler_info repr.
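
As a rough illustration of the include idea, the sketch below renders one worker template on its own and also pulls it into a scheduler-level template with `{% include %}`. The template names, the in-memory `DictLoader` setup, and the fields are assumptions made for this example; they are not taken from dask/dask#8019 and say nothing about how the templates will actually be organised there.

```python
from jinja2 import DictLoader, Environment

# Hypothetical templates, kept in memory so the sketch is self-contained.
templates = {
    "worker.html.j2": (
        "<details><summary>Worker: {{ worker.address }}</summary>"
        "Threads: {{ worker.nthreads }}</details>"
    ),
    "scheduler_info.html.j2": (
        "<div><h3>Scheduler: {{ address }}</h3>"
        # The included template sees the loop variable `worker`.
        "{% for worker in workers %}{% include 'worker.html.j2' %}{% endfor %}"
        "</div>"
    ),
}

env = Environment(loader=DictLoader(templates), autoescape=True)


def render_worker(worker: dict) -> str:
    """Render a single worker, as a Worker repr might."""
    return env.get_template("worker.html.j2").render(worker=worker)


def render_scheduler_info(address: str, workers: list) -> str:
    """Render the scheduler view, reusing the worker template via include."""
    template = env.get_template("scheduler_info.html.j2")
    return template.render(address=address, workers=workers)


workers = [
    {"address": "tcp://10.0.0.2:40000", "nthreads": 4},
    {"address": "tcp://10.0.0.3:40000", "nthreads": 8},
]
print(render_worker(workers[0]))
print(render_scheduler_info("tcp://10.0.0.1:8786", workers))
```

Because both entry points go through the same `worker.html.j2`, a change to the worker markup shows up in the Worker repr and the scheduler_info repr at once.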
