
MPI celery queue on dalco cluster #1672

Closed
mguidon opened this issue Aug 5, 2020 · 5 comments · Fixed by #1673
Labels
a:sidecar issue related with the sidecar worker service

Comments


mguidon commented Aug 5, 2020

USER STORY:

We have 5 nodes on the dalco cluster. One of them has 48 CPUs and roughly 700 GB of RAM. I want to run iSolve (a comp. service) on that machine using MPI parallelism. I will tag the corresponding image with an "MPI" label. I want exactly one sidecar running there so that different jobs do not fight over the resources.

Since it is not quite clear yet how many users will use that feature, and since we do not want to waste too many computational resources, I would also like the possibility to optionally run "normal" sidecars on that node.

DEFINITION OF DONE:

I create 2 projects of the form filepicker->iSolve(MPI). When I run the two projects, the solvers run on the MPI node one after the other, using all the cores.

mguidon added the a:sidecar label on Aug 5, 2020

GitHK commented Aug 5, 2020

Proposed solution

Because the machine(s) used to run MPI tasks have a specific number of total available CPUs, and because this number differs from the other machine types in the cluster, I propose the following solution:

  • When the sidecar container starts, it will check in sequence whether IS_MPI_NODE and IS_GPU_NODE apply. It will become the first available type of sidecar in the following order: [MPI, GPU, CPU]. If all checks fail, the sidecar ends up as a CPU sidecar.
  • MPI check (see the sketch below this list):
    • an environment variable will be passed to the sidecar service containing the number of CPUs needed to become an MPI node. The container will run cat /proc/cpuinfo | grep processor | wc -l to determine the CPU count.
    • If a node can be MPI, it will acquire a Redlock whose name contains MPI and the number of CPUs for this specific MPI node. This guarantees that no other sidecar can become an MPI node.
    • If the node also has a GPU, all other sidecars (trying to start) will become GPU sidecars.
    • If the node has no GPU, all other sidecars (trying to start) will become CPU sidecars.

The above guarantees that in any given cluster there will be only one node dedicated to running a single MPI sidecar. The rest of the sidecars scheduled on that node will become either GPU sidecars or CPU sidecars, based on the node's available configuration/resources.
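
A minimal sketch of how this boot-time decision could look (assuming a Python sidecar; the env var names TARGET_MPI_CPU_COUNT / IS_GPU_NODE / REDIS_HOST, the BootMode enum and the use of a plain single-instance redis lock in place of a full Redlock are assumptions for illustration, not the actual implementation):

```python
# Illustrative sketch only: env var names, the BootMode enum and the use of a
# plain single-instance redis lock (instead of a proper Redlock) are assumptions.
import os
from enum import Enum

import redis


class BootMode(Enum):
    MPI = "MPI"
    GPU = "GPU"
    CPU = "CPU"


def _cpu_count() -> int:
    # equivalent of: cat /proc/cpuinfo | grep processor | wc -l
    with open("/proc/cpuinfo") as f:
        return sum(1 for line in f if line.startswith("processor"))


def _try_become_mpi(required_cpus: int) -> bool:
    if _cpu_count() != required_cpus:
        return False
    # The lock name encodes MPI plus the CPU count, so at most one sidecar in
    # the whole cluster can hold it; it is kept for the sidecar's lifetime.
    client = redis.Redis(host=os.environ.get("REDIS_HOST", "redis"))
    lock = client.lock(f"MPI_LOCK_{required_cpus}_cpus", timeout=None)
    return lock.acquire(blocking=False)


def detect_boot_mode() -> BootMode:
    required_cpus = int(os.environ.get("TARGET_MPI_CPU_COUNT", "0"))
    if required_cpus and _try_become_mpi(required_cpus):
        return BootMode.MPI
    if os.environ.get("IS_GPU_NODE") == "1":
        return BootMode.GPU
    return BootMode.CPU
```

Whichever sidecar wins the lock keeps it for as long as it runs, which is what guarantees a single MPI sidecar per cluster; every other sidecar on the node falls through to the GPU or CPU branch.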

Observations

  • The placement of the MPI label will be assumed to be at the same level as the VRAM label for the GPU services.
  • In development mode the sidecar service will start with 3 copies (up from 1). Only one of these will become an MPI node.
  • The implementation and dispatching will be similar to the GPU solution.
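
On the dispatching side, a rough sketch of how a job could be routed to a dedicated MPI Celery queue based on an image label, analogous to the existing GPU handling (the label keys, queue names, broker URL and task name below are placeholders, not the real ones):

```python
# Illustrative sketch only: label keys, queue names, broker URL and task name
# are placeholders; the real dispatching mirrors the existing GPU queue handling.
from celery import Celery

celery_app = Celery("comp_backend", broker="amqp://rabbit")


def queue_for_service(image_labels: dict) -> str:
    """Pick the Celery queue from the labels attached to the service image."""
    if "MPI" in image_labels:   # hypothetical placement, same level as the VRAM label
        return "mpi"            # consumed only by the single MPI sidecar
    if "VRAM" in image_labels:
        return "gpu"
    return "cpu"


def dispatch(project_id: str, node_id: str, image_labels: dict) -> None:
    # send_task with an explicit queue is standard Celery routing
    celery_app.send_task(
        "comp.task.run",  # hypothetical task name
        kwargs={"project_id": project_id, "node_id": node_id},
        queue=queue_for_service(image_labels),
    )
```

With only one sidecar ever consuming from the mpi queue, two projects submitted at the same time are processed one after the other, which matches the definition of done above.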


mguidon commented Aug 5, 2020

re (trying to start): I guess you will take care that this is configurable? (i.e. optionally having other sidecars running on the MPI node)


GitHK commented Aug 5, 2020

> re (trying to start): I guess you will take care that this is configurable? (i.e. optionally having other sidecars running on the MPI node)

What I mean is that if more than one sidecar is configured on that node, the rest will become either GPU sidecars or CPU sidecars, based on how the node was configured.


mguidon commented Aug 5, 2020

> re (trying to start): I guess you will take care that this is configurable? (i.e. optionally having other sidecars running on the MPI node)

> What I mean is that if more than one sidecar is configured on that node, the rest will become either GPU sidecars or CPU sidecars, based on how the node was configured.

That's what I had in mind. Thanks.


GitHK commented Aug 5, 2020

I will proceed with implementing this and link the issue to a PR. The implementation will also bring a sleeper-mpi service.
