Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ray] Support scheduling ray tasks in Ray oscar deploy backend #3165

Merged
merged 16 commits into from
Sep 7, 2022

Conversation

chaokunyang
Copy link
Contributor

What do these changes do?

This PR add support for scheduling ray tasks in Ray oscar deploy backend

Related issue number

Closes #3164

Check code requirements

  • tests added / passed (if needed)
  • Ensure all linting tests pass, see here for how to run them

@chaokunyang chaokunyang requested a review from a team as a code owner June 23, 2022 10:35
@fyrestone
Copy link
Contributor

Why not create an actor of Mars LocalCluster of mars/deploy/oscar/local.py? This is much simpler and most of the logic is shared with Mars which is well tested in CI.

@chaokunyang chaokunyang force-pushed the support_ray_task_deploy branch from bcbac7d to 706bdf4 Compare June 27, 2022 07:00
@chaokunyang
Copy link
Contributor Author

Why not create an actor of Mars LocalCluster of mars/deploy/oscar/local.py? This is much simpler and most of the logic is shared with Mars which is well tested in CI.

Creating supervisor in Ray oscar can reuse all existing integration with ray. If we create an actor of Mars LocalCluster of mars/deploy/oscar/local.py, the serialization and communication between mars client and supervisor will be an issue:

  • We need to implement a proxy for LocalCluster based on ray call
  • ObjectRef can't be passed to supervisor from mars client directly

@chaokunyang chaokunyang force-pushed the support_ray_task_deploy branch 4 times, most recently from e75fb36 to 775304e Compare August 15, 2022 07:22
@chaokunyang chaokunyang force-pushed the support_ray_task_deploy branch from 775304e to 11bf076 Compare August 24, 2022 06:12
@chaokunyang chaokunyang force-pushed the support_ray_task_deploy branch from 2e6c8d0 to dae169d Compare September 6, 2022 03:32
@chaokunyang chaokunyang force-pushed the support_ray_task_deploy branch 2 times, most recently from ebf00f4 to 944a094 Compare September 6, 2022 03:45
@chaokunyang chaokunyang force-pushed the support_ray_task_deploy branch from 944a094 to a7ea6b6 Compare September 6, 2022 03:46
fyrestone
fyrestone previously approved these changes Sep 6, 2022
Copy link
Contributor

@fyrestone fyrestone left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

self._worker_mem,
)
else:
execution_config.merge_from(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe it's better to specify the backend explicitly like ray or something others instead of else statement.
And we can raise an exception if user passes an unsupported backend.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch, we should validate the passed backend type, just fixed it.

Copy link
Contributor

@zhongchun zhongchun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

Copy link
Contributor

@fyrestone fyrestone left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@chaokunyang chaokunyang merged commit 8ee68e2 into mars-project:master Sep 7, 2022
qianduoduo0904 pushed a commit to qianduoduo0904/mars that referenced this pull request Sep 22, 2022
…project#3165)

* support ray task mode on oscar

* fix mars on ray serialization

* fix ray dag serialization for local mode

* refine clean serializers

* fix register ray_serializers

* fix tests

* revert load_config

* support mars supervisor for ray task mode

* fix use_ray_serialization

* fix register and clean ray_serializers

* fix test_sync_execute create session duplciately

* fix tests

* lint

* add test_ray_dag_oscar to ci

* validate backend parameter

(cherry picked from commit 8ee68e2)
UranusSeven pushed a commit to xorbitsai/mars that referenced this pull request Sep 22, 2022
…project#3165)

* support ray task mode on oscar

* fix mars on ray serialization

* fix ray dag serialization for local mode

* refine clean serializers

* fix register ray_serializers

* fix tests

* revert load_config

* support mars supervisor for ray task mode

* fix use_ray_serialization

* fix register and clean ray_serializers

* fix test_sync_execute create session duplciately

* fix tests

* lint

* add test_ray_dag_oscar to ci

* validate backend parameter

(cherry picked from commit 8ee68e2)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Ray] Support supervisor on oscar for ray task mode
3 participants