Background
Infrastructure management and compute orchestration are critical to production Ray users, who want to scale their applications in a seemingly infinite compute environment with zero code changes. Since Kubernetes has become the de facto container orchestrator for enterprises, users leverage it as a substrate for executing distributed Ray programs.
The community provides a Python Ray operator implementation. However, due to some special needs, Ant Group, Microsoft, and Bytedance put effort into building a Golang-based operator and decoupling the autoscaler from the operator itself (see design for details). All of us are using this solution in our production environments.
For historical reasons, this project had three folders, each named after a company. After our collaboration in #17, there is only one ray-operator under ray-contrib, which is a big step for further evolution.
Proposal
To keep reducing maintenance effort and simplifying the user experience, more tools around Kubernetes and the Ray operator are being developed, and we plan to contribute more of them to this repo. However, we feel the current repo name does not fit this purpose: technically speaking, any Ray project could be added here. We think it would be better to move the Kubernetes-related work into a separate repo that concentrates on the Ray user experience on Kubernetes.
Besides ray-operator, some tools we plan to work on, or have already developed downstream, are:
kubectl plugin/CLI to operate CRD objects
Kubernetes event dumper for Ray clusters, pods, and services
Maybe we can call it KubeRay: a toolkit consisting of different Kubernetes components, from which users can choose a combination based on their Kubernetes environments. I think creating a new repo like ray-project/kuberay is better, and ray-contrib can be used for incubating ideas. I think KubeRay will help attract more people to participate in the community, and it will also help grow Ray's influence in the CNCF/Kubernetes community. Lots of users are moving ML/DL workloads to Kubernetes, and they should try Ray using this solution.
A toolkit consisting of different Kubernetes components, not only ray-operator, can help users deploy and manage Ray clusters on Kubernetes. I think KubeRay is a good idea.
(credits @chenk008 @caitengwei from AntGroup, @Jeffwan from Bytedance and @akanso from Microsoft)
WDYT? Any feedback is welcome!
/cc @chenk008 @caitengwei @akanso @chaomengyuan
/cc @zhe-thoughts @ericl @richardliaw @DmitriGekhtman @yiranwang52