-
Notifications
You must be signed in to change notification settings - Fork 432
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add a warning to discourage users from launching a KubeRay-incompatible autoscaler. #1102
Conversation
+1 to leaving the monitor process in existence and disabling features like autoscaling if needed. @kevin85421 I'm a little fuzzy on the overall integration with Ray. Could you summarize what the
|
@wuisawesome Thank you for the comment!
|
LGTM! I'm a bit curious, though. Under what circumstances would a user need to configure If a user do provide |
@Yicheng-Lu-llll thank you for your comments!
Honestly, I do not know. I would prefer not to hardcode autoscaling-config here, considering the lesson we learned from the init container injection. Based on the comments in ReadOnlyNodeProvider, the READONLY autoscaler also serves the purpose of collecting cluster state. Therefore, users may configure
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sgtm. I almost wonder if we should take the more opinionated stance and require the --no-monitor
flag at all times, but will defer to you on this.
Discuss with @wuisawesome offline:
|
…le autoscaler. (ray-project#1102) Add a warning to discourage users from launching a KubeRay-incompatible autoscaler.
More context about |
Why are these changes needed?
The
no-monitor
option in rayStartParams is used to determine whether themonitor.py
process should be launched or not. The monitor process is not only for autoscaling but observability (e.g. Prometheus).KubeRay autoscaling is enabled (i.e.
EnableInTreeAutoscaling: true
)Ray autoscaler supports various node providers such as AWS, GCP, Azure, and Kubernetes. However, the default autoscaler is not compatible with Kubernetes. Therefore, we disable the monitor process by default via setting
no-monitor
totrue
and inject a sidecar container for KubeRay autoscaler.kuberay/ray-operator/controllers/ray/common/pod.go
Lines 114 to 115 in 08792ca
KubeRay autoscaling is disabled
Since the default value of
no-monitor
in Ray is false when not explicitly specified, the monitor process, along with the Kubernetes-incompatible autoscaler, will be launched by default. Therefore, the purpose of this pull request is to initiate the monitor process without including the autoscaler.@wuisawesome said that we can disable the autoscaling feature by making the autoscaler become READONLY.
https://sourcegraph.com/github.com/ray-project/ray@ac2230fd28f642e4523422a1e1eb15ab92f63943/-/blob/python/ray/autoscaler/_private/monitor.py?L220-L230
If
autoscaling-config
is not set in the head Pod'srayStartParams
,BASE_READONLY_CONFIG
will be used, which creates a READONLY autoscaler by default. Hence, in this PR, our focus is to prevent users from launching a KubeRay-incompatible autoscaler when they specify theautoscaling-config
themselves.Related issue number
Checks
I've made sure the tests are passing.
Testing Strategy
StandardAutoscaler
above is the same as BASE_READONLY_CONFIG ifautoscaling-config
is not set.Set
autoscaling-config