DRA: support pod priority and preemption #4981
Comments
/wg device-management

As discussed during KubeCon NA 2024, there is precedent for enabling new functionality in the kube-scheduler as beta, enabled by default. A feature gate needs to be provided to allow disabling the new functionality. Whether this KEP is suitable for that remains to be seen and will depend on the complexity.
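To illustrate the kind of switch being asked for, here is a simplified, self-contained Go sketch of a default-on beta feature gate that an admin can still disable at startup. The gate name `DRASchedulerPreemption` and the `FeatureGate` type are hypothetical; the real kube-scheduler uses the `k8s.io/component-base/featuregate` machinery and `--feature-gates` flag.

```go
package main

import "fmt"

// FeatureGate is a simplified stand-in for the real component-base
// feature gate: a named boolean switch with a default value that can
// be overridden at startup (e.g. via --feature-gates).
type FeatureGate struct {
	defaults  map[string]bool
	overrides map[string]bool
}

func NewFeatureGate() *FeatureGate {
	return &FeatureGate{
		// Beta features are typically enabled by default.
		defaults:  map[string]bool{"DRASchedulerPreemption": true},
		overrides: map[string]bool{},
	}
}

// Set records an explicit admin override for a gate.
func (fg *FeatureGate) Set(name string, enabled bool) {
	fg.overrides[name] = enabled
}

// Enabled returns the override if one was set, else the default.
func (fg *FeatureGate) Enabled(name string) bool {
	if v, ok := fg.overrides[name]; ok {
		return v
	}
	return fg.defaults[name]
}

func main() {
	fg := NewFeatureGate()
	fmt.Println(fg.Enabled("DRASchedulerPreemption")) // true: on by default as beta

	// A cluster admin can still opt out of the new behavior.
	fg.Set("DRASchedulerPreemption", false)
	fmt.Println(fg.Enabled("DRASchedulerPreemption")) // false
}
```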
Is the scope of this preemption based only on priority, or would it also cover cases where an allocated device is removed from a ResourceSlice, or where a device's attributes in a ResourceSlice change such that an allocated ResourceClaim's selector no longer matches?
/cc
This is only about situations where the scheduler needs to free up devices to make them available for other pods. A device that is removed from a ResourceSlice doesn't fall under that, because removing it wouldn't help make some other pod schedulable. The same applies to "selector no longer matches". Both fall more under general health monitoring: they indicate potentially abnormal scenarios where an admin might want to automatically kill pods, similar to untolerated taints. But it is not certain that workloads really need to be killed; perhaps the admin is doing maintenance work and the hardware is still present and functional. In particular, "removed from a ResourceSlice" is very similar to "ResourceSlice removed", which should not cause workloads to get killed because it can happen during driver upgrades.
Enhancement Description
The scheduler plugin for DRA structured parameters does not implement the PreFilterExtensions hooks. As a result, the scheduler cannot determine whether preemption would free up a device that is currently in use by a low-priority pod so that a high-priority pod can run.
The proposal is to support pod priority and preemption, because a mix of high- and low-priority pods is then expected to run more efficiently.
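To sketch why the PreFilterExtensions hooks matter: during preemption, the scheduler tentatively removes candidate victim pods from its per-node state and re-runs the filters to see whether the preemptor would then fit. Without a RemovePod hook, a plugin's view of allocated devices never changes during that simulation. The following self-contained Go example illustrates the idea with hypothetical names (`Pod`, `NodeState`, `gpu-0`); it is not the real kube-scheduler framework API.

```go
package main

import "fmt"

// Pod is a minimal stand-in for a pod whose ResourceClaims hold devices.
type Pod struct {
	Name     string
	Priority int
	Devices  []string
}

// NodeState tracks which devices on a node are currently allocated,
// playing the role of a plugin's cycle state for one node.
type NodeState struct {
	allocated map[string]bool
}

// RemovePod is the analogue of PreFilterExtensions.RemovePod: release
// the victim's devices in the simulated node state.
func (s *NodeState) RemovePod(p *Pod) {
	for _, d := range p.Devices {
		delete(s.allocated, d)
	}
}

// AddPod is the analogue of PreFilterExtensions.AddPod: re-reserve a
// pod's devices when the simulation adds it back.
func (s *NodeState) AddPod(p *Pod) {
	for _, d := range p.Devices {
		s.allocated[d] = true
	}
}

// Filter succeeds only if every device the pod needs is free.
func (s *NodeState) Filter(p *Pod) bool {
	for _, d := range p.Devices {
		if s.allocated[d] {
			return false
		}
	}
	return true
}

func main() {
	victim := &Pod{Name: "low-prio", Priority: 0, Devices: []string{"gpu-0"}}
	preemptor := &Pod{Name: "high-prio", Priority: 100, Devices: []string{"gpu-0"}}
	_ = victim.Priority // priority decides who may be preempted

	state := &NodeState{allocated: map[string]bool{"gpu-0": true}}
	fmt.Println(state.Filter(preemptor)) // false: gpu-0 is taken

	// Preemption simulation: remove the victim, then re-run the filter.
	state.RemovePod(victim)
	fmt.Println(state.Filter(preemptor)) // true: preempting low-prio would fit
}
```

Without the `RemovePod` step, the final filter check would still see `gpu-0` as allocated, so the scheduler could never conclude that preemption helps, which matches the gap described above.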
- (k/enhancements) update PR(s):
- (k/k) update PR(s):
- (k/website) update PR(s):