Operator deletes all running jobs whenever the relevant ScaledJob object gets updated, and recreates them based on the new ScaledJob spec #2098
Comments
@zroubalik suggested that this PR might be relevant to this change (2.2 -> 2.4).
Worth mentioning that on v2.2 the log looks the same: "Deleting jobs owned by the previous version of the scaledJob".
Hi @etamarw, it looks like it happens at this point: https://github.com/kedacore/keda/blame/61740daffac2194dac91d76bb0526b2660e0acdd/controllers/keda/scaledjob_controller.go#L137 It might be possible to add configuration for this part, however I need to understand the context first. @zroubalik, do you know the context for why it deletes the running jobs when we update the ScaledJob?
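For readers following along, here is a minimal sketch (not KEDA's actual code) of what the code path referenced above roughly does: list every Job carrying the ScaledJob's label and delete it, whether or not it is still running. The label key used here is an assumption for illustration only.

```go
package sketch

import (
	"context"
	"fmt"

	batchv1 "k8s.io/api/batch/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"sigs.k8s.io/controller-runtime/pkg/client"
)

// deletePreviousJobs illustrates the behaviour described above: every Job
// labelled with the ScaledJob's name is deleted, regardless of whether it is
// still running. The label key is assumed for this sketch.
func deletePreviousJobs(ctx context.Context, c client.Client, namespace, scaledJobName string) error {
	jobs := &batchv1.JobList{}
	if err := c.List(ctx, jobs,
		client.InNamespace(namespace),
		client.MatchingLabels{"scaledjob.keda.sh/name": scaledJobName}, // assumed label key
	); err != nil {
		return fmt.Errorf("listing jobs: %w", err)
	}

	for i := range jobs.Items {
		// Deleting unconditionally is what terminates in-flight work on every
		// ScaledJob update; a configuration switch would need to guard this call.
		if err := c.Delete(ctx, &jobs.Items[i],
			client.PropagationPolicy(metav1.DeletePropagationBackground),
		); err != nil {
			return fmt.Errorf("deleting job %s: %w", jobs.Items[i].Name, err)
		}
	}
	return nil
}
```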
I did some debugging and found a few issues with this function:
On version 2.4, the number seems to be weird (50363, 21953, 39908 - it doesn't correlate with the number of running jobs in any way), but when there are no jobs, running jobs.Size returns 8. It means that on version 2.2 it is not deleting the jobs because it is not fetching them correctly. Anyway, I think the main reason to use a ScaledJob and not a ScaledObject is to avoid terminating running pods in the middle of their work. Therefore, I'm not really sure why we need this function at all - "Deleting jobs owned by the previous version of the scaledJob" - but that is just my opinion :) Thanks a lot for helping with this one, guys!
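To illustrate the counting point above, this is one stricter way "running jobs" could be counted: only Jobs that still report active pods and have no Complete/Failed condition. This is a sketch of the idea, not KEDA's implementation.

```go
package sketch

import (
	batchv1 "k8s.io/api/batch/v1"
	corev1 "k8s.io/api/core/v1"
)

// countRunningJobs counts only Jobs that are genuinely in flight: active pods
// present and no terminal (Complete/Failed) condition recorded yet.
func countRunningJobs(jobs *batchv1.JobList) int {
	running := 0
	for _, job := range jobs.Items {
		finished := false
		for _, cond := range job.Status.Conditions {
			if (cond.Type == batchv1.JobComplete || cond.Type == batchv1.JobFailed) &&
				cond.Status == corev1.ConditionTrue {
				finished = true
				break
			}
		}
		if !finished && job.Status.Active > 0 {
			running++
		}
	}
	return running
}
```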
Thanks for the information @etamarw! @TsuyoshiUshio I don't recall the context of that; it was implemented way too long ago and not by me :) I agree with the suggested approach to make this configurable. @etamarw, are you willing to give it a try, since you have already started with the analysis?
I'll give it a try.
Hi @zroubalik, I created a relevant PR with a suggestion for a fix. It's pretty raw, so I'll need some guidance in order to proceed :)
Fixed in #2164
Report
On version 2.2, whenever a ScaledJob object got updated, the operator used to create only new jobs with the latest changes, leaving running jobs that were created by the older version of the ScaledJob object to finish gracefully.
On version 2.4, the operator deletes all running jobs whenever the ScaledJob object gets updated.
This behaviour is very problematic since it causes termination of running jobs on each update.
Expected Behavior
The rollout mechanism should work just like it did on version 2.2.
One of the main reasons to use ScaledJobs and not Deployments is to avoid terminating long-running operations in the middle of a run.
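One possible shape of the configurable behaviour discussed in the thread is sketched below. The type and field names are hypothetical and invented for illustration; see #2164 for the change that was actually merged.

```go
package sketch

// ScaledJobRolloutConfig is a hypothetical, simplified stand-in for whatever
// knob the real ScaledJob API could expose; it exists only to illustrate the
// "make this configurable" idea discussed in the thread.
type ScaledJobRolloutConfig struct {
	// PreserveRunningJobs would tell the reconciler to skip deleting Jobs
	// created by the previous version of the ScaledJob.
	PreserveRunningJobs bool
}

// shouldDeletePreviousJobs keeps the current behaviour as the default and only
// skips the deletion when the user explicitly opts in to a gradual rollout.
func shouldDeletePreviousJobs(cfg ScaledJobRolloutConfig) bool {
	return !cfg.PreserveRunningJobs
}
```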
Actual Behavior
On version 2.4, the operator deletes all running jobs whenever the ScaledJob object gets updated.
Steps to Reproduce the Problem
Reproduced on a vanilla k8s cluster and on kind.
Logs from KEDA operator
KEDA Version
2.4.0
Kubernetes Version
1.19
Platform
Any
Scaler Details
prometheus
Anything else?
This behaviour seems pretty similar to this issue from KEDA v1:
#1021