Cron scaler: Fix unexpected instance count change #3838
fix IsActive in time boundary Signed-off-by: tux <[email protected]>
Hey,
We use the following expression to scale out from 8:00 am on the last day of each month until 8:00 pm on the 1st of the next month
But the existing cron dependency does not support L (the last day of the month), see issue robfig/cron#464. So we use the following expression instead
But currently cronScaler.IsActive may trigger an issue like this:
// if execTime=2022.10.30 07:59:59.999
nextStartTime, startTimecronErr := getCronTime(location, s.metadata.start)
// got nextStartTime=2022.10.30 08:00:00.000
......
nextEndTime, endTimecronErr := getCronTime(location, s.metadata.end)
// got nextEndTime=2022.11.01 20:00:00.000
......
currentTime := time.Now().Unix()
// got currentTime=2022.10.30 08:00:00.111
switch {
case nextStartTime < nextEndTime && currentTime < nextStartTime:
return false, nil
// this case is taken and returns true, so the pods are scaled up
// but in the next cycle it will return false
case currentTime <= nextEndTime:
return true, nil
default:
return false, nil
}
This causes an unexpected scale-up event. If we get currentTime before getting nextStartTime in cronScaler.IsActive, we can avoid this mistake.
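To make the boundary case concrete, here is a minimal sketch that isolates the comparison from the snippet above. The function name `isActive` and the use of UTC are assumptions made for illustration; the timestamps are the ones from the example, compared as Unix seconds as in the quoted code.

```go
package main

import (
	"fmt"
	"time"
)

// isActive mirrors the switch quoted above. It is a hypothetical helper that
// takes the three timestamps as Unix seconds so the comparison can be run in
// isolation from the scaler.
func isActive(currentTime, nextStartTime, nextEndTime int64) bool {
	switch {
	case nextStartTime < nextEndTime && currentTime < nextStartTime:
		return false
	case currentTime <= nextEndTime:
		return true
	default:
		return false
	}
}

func main() {
	// Values from the example; UTC is an assumption of this sketch.
	execTime := time.Date(2022, 10, 30, 7, 59, 59, 999_000_000, time.UTC)
	lateNow := time.Date(2022, 10, 30, 8, 0, 0, 111_000_000, time.UTC)
	nextStart := time.Date(2022, 10, 30, 8, 0, 0, 0, time.UTC)
	nextEnd := time.Date(2022, 11, 1, 20, 0, 0, 0, time.UTC)

	// Clock read after the schedule lookup: the first case no longer matches.
	fmt.Println(isActive(lateNow.Unix(), nextStart.Unix(), nextEnd.Unix()))
	// Clock read before the schedule lookup: the first case matches.
	fmt.Println(isActive(execTime.Unix(), nextStart.Unix(), nextEnd.Unix()))
}
```

The same `nextStart` and `nextEnd` give opposite answers depending only on which side of 08:00:00 the clock was read.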
@zou2699 could you please open an issue first, with a clear explanation of the problem? Thanks!
@zou2699 do you see any use cases (that don't suffer from this edge case) where this fix could break the existing behavior?
@zroubalik This will not affect other use cases
I think the change doesn't alter the behavior beyond the 59.999 edge case. It's true that the opposite case (00.001) will change, but as I said, I think it's a really "forced" edge case: the change will happen at the next polling instead of the current one (30 secs later), and we are assuming that the evaluation happens exactly at 59.999, which is very unlikely
BTW @zou2699, are you sure that the problem is related to this? I mean, what polling interval do you have? Because if you are not checking every 1 second (and even in that case), I'd say that it's really unlikely to happen
The generation of nextStartTime depends on the current point in time, but currentTime here is obtained after nextStartTime is generated, which leads to the problem in the example above: an unexpected instance count change. With this adjustment, the nextStartTime computed on days that are not the last day (October 28, October 29, October 30) is always greater than currentTime, which avoids unexpected instance count changes and thus supports the behavior of 'L' in cron expressions
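The adjustment described here can be sketched as follows: read the clock once, then use that single reading both for the schedule lookup and for the comparison. This is an illustration, not the code of the PR. `nextStart` is a hypothetical stand-in for the cron lookup and assumes a start schedule of 08:00 on days 28 to 31 (the actual expression is not shown in the thread); `at`, `nextEnd`, and `isActiveAt` are likewise names invented for this sketch.

```go
package main

import (
	"fmt"
	"time"
)

// nextEnd is the end boundary from the example (assumed UTC).
var nextEnd = time.Date(2022, 11, 1, 20, 0, 0, 0, time.UTC)

// at builds a UTC instant in October 2022, the month used in the example.
func at(day, hour, min, sec, ms int) time.Time {
	return time.Date(2022, 10, day, hour, min, sec, ms*int(time.Millisecond), time.UTC)
}

// nextStart returns the first 08:00 strictly after `after` that falls on
// day 28 or later. Hypothetical stand-in for the cron parser's lookup.
func nextStart(after time.Time) time.Time {
	t := time.Date(after.Year(), after.Month(), after.Day(), 8, 0, 0, 0, after.Location())
	for !t.After(after) || t.Day() < 28 {
		t = t.AddDate(0, 0, 1)
	}
	return t
}

// isActiveAt uses one clock reading for both the lookup and the comparison.
func isActiveAt(now, end time.Time) bool {
	start := nextStart(now)
	switch {
	case start.Before(end) && now.Before(start):
		return false
	case !now.After(end):
		return true
	default:
		return false
	}
}

func main() {
	fmt.Println(isActiveAt(at(30, 7, 59, 59, 999), nextEnd)) // just before 08:00 on Oct 30
	fmt.Println(isActiveAt(at(30, 8, 0, 0, 111), nextEnd))   // just after 08:00 on Oct 30
	fmt.Println(isActiveAt(at(31, 8, 0, 0, 111), nextEnd))   // just after 08:00 on Oct 31
}
```

Under these assumptions, both readings on October 30 give the same answer, because the next start always lies after the reading it was derived from; only on October 31 does the next start jump past `nextEnd`, which is what makes the last day active.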
@zou2699 could you please fix the static check problems and add an entry to the changelog? We can merge it then.
dc38527 to b1972db
/run-e2e cron*
fix IsActive in time boundary
For example, when we use this cron configuration to scale pods on the last day of a month,
if runTime is 2022-01-30T7:59:59.9999999, the following might happen:
nextStartTime 2022-01-30T8:00:00.000000
currentTime 2022-01-30T8:00:00.000111
At this point
currentTime < nextStartTime
will be false, so IsActive will return true and trigger an unexpected scaling event.
Fixes: #3854