Scaling is possible during deployment with bad results #23444
Comments
Hi @lattwood! tl;dr for this scenario, you'll probably want to use […].

First, I want to provide a little bit of context on why evaluation ordering works the way it does here. What's more, the order of evaluations for a given job basically doesn't matter at all! An evaluation is just a signal to the scheduler that it has work to do for that job. The broker on the leader ensures that only a single evaluation is being processed for a given job at a time (in parallel among all jobs, of course), and in recent versions of Nomad we load-shed evaluations that we don't need, so that we have one evaluation being processed and one waiting (or blocked). You can see that happened in both your example evals.
Unfortunately that's going to make it very hard to monitor a job deployment via the CLI if you're concurrently and "constantly" pushing new evals into the queue for scaling. Adding to the trouble here, deployments aren't created atomically with job registration. We don't create a deployment until the evaluation hits the scheduler (because otherwise we'd create expensive no-op deployments for in-place updates). So a legal interleaving of operations could be: […]

Or, if the scheduler dequeues the eval before the […], this will definitely allow a job change for scaling to slip in between when a job registration is made and when the actual deployment is created. The right way for us to improve this in Nomad would be to add support for enforcing the job modify index on the scale API. In the meantime, you can probably improve your shell script by checking […].
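The concrete suggestion is truncated in this copy of the comment, so the following is only a hedged guess at the kind of check that helps here: after registering the job, resolve the deployment that belongs to the new job version and watch that specific deployment by ID, rather than relying on eval ordering or the CLI's default monitoring. `example`/`example.nomad.hcl` are made-up names, and this assumes `NOMAD_ADDR` is set, `jq` is installed, and the update is a rolling update that actually creates a deployment.

```bash
#!/usr/bin/env bash
# Hedged sketch only -- not the maintainer's actual (truncated) suggestion.
# Idea: monitor the deployment for the job version we just registered, by ID.
set -euo pipefail
JOB="example"   # hypothetical job name

# Register the new job version without CLI eval monitoring.
nomad job run -detach example.nomad.hcl

# Which job version did that create?
WANT_VERSION=$(curl -s "${NOMAD_ADDR}/v1/job/${JOB}" | jq '.Version')

# Wait until the deployment for *that* version exists (in-place updates never
# create one, so this assumes a rolling update), then poll it by ID.
DEPLOY_ID=""
while [ -z "$DEPLOY_ID" ]; do
  sleep 1
  DEPLOY_ID=$(curl -s "${NOMAD_ADDR}/v1/job/${JOB}/deployment" \
    | jq -r --argjson v "$WANT_VERSION" 'select(.JobVersion == $v) | .ID // empty')
done

while :; do
  STATUS=$(curl -s "${NOMAD_ADDR}/v1/deployment/${DEPLOY_ID}" | jq -r '.Status')
  echo "deployment ${DEPLOY_ID}: ${STATUS}"
  case "$STATUS" in
    running|pending) sleep 5 ;;   # simplified: treat anything else as terminal
    successful)      exit 0 ;;
    *)               exit 1 ;;
  esac
done
```

Watching a concrete deployment ID sidesteps the race described above, because a scaling eval that slips in can no longer change which deployment the script is looking at.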
wow, thanks for the detailed response. Thinking about how to work around this in our CI/CD, we should add […].
PR for the […]:
The RPC handler for scaling a job passes flags to enforce that the job modify index is unchanged when it makes the write to Raft. But it's only checking against the existing job modify index at the time the RPC handler snapshots the state store, so it can only enforce consistency for its own validation.

In clusters with automated scaling, it would be useful to expose the enforce-index options to the API, so that cluster admins can enforce that scaling only happens when the job state is consistent with a state they've previously seen in other API calls. Add this option to the CLI and API and have the RPC handler check them if asked.

Fixes: #23444
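A hedged sketch of how that option might be used from a script, assuming it surfaces as a `-check-index`-style flag on `nomad job scale` (mirroring the existing `nomad job run -check-index` convention); the actual flag name and behavior should be confirmed against the merged PR. `example`/`workers` are made-up names.

```bash
# Hedged sketch: assumes the PR exposes a `-check-index` style flag on
# `nomad job scale`; confirm the real interface against the merged PR.
JOB="example"      # hypothetical job name
GROUP="workers"    # hypothetical task group
COUNT=5

# Capture the job modify index we last observed.
IDX=$(curl -s "${NOMAD_ADDR}/v1/job/${JOB}" | jq '.JobModifyIndex')

# Only scale if the job hasn't been modified since we looked at it;
# otherwise the server should reject the write and we can re-evaluate.
nomad job scale -check-index "$IDX" "$JOB" "$GROUP" "$COUNT" \
  || echo "job changed underneath us; skipping scale"
```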
thanks @tgross!
I'm going to lock this issue because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active issues.
Super Quick tl;dr;
Evaluations have a different sort order when sorted by CreateIndex than when sorted by CreateTime, and this affects deployments of jobs that are being scaled at the same time.
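A quick, hedged way to see the two orderings disagree for a job's evaluations, assuming the `-job` filter and `-json` flags on `nomad eval list` and the `CreateIndex`/`CreateTime` field names from the evaluations API (`example` is a made-up job name):

```bash
# Hedged sketch: compare evaluation ordering by CreateIndex vs CreateTime.
JOB="example"   # hypothetical job name

nomad eval list -job "$JOB" -json \
  | jq -r 'sort_by(.CreateIndex) | .[] | "\(.ID[0:8])  idx=\(.CreateIndex)  time=\(.CreateTime)"' \
  > by_index.txt

nomad eval list -job "$JOB" -json \
  | jq -r 'sort_by(.CreateTime) | .[] | "\(.ID[0:8])  idx=\(.CreateIndex)  time=\(.CreateTime)"' \
  > by_time.txt

# If these differ, Raft order and wall-clock order disagree for this job's evals.
diff by_index.txt by_time.txt || true
```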
Nomad version
Operating system and Environment details
Relevant: we're using a shell script to query Nomad and then run various `nomad job scale` commands. The shell script also checks whether a deployment is already in progress prior to scaling, but as you'll soon see, due to the timeframes involved it wouldn't have helped. A sketch of that guard follows.
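Roughly the kind of pre-scale guard described above, as a hedged reconstruction rather than the actual script; it assumes the `GET /v1/job/:job_id/deployment` endpoint and `jq`, and `example`/`workers` are made-up names:

```bash
# Hedged reconstruction of the pre-scale guard -- not the actual script.
JOB="example"      # hypothetical job name
GROUP="workers"    # hypothetical task group
COUNT=5

# Ask Nomad for the job's most recent deployment; "none" if there isn't one.
STATUS=$(curl -s "${NOMAD_ADDR}/v1/job/${JOB}/deployment" | jq -r '.Status // "none"')

if [ "$STATUS" = "running" ]; then
  echo "deployment in progress for ${JOB}; skipping scale"
else
  nomad job scale "$JOB" "$GROUP" "$COUNT"
fi
```

As the issue goes on to show, a check like this is inherently racy: a deployment can be created between the status read and the scale call.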
Issue
When performing a deployment of a job that is constantly being scaled, instead of getting a failure to scale and the `job scaling blocked due to active deployment` error from Nomad, spooky things happen.

This is the spooky thing:
I go and take a look at the evaluation it created.

Nothing out of the ordinary, save for the `NextEval` field being empty (it would be nice if that was populated in this case). `DeploymentID` is empty, but that might be expected with `job-register`. Anyways, I go into the Nomad UI armed with the timestamp of the evaluation's creation (`CreateTime`), go to the evaluation directly above it in the list, and look at that.

Again, nothing out of the ordinary, right? Sadly, no.
| Evaluation | CreateTime |
| --- | --- |
| `46e24c94` | 3:07:48.345 |
| `42842e2e` | 3:07:48.130 |
An evaluation created at T+0 has a greater raft index than another evaluation created at T+215ms.
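Pulling just these fields for the two evals makes the inversion easy to confirm. A hedged sketch: the short IDs above are prefixes, and this assumes `nomad eval status` accepts a prefix with `-json` the same way the plain form does.

```bash
# Hedged sketch: dump CreateIndex/CreateTime for the two evals referenced above.
for EVAL in 46e24c94 42842e2e; do
  nomad eval status -json "$EVAL" \
    | jq -r '"\(.ID[0:8])  CreateIndex=\(.CreateIndex)  CreateTime=\(.CreateTime)"'
done
```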
Reproduction steps
Expected Result
A `job scaling blocked due to active deployment` error.

Actual Result
Evaluation "46e24c94" finished with status "canceled"
and a zero exit codeWorkaround Thoughts
Right now the only thing I can come up with is taking a lock on a Nomad variable prior to any deployment or scaling activity, but erm, I kinda thought that was Nomad's job. 😅
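For what the variable-lock workaround could look like, here is a hedged sketch of an advisory mutex built on the Variables API check-and-set (`cas`) parameter: create the lock variable only if it doesn't already exist, and delete it to release. The path and item are made up, and the exact request body shape should be checked against the Variables API docs; recent Nomad versions also ship a `nomad var lock` helper that wraps this kind of pattern.

```bash
# Hedged sketch of the variable-lock workaround (path/items are hypothetical).
LOCK_PATH="locks/example-deploy"   # hypothetical variable path

acquire() {
  # cas=0 means "only create the variable if it doesn't exist yet";
  # Nomad returns a conflict status if someone else already holds it.
  curl -s -o /dev/null -w '%{http_code}' -X PUT \
    "${NOMAD_ADDR}/v1/var/${LOCK_PATH}?cas=0" \
    -d "{\"Items\": {\"holder\": \"$(hostname)\"}}"
}

release() {
  curl -s -X DELETE "${NOMAD_ADDR}/v1/var/${LOCK_PATH}"
}

if [ "$(acquire)" = "200" ]; then
  trap release EXIT
  # ... do the deployment or `nomad job scale` here ...
  echo "holding lock; deploy/scale goes here"
else
  echo "another deploy/scale holds the lock; try again later"
fi
```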