225: Add support for exclusive plans #260
Conversation
… change the default behavior of how Job Pods are scheduled: it will create a different PodAntiAffinity rule which will prevent more than one SUC Pod from running on a node at a time. If omitted or explicitly set to false, Job Pods will be scheduled according to the current default behavior. Added a helper function in job.go to evaluate the new bool and return the correct PodAntiAffinity rule.
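For reference, a rule of the kind described here might look roughly like the following Go sketch. The function name and the idea of keying off a label shared by all SUC job pods are illustrative assumptions, not the controller's actual code:

```go
package sketch

import (
	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// onePerNodeAntiAffinity returns a required pod anti-affinity term of the kind
// described above: at most one pod carrying the given label may be scheduled
// on any single node at a time.
func onePerNodeAntiAffinity(labelKey string) *corev1.PodAntiAffinity {
	return &corev1.PodAntiAffinity{
		RequiredDuringSchedulingIgnoredDuringExecution: []corev1.PodAffinityTerm{{
			// "kubernetes.io/hostname" scopes the rule to a single node.
			TopologyKey: corev1.LabelHostname,
			LabelSelector: &metav1.LabelSelector{
				MatchExpressions: []metav1.LabelSelectorRequirement{{
					Key:      labelKey, // e.g. a label shared by all SUC job pods (hypothetical)
					Operator: metav1.LabelSelectorOpExists,
				}},
			},
		}},
	}
}
```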
This looks fine to me, but please consider my comments: I do not think the onePlanPerNode function is necessary, and it obscures both the retention of the existing default behavior and the fact that the added bool on the Plan is opt-in.
pkg/upgrade/job/job.go (Outdated)
				plan.Name,
			},
		}},
		MatchExpressions: onePlanPerNode(plan, node),
I reviewed this before encouraging @jrodonnell to submit his PR, but I didn't spend enough time thinking about this bit (something about PR submissions forces my brain to actually think, or something): I'm not a fan of the function and do not think it necessary besides.
If we leave the default initialization of the single match-expression as-is, so as to indicate that it is the default value/behavior, we can then combine this with a conditional check further down (say, after where we conditionally initialize job.spec.Parallelism) to override the key and values for the match expression when indicated by the new opt-in bool.
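A minimal sketch of that suggestion, assuming the default match expression has already been built and the Plan has gained an opt-in bool; the helper name and the controller-wide label key are assumptions for illustration, not the repository's actual code:

```go
package sketch

import (
	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// overrideForExclusive rewrites the default per-plan match expression so the
// anti-affinity keys off a label shared by all of the controller's job pods
// instead of the per-plan label. It would be called only when the opt-in bool
// on the Plan is set, leaving the default initialization untouched otherwise.
func overrideForExclusive(affinity *corev1.Affinity, controllerName string) {
	if affinity == nil || affinity.PodAntiAffinity == nil {
		return // sketch assumes the default anti-affinity has already been built
	}
	terms := affinity.PodAntiAffinity.RequiredDuringSchedulingIgnoredDuringExecution
	for i := range terms {
		terms[i].LabelSelector.MatchExpressions = []metav1.LabelSelectorRequirement{{
			Key:      "upgrade.cattle.io/controller", // hypothetical controller-wide label key
			Operator: metav1.LabelSelectorOpIn,
			Values:   []string{controllerName},
		}}
	}
}
```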
Yeah, I don't like replacing the entire expression here with a call to onePlanPerNode, as the function is responsible for generating the LabelSelectorRequirement regardless of whether or not the new option is in use. Can we leave this as-is and either mutate or add to the []metav1.LabelSelectorRequirement later, if the option is set?
I agree that makes more sense, as it is only changing the default if the flag is set. I will try to switch this out today.
Pushed this change but left the flag as-is just for now.
New flag name added into the new mutation logic, and the new label added as well. Tested and working in my own cluster: exclusive Pods will wait for each other (but not for non-exclusive Pods) on a per-node basis, and non-exclusive ones will still schedule themselves normally.
pkg/upgrade/job/job.go (Outdated)
@@ -359,3 +353,30 @@ func New(plan *upgradeapiv1.Plan, node *corev1.Node, controllerName string) *bat

	return job
}

func onePlanPerNode(plan *upgradeapiv1.Plan, node *corev1.Node) []metav1.LabelSelectorRequirement {
See my comment in-place where this function is used: I don't think it is necessary.
/cc @brandond @briandowns
I'm confused as to why this is necessary; isn't the documented way to do this via something like the k3s-upgrade "prepare" step, where each subsequent plan knows the name of the previous plan and waits for it to complete before moving into the upgrade phase?
If the desire is to bring this closer to a full workflow engine, with strict sequencing and inter-job dependency tracking built into the controller instead of leaning on the images to figure it out for themselves, just slapping a flag on the Plan that prevents it from running alongside any other Plan doesn't seem like it gets us there?
The gist of this work is a relatively simple tweak to the plan to satisfy the ask in #225, which seeks to prevent multiple plans from applying on a node at the same time (as opposed to the default anti-affinity, which only prevents multiple applies of the same plan per node). While this does allow bad-actor plans to monopolize nodes, it is opt-in behavior at the plan level to declaratively connote that concurrent mutations of the node should not be allowed.
A couple nits on the field name and expression generation logic.
Just to be clear, the intent is for this option to only force exclusivity/serialization among plans with the option set. Only a single "exclusive" or "serialized" plan may run at a time, alongside zero or more "non-exclusive" plans.
I believe the intent was to make an exclusive Plan block any other Plan from running, regardless of any other Plan's settings, for a few reasons. I see this exclusivity as an important statement of fact and intent: running any other Plans while this exclusive one is running is a bad idea, because it will be doing things to the node that could break other Jobs mid-execution and leave the node in an unknown state, and allowing that declaration to be effectively overridden or ignored by other Plans would seem to defeat the purpose.

Under those same assumptions, requiring all Plans to have the flag in order to comply with one Plan's exclusivity requirement would be counterintuitive to me, and would also force all Plans into serial execution even if only one of them actually needed to be exclusive (or force the operator into a two-step process: apply exclusive Plans, wait for them to finish, then apply normal ones). Whereas if exclusive Plans simply block all other Plans while they are running, then once they are done the rest will automatically execute according to the current default scheduling. To me that seems like the most intuitive behavior, one that operators can easily understand and leverage in practice.
Can we make this configurable as part of the controller settings? I feel like this is a major change in the Plan contract that administrators should have to opt into. I think that by default exclusive plans should only block other exclusive plans. If administrators want to allow an exclusive plan to block all other plans, they can enable that in the controller config, with an awareness that this may change the behavior of other plans already deployed to the cluster.
In that case, maybe it does make more sense to think of making Plans serial rather than exclusive. The operator could still treat them as exclusive by letting the serial ones finish before submitting more non-serial Plans. The anti-affinity rule would have to be based on a different label than the one here, probably a new unique one (perhaps …). @davidcassany, as the one who opened the original issue, I am also curious to hear what you think of this discussion: do you think this new flag should force exclusivity of the Plan with the flag and prevent any other Plan from running, or enforce serial scheduling of all Plans with the flag while ones without continue being scheduled as they are now?
I also want to keep in mind that lots of folks don't treat Plans as one-shot operations and delete them afterwards. There may be a standing plan to upgrade to the latest release in a channel, for example. In that case the Plan will finish, but execute again automatically when the channel is updated.
… simple if statement.
That is some good additional context to have; I did not realize that was a common use case. I think it sheds some more light on the original request, which was to provide a way to prevent concurrent Plans from running on a cluster. But long-running or not, I think it's reasonable to expect an operator to tag their Plan appropriately according to whether it is potentially disruptive or sensitive to disruption, so I think I agree with what you're saying. I think this would work: give every Plan Pod a new label, e.g. …
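As a rough illustration of that direction, the sketch below labels an exclusive plan's job pods and appends an anti-affinity term keyed on that label, so only exclusive pods block each other while unlabeled pods keep the default scheduling. The label key and helper name are assumptions for the example, not necessarily the names the PR settled on:

```go
package sketch

import (
	batchv1 "k8s.io/api/batch/v1"
	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// exclusiveLabel is a placeholder key; the real label name is defined by the PR.
const exclusiveLabel = "upgrade.cattle.io/exclusive"

// markExclusive labels the job's pods as exclusive and appends an anti-affinity
// term matching that label, so exclusive pods wait for each other on a node
// while non-exclusive (unlabeled) pods keep the default scheduling behavior.
func markExclusive(job *batchv1.Job) {
	if job.Spec.Template.Labels == nil {
		job.Spec.Template.Labels = map[string]string{}
	}
	job.Spec.Template.Labels[exclusiveLabel] = "true"

	affinity := job.Spec.Template.Spec.Affinity
	if affinity == nil || affinity.PodAntiAffinity == nil {
		return // sketch assumes the controller already built the default anti-affinity
	}
	affinity.PodAntiAffinity.RequiredDuringSchedulingIgnoredDuringExecution = append(
		affinity.PodAntiAffinity.RequiredDuringSchedulingIgnoredDuringExecution,
		corev1.PodAffinityTerm{
			TopologyKey: corev1.LabelHostname, // at most one exclusive pod per node at a time
			LabelSelector: &metav1.LabelSelector{
				MatchExpressions: []metav1.LabelSelectorRequirement{{
					Key:      exclusiveLabel,
					Operator: metav1.LabelSelectorOpExists,
				}},
			},
		},
	)
}
```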
… anti-affinity rule on it.
Co-authored-by: Brad Davidson <[email protected]>
This PR was later referenced by an automated Renovate update PR (#3717) bumping docker.io/rancher/system-upgrade-controller and rancher/system-upgrade-controller from v0.13.1 to v0.13.2; the v0.13.2 release notes include "225: Add support for exclusive plans by @jrodonnell in rancher/system-upgrade-controller#260".
Add a new field to the Plan spec, Exclusive. This field will be useful when the order of execution for a series of Jobs is important, as each exclusive Job will have to complete before the next one is scheduled.
Setting the field to true will change the default behavior of how Job Pods are scheduled: it will create a PodAntiAffinity rule which will prevent more than one exclusive-labeled SUC Job Pod from being scheduled to a node at a time, and will add this label to the Job Pod. If omitted or explicitly set to false, Job Pods will be scheduled according to the current default behavior.
Implements request in #225.
@dweomer