Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Prevent associating multiple ElementalHosts #65

Merged

Conversation

anmazzotti
Copy link
Collaborator

Props to @juadk for finding this corner case.

For context, the ElementalMachine <--> ElementalHost association works like the following, from the ElementalMachineController perspective:

  1. Patch the ElementalHost candidate to reference the ElementalMachine (ElementalHost --> ElementalMachine)
  2. Patch the ElementalMachine to reference the ElementalHost candidate (ElementalMachine --> ElementalHost)

It can happen that 2. fails after 1. succeeds, this will lead to another ElementalMachine reconcile loop, where we are going to find a new ElementalHost. The one already linked is going to be excluded from a followup search, since it's already labeled as associated.

It will look like this, where the same ElementalMachine is associated to 2 (or more) ElementalHosts:

gh-runner@elemental-e2e-ci-379bc19f05ef448dae0f2808ee53cec9:~> kubectl get elementalhosts -A
NAMESPACE   NAME       CLUSTER             MACHINE                                 ELEMENTALMACHINE                        PHASE     READY   AGE
default     node-001   elemental-cluster   elemental-cluster-control-plane-df69z   elemental-cluster-control-plane-x7g5d   Running   True    8m14s
default     node-003   elemental-cluster   elemental-cluster-md-0-tdtqm-pzpk6      elemental-cluster-md-0-tdtqm-pzpk6      Running   True    8m14s
default     node-002   elemental-cluster   elemental-cluster-md-0-tdtqm-pzpk6      elemental-cluster-md-0-tdtqm-pzpk6      Running   True    8m14s

This patch introduces a prior lookup to find any already-linked ElementalHost, and use it as a candidate to finalize association.

@anmazzotti anmazzotti requested a review from a team as a code owner June 17, 2024 12:40
@@ -140,7 +140,7 @@ func setupAllWithManager(k8sManager manager.Manager) {
Client: k8sManager.GetClient(),
Scheme: k8sManager.GetScheme(),
Tracker: remoteTrackerMock,
RequeuePeriod: time.Millisecond,
RequeuePeriod: time.Second,
Copy link
Collaborator Author

@anmazzotti anmazzotti Jun 17, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a random change.
This makes the tests now take ~26seconds in total, from ~24 (on my system), however the logs are a thousand times less spammy when things fail.

@anmazzotti anmazzotti self-assigned this Jun 17, 2024
Copy link
Contributor

@fgiudici fgiudici left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good!

@anmazzotti anmazzotti merged commit f39d1ec into rancher-sandbox:main Jun 19, 2024
6 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants