🐛 fix fail-swap-on=false flag not being part of kind images anymore #8767

Merged: k8s-ci-robot merged 1 commit into kubernetes-sigs:main from pr-fail-swap-on-false on Jun 1, 2023

Conversation

chrischdi
Member

What this PR does / why we need it:

This PR adds a JSON patch for KubeadmControlPlanes and MachineDeployments that sets the kubelet's fail-swap-on flag to false in our ClusterClasses for the release, main and v1.4 tests.

This prevents the kubelet from refusing to start on newer kind images, which no longer specify the flag (because kind now sets the configuration parameter in KubeletConfiguration).
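
For readers unfamiliar with how such a patch acts on a template, here is a minimal Python sketch of a single JSON patch `add` operation applied to a trimmed, hypothetical KubeadmControlPlaneTemplate. The helper `apply_add`, the sample `kcp_template`, and the exact field paths are assumptions for illustration; the actual diff of this PR is not shown in the conversation and may differ.

```python
import copy


def apply_add(doc, path, value):
    """Apply one RFC 6902 'add' operation to nested dicts (maps only, no arrays)."""
    result = copy.deepcopy(doc)
    # JSON Pointer tokens: split on '/', then unescape '~1' -> '/' and '~0' -> '~'.
    keys = [k.replace("~1", "/").replace("~0", "~") for k in path.split("/")[1:]]
    parent = result
    for key in keys[:-1]:
        # Raises KeyError if a parent object is missing, mirroring how a
        # JSON patch 'add' fails when the target location's parent does not exist.
        parent = parent[key]
    parent[keys[-1]] = value
    return result


# Hypothetical, heavily trimmed template; only the fields relevant here.
kcp_template = {
    "spec": {"template": {"spec": {"kubeadmConfigSpec": {
        "initConfiguration": {"nodeRegistration": {"kubeletExtraArgs": {}}},
        "joinConfiguration": {"nodeRegistration": {"kubeletExtraArgs": {}}},
    }}}}
}

BASE = "/spec/template/spec/kubeadmConfigSpec"  # assumed path layout
patched = kcp_template
for section in ("initConfiguration", "joinConfiguration"):
    patched = apply_add(
        patched,
        f"{BASE}/{section}/nodeRegistration/kubeletExtraArgs/fail-swap-on",
        "false",  # kubelet extra args are strings, so the value is "false", not False
    )
```

The sketch returns a patched copy and leaves the input untouched, which makes it easy to compare the template before and after the operation.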

Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
Fixes #8766

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label May 30, 2023
@chrischdi
Member Author

/test help

@k8s-ci-robot
Contributor

@chrischdi: The specified target(s) for /test were not found.
The following commands are available to trigger required jobs:

  • /test pull-cluster-api-build-main
  • /test pull-cluster-api-e2e-main
  • /test pull-cluster-api-test-main
  • /test pull-cluster-api-test-mink8s-main
  • /test pull-cluster-api-verify-main

The following commands are available to trigger optional jobs:

  • /test pull-cluster-api-apidiff-main
  • /test pull-cluster-api-e2e-full-dualstack-ipv6-main
  • /test pull-cluster-api-e2e-full-main
  • /test pull-cluster-api-e2e-informing-ipv6-main
  • /test pull-cluster-api-e2e-informing-main
  • /test pull-cluster-api-e2e-scale-main-experimental
  • /test pull-cluster-api-e2e-workload-upgrade-1-27-latest-main

Use /test all to run the following jobs that were automatically triggered:

  • pull-cluster-api-apidiff-main
  • pull-cluster-api-build-main
  • pull-cluster-api-e2e-informing-ipv6-main
  • pull-cluster-api-e2e-informing-main
  • pull-cluster-api-e2e-main
  • pull-cluster-api-test-main
  • pull-cluster-api-test-mink8s-main
  • pull-cluster-api-verify-main

In response to this:

/test help

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label May 30, 2023
@chrischdi
Member Author

/test pull-cluster-api-e2e-workload-upgrade-1-27-latest-main

@chrischdi
Member Author

Ah damn, test-infra issue ongoing 🤦

@sbueringer
Member

/lgtm

Assuming CI is green once Prow is not overloaded anymore

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label May 30, 2023
@k8s-ci-robot
Contributor

LGTM label has been added.

Git tree hash: f699b1c54d14e8372e72941c4fe6f2b3af6257a6

Contributor

@killianmuldoon killianmuldoon left a comment

Should this be added in the template for the machinePools as well? Seems like those would fail at 1.27+

Otherwise looks good to me.

@chrischdi chrischdi force-pushed the pr-fail-swap-on-false branch from c69395a to a651aed Compare May 30, 2023 19:53
@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label May 30, 2023
@k8s-ci-robot k8s-ci-robot requested a review from sbueringer May 30, 2023 19:53
@chrischdi chrischdi force-pushed the pr-fail-swap-on-false branch from a651aed to faf1cb6 Compare May 30, 2023 19:57
@chrischdi
Member Author

/test pull-cluster-api-e2e-workload-upgrade-1-27-latest-main

@chrischdi
Member Author

/test pull-cluster-api-e2e-full-main
/test

@k8s-ci-robot
Contributor

@chrischdi: The /test command needs one or more targets.
The following commands are available to trigger required jobs:

  • /test pull-cluster-api-build-main
  • /test pull-cluster-api-e2e-main
  • /test pull-cluster-api-test-main
  • /test pull-cluster-api-verify-main

The following commands are available to trigger optional jobs:

  • /test pull-cluster-api-apidiff-main
  • /test pull-cluster-api-e2e-full-dualstack-ipv6-main
  • /test pull-cluster-api-e2e-full-main
  • /test pull-cluster-api-e2e-informing-main
  • /test pull-cluster-api-e2e-ipv6-main
  • /test pull-cluster-api-e2e-mink8s-main
  • /test pull-cluster-api-e2e-scale-main-experimental
  • /test pull-cluster-api-e2e-workload-upgrade-1-27-latest-main
  • /test pull-cluster-api-test-mink8s-main

Use /test all to run the following jobs that were automatically triggered:

  • pull-cluster-api-apidiff-main
  • pull-cluster-api-build-main
  • pull-cluster-api-e2e-informing-main
  • pull-cluster-api-e2e-main
  • pull-cluster-api-test-main
  • pull-cluster-api-verify-main

In response to this:

/test pull-cluster-api-e2e-full-main
/test

@chrischdi
Member Author

/test pull-cluster-api-e2e-informing-main

@sbueringer
Member

sbueringer commented May 31, 2023

Q: We technically have the same issue for v0.4-v1.3, right? We haven't fixed it yet because those tests are not run locally that frequently (?) and it only fails if someone has a Machine where swap is enabled?

Would it make sense to just set it on all versions? If possible I would like to avoid running into this issue again.

@chrischdi
Member Author

chrischdi commented May 31, 2023

We don't have it in ci, because there is no swap enabled there.

For <= v1.3, it depends on which versions of the kind images are used (dunno if a kind node image < v1.27 that faces the same issue will be released; also, v1.27 is not supported in v1.3).
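
Since the failure only shows up on hosts where swap is enabled, a quick local check can tell whether a machine is affected. The following is a minimal sketch, assuming a Linux host where `/proc/swaps` contains a header line followed by one line per active swap area; the function names `swap_enabled` and `host_has_swap` are hypothetical.

```python
def swap_enabled(proc_swaps_text):
    """Return True if the given /proc/swaps content lists at least one swap area."""
    lines = [line for line in proc_swaps_text.splitlines() if line.strip()]
    # The first non-empty line is the column header; every further line
    # describes one active swap area.
    return len(lines) > 1


def host_has_swap(path="/proc/swaps"):
    """Check the local host; assumption: an unreadable file is treated as 'no swap'."""
    try:
        with open(path) as f:
            return swap_enabled(f.read())
    except OSError:
        return False
```

Keeping the parsing separate from the file access means the check can be exercised with sample text on any platform, including CI machines without swap.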

@sbueringer
Member

The change in kind was only done to images for Kubernetes >= 1.27, right? They didn't change the older Kubernetes versions? (just because they are always pushing all versions: https://github.com/kubernetes-sigs/kind/releases/tag/v0.19.0)

@chrischdi
Member Author

chrischdi commented May 31, 2023

The change in kind was only done to images for Kubernetes >= 1.27, right? They didn't change the older Kubernetes versions? (just because they are always pushing all versions: https://github.com/kubernetes-sigs/kind/releases/tag/v0.19.0)

This needs to be researched (but it looks that way).

@sbueringer
Member

Would be good to know before we merge ideally

@sbueringer
Member

Just talked to Killian, what do you think about just setting the swap setting everywhere? Shouldn't hurt and we don't have to do more research.

@chrischdi chrischdi force-pushed the pr-fail-swap-on-false branch from faf1cb6 to 639e201 Compare May 31, 2023 14:29
@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label May 31, 2023
@k8s-ci-robot k8s-ci-robot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Jun 1, 2023
@chrischdi
Member Author

/test help

@k8s-ci-robot
Contributor

@chrischdi: The specified target(s) for /test were not found.
The following commands are available to trigger required jobs:

  • /test pull-cluster-api-build-main
  • /test pull-cluster-api-e2e-main
  • /test pull-cluster-api-test-main
  • /test pull-cluster-api-verify-main

The following commands are available to trigger optional jobs:

  • /test pull-cluster-api-apidiff-main
  • /test pull-cluster-api-e2e-full-dualstack-ipv6-main
  • /test pull-cluster-api-e2e-full-main
  • /test pull-cluster-api-e2e-informing-main
  • /test pull-cluster-api-e2e-ipv6-main
  • /test pull-cluster-api-e2e-mink8s-main
  • /test pull-cluster-api-e2e-scale-main-experimental
  • /test pull-cluster-api-e2e-workload-upgrade-1-26-1-27-krte-exp
  • /test pull-cluster-api-e2e-workload-upgrade-1-26-1-27-kubekins-exp
  • /test pull-cluster-api-e2e-workload-upgrade-1-27-latest-main
  • /test pull-cluster-api-test-mink8s-main

Use /test all to run the following jobs that were automatically triggered:

  • pull-cluster-api-apidiff-main
  • pull-cluster-api-build-main
  • pull-cluster-api-e2e-informing-main
  • pull-cluster-api-e2e-main
  • pull-cluster-api-test-main
  • pull-cluster-api-verify-main

In response to this:

/test help

@chrischdi
Member Author

/test pull-cluster-api-apidiff-main
/test pull-cluster-api-e2e-full-dualstack-ipv6-main
/test pull-cluster-api-e2e-full-main
/test pull-cluster-api-e2e-informing-main
/test pull-cluster-api-e2e-ipv6-main
/test pull-cluster-api-e2e-mink8s-main
/test pull-cluster-api-e2e-scale-main-experimental
/test pull-cluster-api-e2e-workload-upgrade-1-27-latest-main
/test pull-cluster-api-test-mink8s-main

@chrischdi
Member Author

/test pull-cluster-api-e2e-mink8s-main
/test pull-cluster-api-e2e-scale-main-experimental

@k8s-ci-robot
Contributor

@chrischdi: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| pull-cluster-api-e2e-workload-upgrade-1-26-1-27-krte-exp | ea267b8 | link | false | /test pull-cluster-api-e2e-workload-upgrade-1-26-1-27-krte-exp |
| pull-cluster-api-e2e-workload-upgrade-1-26-1-27-kubekins-exp | ea267b8 | link | false | /test pull-cluster-api-e2e-workload-upgrade-1-26-1-27-kubekins-exp |
| pull-cluster-api-e2e-scale-main-experimental | c939389 | link | false | /test pull-cluster-api-e2e-scale-main-experimental |
| pull-cluster-api-e2e-mink8s-main | c939389 | link | false | /test pull-cluster-api-e2e-mink8s-main |

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@sbueringer
Member

@killianmuldoon looks like we have an issue in mink8s. I think we have to clone k/k for this job as well so we can build the node image if necessary

@sbueringer
Member

Or maybe we should rather pin to a 1.24 version that we have a kind image for. I don't think we have to use the latest 1.24, and pinning makes the job quicker.

@killianmuldoon
Contributor

The way to do this is to set extra_refs in the prow job yaml, right? To bring the presubmit to parity with the periodic job if I understand right.

I think it might be better to let this job build instead of trying to constantly keep track of which version we should be using in the job.

@killianmuldoon
Contributor

But because this is specifically a minimum lower bound, I'd also be happy to pin it to something like 1.24.2 and just never bump it without reason until we officially bump our minimum supported K8s version.

@sbueringer
Member

sbueringer commented Jun 1, 2023

Talked to Killian. We would pin to v1.24.13. Overall it shouldn't matter too much, but with the highest available patch version we probably have better compatibility with the kind version we use, and Kubernetes has more bugfixes :).

There is not much benefit in building the latest 1.24, as the goal is to roughly test against "min". Always building would also just take more time and resources.

@sbueringer
Member

Thx!

/lgtm
/approve

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jun 1, 2023
@k8s-ci-robot
Contributor

LGTM label has been added.

Git tree hash: 8fc4ba71427a55f530303262b4602e475ce6f97c

@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: sbueringer

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jun 1, 2023
@k8s-ci-robot k8s-ci-robot merged commit 9be885c into kubernetes-sigs:main Jun 1, 2023
@k8s-ci-robot k8s-ci-robot added this to the v1.5 milestone Jun 1, 2023
@chrischdi
Member Author

@sbueringer : something worth cherry-picking? I think so!

Contributor

@killianmuldoon killianmuldoon left a comment

@sbueringer : something worth cherry-picking? I think so!

Probably needs to be done manually - let's see 😄

@killianmuldoon
Contributor

/cherry-pick release-1.4

@k8s-infra-cherrypick-robot

@killianmuldoon: #8767 failed to apply on top of branch "release-1.4":

Applying: test/e2e fix fail-swap-on=false flag not being part of kind images anymore
Using index info to reconstruct a base tree...
M	test/e2e/data/infrastructure-docker/main/clusterclass-quick-start.yaml
A	test/e2e/data/infrastructure-docker/v1.4/bases/cluster-with-kcp.yaml
A	test/e2e/data/infrastructure-docker/v1.4/bases/md.yaml
A	test/e2e/data/infrastructure-docker/v1.4/bases/mp.yaml
A	test/e2e/data/infrastructure-docker/v1.4/clusterclass-quick-start.yaml
Falling back to patching base and 3-way merge...
CONFLICT (modify/delete): test/e2e/data/infrastructure-docker/v1.4/bases/mp.yaml deleted in HEAD and modified in test/e2e fix fail-swap-on=false flag not being part of kind images anymore. Version test/e2e fix fail-swap-on=false flag not being part of kind images anymore of test/e2e/data/infrastructure-docker/v1.4/bases/mp.yaml left in tree.
Auto-merging test/e2e/data/infrastructure-docker/v1.2/clusterclass-quick-start.yaml
CONFLICT (content): Merge conflict in test/e2e/data/infrastructure-docker/v1.2/clusterclass-quick-start.yaml
Auto-merging test/e2e/data/infrastructure-docker/v1.2/bases/md.yaml
Auto-merging test/e2e/data/infrastructure-docker/v1.2/bases/cluster-with-kcp.yaml
Auto-merging test/e2e/data/infrastructure-docker/main/clusterclass-quick-start.yaml
error: Failed to merge in the changes.
hint: Use 'git am --show-current-patch=diff' to see the failed patch
Patch failed at 0001 test/e2e fix fail-swap-on=false flag not being part of kind images anymore
When you have resolved this problem, run "git am --continue".
If you prefer to skip this patch, run "git am --skip" instead.
To restore the original branch and stop patching, run "git am --abort".

In response to this:

/cherry-pick release-1.4

@sbueringer
Member

Fine for me!

@chrischdi
Member Author

/area provider/infrastructure-docker

@k8s-ci-robot k8s-ci-robot added the area/provider/infrastructure-docker Issues or PRs related to the docker infrastructure provider label Jun 1, 2023
@chrischdi
Member Author

/retitle 🐛 fix fail-swap-on=false flag not being part of kind images anymore

@k8s-ci-robot k8s-ci-robot changed the title 🐛 test/e2e fix fail-swap-on=false flag not being part of kind images anymore 🐛 fix fail-swap-on=false flag not being part of kind images anymore Jun 1, 2023
@chrischdi chrischdi deleted the pr-fail-swap-on-false branch June 1, 2023 12:54
Labels
  • approved: Indicates a PR has been approved by an approver from all required OWNERS files.
  • area/provider/infrastructure-docker: Issues or PRs related to the docker infrastructure provider
  • cncf-cla: yes: Indicates the PR's author has signed the CNCF CLA.
  • lgtm: "Looks good to me", indicates that a PR is ready to be merged.
  • size/M: Denotes a PR that changes 30-99 lines, ignoring generated files.
Development

Successfully merging this pull request may close these issues.

Creating a v1.27.2 cluster on swap enabled host fails
5 participants