Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow opting out of frequent etcd flushes to storage #1266

Merged
merged 1 commit into from
Sep 16, 2024

Conversation

akalenyu
Copy link
Contributor

@akalenyu akalenyu commented Sep 5, 2024

What this PR does / why we need it:
etcd allows opting out of WAL [0] and data [1] fsyncs
(WAL records describe the commit changes and are being flushed to permanent storage very frequently)
which could significantly decrease the load on the worker pod filesystem.
This is (arguably) equivalent in terms of disasters to etcd in memory,
and has seemed to work well for other large projects [1].

As this is behind an env var, and our CI clusters are ephemeral,
there's no harm in evaluating the use of this, in hopes of squeezing
out the maximum of the resources available to us
(non-dedicated storage per worker)

[0] https://www.postgresql.org/docs/current/wal-intro.html
[1] etcd-io/etcd#18357 (comment)
[2] cert-manager/cert-manager@057ce50

Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
Fixes #

Special notes for your reviewer:

Checklist

This checklist is not enforcing, but it's a reminder of items that could be relevant to every PR.
Approvers are expected to review this list.

Release note:

add a configurable for opting out of frequent etcd flushes

@kubevirt-bot kubevirt-bot added the dco-signoff: yes Indicates the PR's author has DCO signed all their commits. label Sep 5, 2024
@akalenyu
Copy link
Contributor Author

akalenyu commented Sep 8, 2024

/retest

@mhenriks
Copy link
Member

mhenriks commented Sep 9, 2024

/lgtm

@kubevirt-bot kubevirt-bot added the lgtm Indicates that a PR is ready to be merged. label Sep 9, 2024
etcd allows opting out of WAL [0] fsyncs
(WAL records describe the commit changes and are being flushed to permanent storage very frequently)
which could significantly decrease the load on the worker pod filesystem.
This is (arguably) equivalent in terms of disasters to etcd in memory,
and has seemed to work well for other large projects [1].

As this is behind an env var, and our CI clusters are ephemeral,
there's no harm in evaluating the use of this, in hopes of squeezing
out the maximum of the resources we available to us.
(non-dedicated storage per worker)

[0] https://www.postgresql.org/docs/current/wal-intro.html
[1] cert-manager/cert-manager@057ce50

Signed-off-by: Alex Kalenyuk <[email protected]>
@kubevirt-bot kubevirt-bot removed the lgtm Indicates that a PR is ready to be merged. label Sep 12, 2024
@akalenyu
Copy link
Contributor Author

/lgtm

Just rebased

@kubevirt-bot
Copy link
Contributor

kubevirt-bot commented Sep 12, 2024

@akalenyu: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
check-up-kind-1.27-vgpu 51b02e0 link false /test check-up-kind-1.27-vgpu

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

Copy link
Member

@brianmcarey brianmcarey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/approve

Thanks @akalenyu - this should be ok to add now as it is only enabled when the env variable is set.

@kubevirt-bot kubevirt-bot added the lgtm Indicates that a PR is ready to be merged. label Sep 16, 2024
@kubevirt-bot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: brianmcarey

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@kubevirt-bot kubevirt-bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Sep 16, 2024
@kubevirt-bot kubevirt-bot merged commit eff1926 into kubevirt:main Sep 16, 2024
9 of 10 checks passed
kubevirt-bot added a commit to kubevirt-bot/kubevirt that referenced this pull request Sep 17, 2024
[fc9be0a Fix prometheus port](kubevirt/kubevirtci#1272)
[eff1926 Allow opting out of frequent etcd flushes to storage](kubevirt/kubevirtci#1266)
[939e610 fix the busybox netcat not found issue in fedora-test-tooling for s390x.](kubevirt/kubevirtci#1268)
[4ad94f0 Label rook storage class](kubevirt/kubevirtci#1267)
[88a4f6b fix: Wait until istio cni files appear before copying them](kubevirt/kubevirtci#1262)
[4cc1018 Run ./hack/bump-cdi.sh](kubevirt/kubevirtci#1249)
[6149a01 Include manifests that are part of gocli when checking for prepull images](kubevirt/kubevirtci#1265)
[1b17b20 fix: Remove extra slash in file path and remove unneeded sed](kubevirt/kubevirtci#1258)
[b35649a Automatic bump of CentOS Stream to latest](kubevirt/kubevirtci#1257)
[374b5e9 bug: Move namespace and CRD files to names with higher alphabetical order to be created first](kubevirt/kubevirtci#1255)
[79bfd07 Run bazel run //robots/cmd/kubevirtci-bumper:kubevirtci-bumper -- -ensure-only-latest-three --k8s-provider-dir /home/prow/go/src/github.com/kubevirt/project-infra/../kubevirtci/cluster-provision/k8s --cluster-up-dir /home/prow/go/src/github.com/kubevirt/project-infra/../kubevirtci/cluster-up/cluster](kubevirt/kubevirtci#1256)
[ba145b2 bug: Fix KSM flag passing to the gocli](kubevirt/kubevirtci#1254)
[8501d22 Opts package](kubevirt/kubevirtci#1217)
[4f37d07 Automatic bump of CentOS Stream to latest](kubevirt/kubevirtci#1250)

```release-note
NONE
```

Signed-off-by: kubevirt-bot <[email protected]>
kubevirt-bot added a commit to kubevirt-bot/kubevirt that referenced this pull request Sep 17, 2024
[fc9be0a Fix prometheus port](kubevirt/kubevirtci#1272)
[eff1926 Allow opting out of frequent etcd flushes to storage](kubevirt/kubevirtci#1266)
[939e610 fix the busybox netcat not found issue in fedora-test-tooling for s390x.](kubevirt/kubevirtci#1268)
[4ad94f0 Label rook storage class](kubevirt/kubevirtci#1267)
[88a4f6b fix: Wait until istio cni files appear before copying them](kubevirt/kubevirtci#1262)
[4cc1018 Run ./hack/bump-cdi.sh](kubevirt/kubevirtci#1249)
[6149a01 Include manifests that are part of gocli when checking for prepull images](kubevirt/kubevirtci#1265)
[1b17b20 fix: Remove extra slash in file path and remove unneeded sed](kubevirt/kubevirtci#1258)
[b35649a Automatic bump of CentOS Stream to latest](kubevirt/kubevirtci#1257)
[374b5e9 bug: Move namespace and CRD files to names with higher alphabetical order to be created first](kubevirt/kubevirtci#1255)
[79bfd07 Run bazel run //robots/cmd/kubevirtci-bumper:kubevirtci-bumper -- -ensure-only-latest-three --k8s-provider-dir /home/prow/go/src/github.com/kubevirt/project-infra/../kubevirtci/cluster-provision/k8s --cluster-up-dir /home/prow/go/src/github.com/kubevirt/project-infra/../kubevirtci/cluster-up/cluster](kubevirt/kubevirtci#1256)
[ba145b2 bug: Fix KSM flag passing to the gocli](kubevirt/kubevirtci#1254)
[8501d22 Opts package](kubevirt/kubevirtci#1217)
[4f37d07 Automatic bump of CentOS Stream to latest](kubevirt/kubevirtci#1250)

```release-note
NONE
```

Signed-off-by: kubevirt-bot <[email protected]>
kubevirt-bot added a commit to kubevirt-bot/kubevirt that referenced this pull request Sep 18, 2024
[fc9be0a Fix prometheus port](kubevirt/kubevirtci#1272)
[eff1926 Allow opting out of frequent etcd flushes to storage](kubevirt/kubevirtci#1266)
[939e610 fix the busybox netcat not found issue in fedora-test-tooling for s390x.](kubevirt/kubevirtci#1268)
[4ad94f0 Label rook storage class](kubevirt/kubevirtci#1267)
[88a4f6b fix: Wait until istio cni files appear before copying them](kubevirt/kubevirtci#1262)
[4cc1018 Run ./hack/bump-cdi.sh](kubevirt/kubevirtci#1249)
[6149a01 Include manifests that are part of gocli when checking for prepull images](kubevirt/kubevirtci#1265)
[1b17b20 fix: Remove extra slash in file path and remove unneeded sed](kubevirt/kubevirtci#1258)
[b35649a Automatic bump of CentOS Stream to latest](kubevirt/kubevirtci#1257)
[374b5e9 bug: Move namespace and CRD files to names with higher alphabetical order to be created first](kubevirt/kubevirtci#1255)
[79bfd07 Run bazel run //robots/cmd/kubevirtci-bumper:kubevirtci-bumper -- -ensure-only-latest-three --k8s-provider-dir /home/prow/go/src/github.com/kubevirt/project-infra/../kubevirtci/cluster-provision/k8s --cluster-up-dir /home/prow/go/src/github.com/kubevirt/project-infra/../kubevirtci/cluster-up/cluster](kubevirt/kubevirtci#1256)
[ba145b2 bug: Fix KSM flag passing to the gocli](kubevirt/kubevirtci#1254)
[8501d22 Opts package](kubevirt/kubevirtci#1217)
[4f37d07 Automatic bump of CentOS Stream to latest](kubevirt/kubevirtci#1250)

```release-note
NONE
```

Signed-off-by: kubevirt-bot <[email protected]>
nirdothan pushed a commit to nirdothan/kubevirt that referenced this pull request Sep 25, 2024
[fc9be0a Fix prometheus port](kubevirt/kubevirtci#1272)
[eff1926 Allow opting out of frequent etcd flushes to storage](kubevirt/kubevirtci#1266)
[939e610 fix the busybox netcat not found issue in fedora-test-tooling for s390x.](kubevirt/kubevirtci#1268)
[4ad94f0 Label rook storage class](kubevirt/kubevirtci#1267)
[88a4f6b fix: Wait until istio cni files appear before copying them](kubevirt/kubevirtci#1262)
[4cc1018 Run ./hack/bump-cdi.sh](kubevirt/kubevirtci#1249)
[6149a01 Include manifests that are part of gocli when checking for prepull images](kubevirt/kubevirtci#1265)
[1b17b20 fix: Remove extra slash in file path and remove unneeded sed](kubevirt/kubevirtci#1258)
[b35649a Automatic bump of CentOS Stream to latest](kubevirt/kubevirtci#1257)
[374b5e9 bug: Move namespace and CRD files to names with higher alphabetical order to be created first](kubevirt/kubevirtci#1255)
[79bfd07 Run bazel run //robots/cmd/kubevirtci-bumper:kubevirtci-bumper -- -ensure-only-latest-three --k8s-provider-dir /home/prow/go/src/github.com/kubevirt/project-infra/../kubevirtci/cluster-provision/k8s --cluster-up-dir /home/prow/go/src/github.com/kubevirt/project-infra/../kubevirtci/cluster-up/cluster](kubevirt/kubevirtci#1256)
[ba145b2 bug: Fix KSM flag passing to the gocli](kubevirt/kubevirtci#1254)
[8501d22 Opts package](kubevirt/kubevirtci#1217)
[4f37d07 Automatic bump of CentOS Stream to latest](kubevirt/kubevirtci#1250)

```release-note
NONE
```

Signed-off-by: kubevirt-bot <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. dco-signoff: yes Indicates the PR's author has DCO signed all their commits. lgtm Indicates that a PR is ready to be merged. size/M
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants