-
Notifications
You must be signed in to change notification settings - Fork 123
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Allow opting out of frequent etcd flushes to storage #1266
Conversation
/retest |
/lgtm |
etcd allows opting out of WAL [0] fsyncs (WAL records describe the commit changes and are being flushed to permanent storage very frequently) which could significantly decrease the load on the worker pod filesystem. This is (arguably) equivalent in terms of disasters to etcd in memory, and has seemed to work well for other large projects [1]. As this is behind an env var, and our CI clusters are ephemeral, there's no harm in evaluating the use of this, in hopes of squeezing out the maximum of the resources we available to us. (non-dedicated storage per worker) [0] https://www.postgresql.org/docs/current/wal-intro.html [1] cert-manager/cert-manager@057ce50 Signed-off-by: Alex Kalenyuk <[email protected]>
6a5cae1
to
51b02e0
Compare
Just rebased |
@akalenyu: The following test failed, say
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/approve
Thanks @akalenyu - this should be ok to add now as it is only enabled when the env variable is set.
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: brianmcarey The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
[fc9be0a Fix prometheus port](kubevirt/kubevirtci#1272) [eff1926 Allow opting out of frequent etcd flushes to storage](kubevirt/kubevirtci#1266) [939e610 fix the busybox netcat not found issue in fedora-test-tooling for s390x.](kubevirt/kubevirtci#1268) [4ad94f0 Label rook storage class](kubevirt/kubevirtci#1267) [88a4f6b fix: Wait until istio cni files appear before copying them](kubevirt/kubevirtci#1262) [4cc1018 Run ./hack/bump-cdi.sh](kubevirt/kubevirtci#1249) [6149a01 Include manifests that are part of gocli when checking for prepull images](kubevirt/kubevirtci#1265) [1b17b20 fix: Remove extra slash in file path and remove unneeded sed](kubevirt/kubevirtci#1258) [b35649a Automatic bump of CentOS Stream to latest](kubevirt/kubevirtci#1257) [374b5e9 bug: Move namespace and CRD files to names with higher alphabetical order to be created first](kubevirt/kubevirtci#1255) [79bfd07 Run bazel run //robots/cmd/kubevirtci-bumper:kubevirtci-bumper -- -ensure-only-latest-three --k8s-provider-dir /home/prow/go/src/github.com/kubevirt/project-infra/../kubevirtci/cluster-provision/k8s --cluster-up-dir /home/prow/go/src/github.com/kubevirt/project-infra/../kubevirtci/cluster-up/cluster](kubevirt/kubevirtci#1256) [ba145b2 bug: Fix KSM flag passing to the gocli](kubevirt/kubevirtci#1254) [8501d22 Opts package](kubevirt/kubevirtci#1217) [4f37d07 Automatic bump of CentOS Stream to latest](kubevirt/kubevirtci#1250) ```release-note NONE ``` Signed-off-by: kubevirt-bot <[email protected]>
[fc9be0a Fix prometheus port](kubevirt/kubevirtci#1272) [eff1926 Allow opting out of frequent etcd flushes to storage](kubevirt/kubevirtci#1266) [939e610 fix the busybox netcat not found issue in fedora-test-tooling for s390x.](kubevirt/kubevirtci#1268) [4ad94f0 Label rook storage class](kubevirt/kubevirtci#1267) [88a4f6b fix: Wait until istio cni files appear before copying them](kubevirt/kubevirtci#1262) [4cc1018 Run ./hack/bump-cdi.sh](kubevirt/kubevirtci#1249) [6149a01 Include manifests that are part of gocli when checking for prepull images](kubevirt/kubevirtci#1265) [1b17b20 fix: Remove extra slash in file path and remove unneeded sed](kubevirt/kubevirtci#1258) [b35649a Automatic bump of CentOS Stream to latest](kubevirt/kubevirtci#1257) [374b5e9 bug: Move namespace and CRD files to names with higher alphabetical order to be created first](kubevirt/kubevirtci#1255) [79bfd07 Run bazel run //robots/cmd/kubevirtci-bumper:kubevirtci-bumper -- -ensure-only-latest-three --k8s-provider-dir /home/prow/go/src/github.com/kubevirt/project-infra/../kubevirtci/cluster-provision/k8s --cluster-up-dir /home/prow/go/src/github.com/kubevirt/project-infra/../kubevirtci/cluster-up/cluster](kubevirt/kubevirtci#1256) [ba145b2 bug: Fix KSM flag passing to the gocli](kubevirt/kubevirtci#1254) [8501d22 Opts package](kubevirt/kubevirtci#1217) [4f37d07 Automatic bump of CentOS Stream to latest](kubevirt/kubevirtci#1250) ```release-note NONE ``` Signed-off-by: kubevirt-bot <[email protected]>
[fc9be0a Fix prometheus port](kubevirt/kubevirtci#1272) [eff1926 Allow opting out of frequent etcd flushes to storage](kubevirt/kubevirtci#1266) [939e610 fix the busybox netcat not found issue in fedora-test-tooling for s390x.](kubevirt/kubevirtci#1268) [4ad94f0 Label rook storage class](kubevirt/kubevirtci#1267) [88a4f6b fix: Wait until istio cni files appear before copying them](kubevirt/kubevirtci#1262) [4cc1018 Run ./hack/bump-cdi.sh](kubevirt/kubevirtci#1249) [6149a01 Include manifests that are part of gocli when checking for prepull images](kubevirt/kubevirtci#1265) [1b17b20 fix: Remove extra slash in file path and remove unneeded sed](kubevirt/kubevirtci#1258) [b35649a Automatic bump of CentOS Stream to latest](kubevirt/kubevirtci#1257) [374b5e9 bug: Move namespace and CRD files to names with higher alphabetical order to be created first](kubevirt/kubevirtci#1255) [79bfd07 Run bazel run //robots/cmd/kubevirtci-bumper:kubevirtci-bumper -- -ensure-only-latest-three --k8s-provider-dir /home/prow/go/src/github.com/kubevirt/project-infra/../kubevirtci/cluster-provision/k8s --cluster-up-dir /home/prow/go/src/github.com/kubevirt/project-infra/../kubevirtci/cluster-up/cluster](kubevirt/kubevirtci#1256) [ba145b2 bug: Fix KSM flag passing to the gocli](kubevirt/kubevirtci#1254) [8501d22 Opts package](kubevirt/kubevirtci#1217) [4f37d07 Automatic bump of CentOS Stream to latest](kubevirt/kubevirtci#1250) ```release-note NONE ``` Signed-off-by: kubevirt-bot <[email protected]>
[fc9be0a Fix prometheus port](kubevirt/kubevirtci#1272) [eff1926 Allow opting out of frequent etcd flushes to storage](kubevirt/kubevirtci#1266) [939e610 fix the busybox netcat not found issue in fedora-test-tooling for s390x.](kubevirt/kubevirtci#1268) [4ad94f0 Label rook storage class](kubevirt/kubevirtci#1267) [88a4f6b fix: Wait until istio cni files appear before copying them](kubevirt/kubevirtci#1262) [4cc1018 Run ./hack/bump-cdi.sh](kubevirt/kubevirtci#1249) [6149a01 Include manifests that are part of gocli when checking for prepull images](kubevirt/kubevirtci#1265) [1b17b20 fix: Remove extra slash in file path and remove unneeded sed](kubevirt/kubevirtci#1258) [b35649a Automatic bump of CentOS Stream to latest](kubevirt/kubevirtci#1257) [374b5e9 bug: Move namespace and CRD files to names with higher alphabetical order to be created first](kubevirt/kubevirtci#1255) [79bfd07 Run bazel run //robots/cmd/kubevirtci-bumper:kubevirtci-bumper -- -ensure-only-latest-three --k8s-provider-dir /home/prow/go/src/github.com/kubevirt/project-infra/../kubevirtci/cluster-provision/k8s --cluster-up-dir /home/prow/go/src/github.com/kubevirt/project-infra/../kubevirtci/cluster-up/cluster](kubevirt/kubevirtci#1256) [ba145b2 bug: Fix KSM flag passing to the gocli](kubevirt/kubevirtci#1254) [8501d22 Opts package](kubevirt/kubevirtci#1217) [4f37d07 Automatic bump of CentOS Stream to latest](kubevirt/kubevirtci#1250) ```release-note NONE ``` Signed-off-by: kubevirt-bot <[email protected]>
What this PR does / why we need it:
etcd allows opting out of WAL [0] and data [1] fsyncs
(WAL records describe the commit changes and are being flushed to permanent storage very frequently)
which could significantly decrease the load on the worker pod filesystem.
This is (arguably) equivalent in terms of disasters to etcd in memory,
and has seemed to work well for other large projects [1].
As this is behind an env var, and our CI clusters are ephemeral,
there's no harm in evaluating the use of this, in hopes of squeezing
out the maximum of the resources available to us
(non-dedicated storage per worker)
[0] https://www.postgresql.org/docs/current/wal-intro.html
[1] etcd-io/etcd#18357 (comment)
[2] cert-manager/cert-manager@057ce50
Which issue(s) this PR fixes (optional, in
fixes #<issue number>(, fixes #<issue_number>, ...)
format, will close the issue(s) when PR gets merged):Fixes #
Special notes for your reviewer:
Checklist
This checklist is not enforcing, but it's a reminder of items that could be relevant to every PR.
Approvers are expected to review this list.
Release note: