Full snapshot lease update retry on failure #711
Merged: ishan16696 merged 18 commits into gardener:master from anveshreddy18:bug/full-snapshot-lease-update on Mar 15, 2024
gardener-robot added the needs/review and size/m labels on Jan 30, 2024
anveshreddy18 added the kind/bug label and removed the size/m label on Jan 30, 2024
gardener-robot-ci-2 added the reviewed/ok-to-test and needs/ok-to-test labels and removed the reviewed/ok-to-test label on Jan 30, 2024
gardener-robot added the size/m label on Jan 31, 2024
gardener-robot-ci-2 added and removed the reviewed/ok-to-test label on Jan 31, 2024
ishan16696 reviewed on Jan 31, 2024
gardener-robot-ci-1 added and removed the reviewed/ok-to-test label on Feb 1, 2024
gardener-robot added the size/l and needs/second-opinion labels and removed the size/m label on Feb 6, 2024
gardener-robot-ci-1 added and removed the reviewed/ok-to-test label on Feb 6, 2024
anveshreddy18 force-pushed the bug/full-snapshot-lease-update branch from ba80618 to f696a32 on February 6, 2024 06:25
gardener-robot-ci-1 added and removed the reviewed/ok-to-test label on Feb 6, 2024
anveshreddy18 added the reviewed/ok-to-test label on Feb 6, 2024
gardener-robot-ci-2 removed the reviewed/ok-to-test label on Feb 6, 2024
gardener-robot-ci-1 added and removed the reviewed/ok-to-test label on Feb 6, 2024
ishan16696 requested changes on Feb 8, 2024
gardener-robot-ci-2 added the reviewed/ok-to-test label on Mar 13, 2024
gardener-robot-ci-3 removed the reviewed/ok-to-test label on Mar 13, 2024
gardener-robot-ci-3 added and removed the reviewed/ok-to-test label on Mar 13, 2024
gardener-robot-ci-3 added and removed the reviewed/ok-to-test label on Mar 13, 2024
gardener-robot-ci-2 added the reviewed/ok-to-test label on Mar 14, 2024
gardener-robot-ci-1 removed the reviewed/ok-to-test label on Mar 14, 2024
anveshreddy18 force-pushed the bug/full-snapshot-lease-update branch from e133c02 to 4a797a1 on March 14, 2024 12:52
gardener-robot-ci-1 added and removed the reviewed/ok-to-test label on Mar 14, 2024
ishan16696 requested changes on Mar 15, 2024
Overall looks good, just few nits.
anveshreddy18 force-pushed the bug/full-snapshot-lease-update branch from 4a797a1 to 883a63c on March 15, 2024 03:43
gardener-robot-ci-3 added the reviewed/ok-to-test label on Mar 15, 2024
gardener-robot-ci-1 removed the reviewed/ok-to-test label on Mar 15, 2024
ishan16696 approved these changes on Mar 15, 2024
LGTM!!
renormalize approved these changes on Mar 15, 2024
gardener-robot added the status/closed label on Mar 15, 2024
This was referenced Jul 1, 2024
renormalize pushed a commit to renormalize/etcd-backup-restore that referenced this pull request on Jul 4, 2024
* Full snapshot lease update retry on failure
* nit changes
* Address review comments by @ishan16696
* Added unit tests for RenewFullSnapshotLeasePeriodically() func
* check unit tests on prow
* Address review comments
* minor change in logs
* Add a snapshotter method to set lease update interval
* nit change
* Address review comments
* Resolve unit tests failure
* Improve interval time for unit tests
* nit change
* Address review comments by @ishan16696
* Address review comments by @ishan16696
* make tests pass
* nit change
* Address review comments by @ishan16696
Labels
kind/bug: Bug
needs/changes: Needs (more) changes
needs/ok-to-test: Needs approval for testing (check PR in detail before setting this label because PR is run on CI/CD)
needs/review: Needs review
needs/second-opinion: Needs second review by someone else
size/l: Size of pull request is large (see gardener-robot robot/bots/size.py)
status/closed: Issue is closed (either delivered or triaged)
What this PR does / why we need it:
Currently, the full snapshot lease is updated only when a full snapshot (scheduled or out-of-schedule) is taken. Because a full snapshot is taken every 24 hours (configurable), a failed lease update is not retried until the next full snapshot, which leaves the lease stale for a long time. The resulting problem is well documented in this issue by @unmarshall, thanks for that!
This PR retries the full snapshot lease update periodically, at an interval defined by `FullSnapshotLeaseUpdateInterval` in the `snapshotter.healthConfig`. The retries stop once the lease is up to date, so that no unnecessary calls are made to the API server. This keeps the full snapshot lease up to date most of the time.
Which issue(s) this PR fixes:
Fixes #678
NOTE: With etcd-druid PR#764 merged, `fullSnapshotLeaseUpdateInterval` can be configured from the Etcd YAML.
Special notes for your reviewer:
To test this with a kind setup, remove the `get` verb for leases from the role used by the `etcd-test` service account and trigger a full snapshot. The snapshotter can no longer fetch the lease, so the lease update fails; check the backup-restore logs to confirm. Then add `get` back and watch the lease get updated on the next lease update attempt, after which the periodic retry stops. Tip: decrease `FullSnapshotLeaseUpdateInterval` to 1 minute to make this process faster.
Release note: