Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow download retrying if the computed and configured checksums differ. #4455

Merged
merged 1 commit into from
Dec 29, 2024

Conversation

jsfakian
Copy link
Contributor

@jsfakian jsfakian commented Dec 3, 2024

  • We have identified that downloading images under poor networking conditions from datastores might fail silently due to the SDK libraries we use for for Azure and AWS. This happens when we are installing new applications.
  • Another observation we made is that when we manually try re-installing the applications, they eventually succeed in the installation.
  • With this patch, we automate retrying the image download a fixed number of times (5 by default) if the computed versus the configure checksums differ.

@jsfakian jsfakian force-pushed the BUG-CI349-checksum-mismatch branch from c8cf95f to 213e610 Compare December 3, 2024 10:30
@github-actions github-actions bot requested a review from OhmSpectator December 3, 2024 10:31
if config.RefCount != 0 && (status.HasError() || status.State != types.DOWNLOADED) {
// or status is not downloaded or retry then do install
if config.RefCount != 0 && (status.HasError() ||
status.State != types.DOWNLOADED || config.DoRetry) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The State != types.DOWNLOADED condition looks suspicious to me. Isn't it too vague? Shouldn't it be at least status.State < types.DOWNLOADED && ... or even status.State == DOWNLOADING && ... ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was like that before. Should we change it?

@jsfakian jsfakian force-pushed the BUG-CI349-checksum-mismatch branch from 213e610 to 9884a04 Compare December 10, 2024 15:39
@jsfakian jsfakian force-pushed the BUG-CI349-checksum-mismatch branch from 9884a04 to 9110498 Compare December 10, 2024 15:41
@jsfakian jsfakian force-pushed the BUG-CI349-checksum-mismatch branch from 9110498 to 619ad71 Compare December 13, 2024 15:49
@jsfakian jsfakian force-pushed the BUG-CI349-checksum-mismatch branch from 619ad71 to 9d1f5ae Compare December 16, 2024 11:24
- We have identified that downloading images under poor networking conditions
from datastores might fail silently due to the SDK libraries we use for
for Azure and AWS. This happens when we are installing new applications.
- Another observation we made is that when we manually try re-installing
the applications, they eventually succeed in the installation.
- With this patch, we automate retrying the image download a fixed number
of times (5 by default) if the computed versus the configure checksums differ.

Signed-off-by: Ioannis Sfakianakis <[email protected]>
@jsfakian jsfakian force-pushed the BUG-CI349-checksum-mismatch branch from 9d1f5ae to 99af741 Compare December 20, 2024 12:39
Comment on lines +415 to +417
// Retrying the download due to image verification failure
// This happens when: RefCount and retryCount is non-zero,
// and DownloaderStatus state is "downloaded"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this comment move up to where the new code is?

Copy link
Contributor

@eriknordmark eriknordmark left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM assuming you verified that the effect of moving the HasRef inside the function doesn't change anything i.e., that all the call sites to MaybeRemove cleared the boolean before your change.

@eriknordmark eriknordmark merged commit f63bb92 into lf-edge:master Dec 29, 2024
65 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants