Implement Bulk Deletes for GCS Repository #41368

Merged: original-brownbear merged 12 commits into elastic:master from original-brownbear:gcs-batch-delete on Apr 30, 2019.
Commits (12, all by original-brownbear):
bc18a17 Implement Bulk Deletes for GCS Repository
b971752 CR: no need for manual batching
1ae7460 Merge remote-tracking branch 'elastic/master' into gcs-batch-delete
ae58d9d Merge remote-tracking branch 'elastic/master' into gcs-batch-delete
a029e7b CR: use low level delete to not have to stat blobs
4199c2a Merge remote-tracking branch 'elastic/master' into gcs-batch-delete
77e5875 CR: rename method and better mock
d5a05fa CR: collect exceptions
729e8b9 Merge remote-tracking branch 'elastic/master' into gcs-batch-delete
1d8d9c0 CR: throw IOException instead of StorageException
2838d3d Merge remote-tracking branch 'elastic/master' into gcs-batch-delete
a0e616c CR: collect list of failed blob names in ex message
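The commit titles above outline the final shape of the change: all deletes go out through a single client-side batch (sub-batching is left to the client), the low-level delete by blob id avoids stat-ing blobs first, and per-blob failures are collected and rethrown as an IOException. The stdlib-only sketch below illustrates that error-collection pattern; the `BatchDelete` interface and `deleteBlobs` method are hypothetical stand-ins modeled loosely on the google-cloud-storage batch callback style, not the client's real API or the code merged in this PR.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical stand-in for a client-side delete batch (loosely modeled on the
// google-cloud-storage StorageBatch callback style; not the real client API).
interface BatchDelete {
    void delete(String blobName, Consumer<Exception> onError);
    void submit(); // sends all queued deletes; sub-batching is internal to the client
}

class BulkDeleteSketch {
    // Queue every delete into one batch, record a failure per blob via callback,
    // and raise a single IOException naming only the blobs that actually failed.
    static void deleteBlobs(BatchDelete batch, Collection<String> blobNames) throws IOException {
        final List<String> failedBlobs = new ArrayList<>();
        final List<Exception> failures = new ArrayList<>();
        for (String blobName : blobNames) {
            batch.delete(blobName, e -> {
                failedBlobs.add(blobName);
                failures.add(e);
            });
        }
        batch.submit();
        if (failures.isEmpty() == false) {
            final IOException ioe = new IOException("Exception when deleting blobs " + failedBlobs);
            // keep each underlying failure so the *reason* for the failure is not lost
            failures.forEach(ioe::addSuppressed);
            throw ioe;
        }
    }
}
```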
Reviewer: Also add a message, just as for S3BlobContainer? i.e. `"Exception when deleting blobs [" + blobNames + "]"`
original-brownbear: I consciously didn't do that here, since it's a little different from the AWS case: we always submit the full batch of deletes, because the sub-batches are internal to the client (unlike S3, where we split up the batch ourselves). So I expect us to always get an exception for every failed blob, which makes listing them again a little redundant (plus we catch these exceptions upstream anyway and log the blobs we tried to delete). Maybe we should rather remove the listing from S3 as well? I didn't realize it at the time, but it seems completely redundant when we always log the list of blobs upstream anyway.
Reviewer: It may be nicer to collect the files that failed deletion here rather than at the call site. That allows filtering the list down to the file deletions that actually failed (i.e. instead of just blobNames, only those that experienced a failure). A similar thing can be done for S3.
original-brownbear: Makes sense. Maybe keep it all in the exception, though, and simply make sure that the failed blob names are all in the exception message? I don't like manually collecting the list of failed deletes without their exceptions; it doesn't really give us any information. We want to know why a delete failed; the fact that it failed we can already see by listing the stale blobs later on. Then we could simply remove the blobs list from all the upstream logging and be done. How about keeping this as it is here and adjusting the S3 implementation and upstream logging in a subsequent PR?
Reviewer: OK to adapt the call sites in a follow-up PR, but let's at least add the message with all the blob names in this PR.
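What the thread converges on — collect the per-blob exceptions, throw an IOException rather than a StorageException, and list only the blob names that actually failed in its message — can be sketched as follows. `rethrowFailures` and its input shape are hypothetical illustrations for this discussion, not the signature merged in the PR.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

class FailedDeleteMessage {
    // Build one IOException from per-blob delete failures. The message lists only
    // the blobs that failed (not every blob submitted), and each underlying
    // exception is attached as a suppressed cause so the reason is not lost.
    // `outcomes` maps blob name -> failure, with null meaning the delete succeeded;
    // this input shape is a hypothetical illustration.
    static void rethrowFailures(Map<String, Exception> outcomes) throws IOException {
        final List<String> failedBlobs = new ArrayList<>();
        final List<Exception> causes = new ArrayList<>();
        for (Map.Entry<String, Exception> entry : outcomes.entrySet()) {
            if (entry.getValue() != null) {
                failedBlobs.add(entry.getKey());
                causes.add(entry.getValue());
            }
        }
        if (failedBlobs.isEmpty() == false) {
            // same message style the reviewer pointed at for S3BlobContainer
            final IOException ioe = new IOException("Exception when deleting blobs [" + failedBlobs + "]");
            causes.forEach(ioe::addSuppressed);
            throw ioe;
        }
    }
}
```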