Implement Bulk Deletes for GCS Repository #41368
Conversation
original-brownbear commented on Apr 19, 2019:
- Just like Add Bulk Delete Api to BlobStore (#40322) for AWS
- We already had a bulk delete API but weren't using it from the blob container implementation; now we are using it
- Made the bulk delete API compliant with our interface, which only suppresses errors about non-existent blobs, by stat-ing failed deletes (I didn't use any bulk stat action here since having to stat should be the exception anyway, and it would make the error handling a lot more complex)
- Fixed the bulk delete API to limit its batch size to 100, in line with GCS recommendations
Pinging @elastic/es-distributed
AFAICS, Google's SDK already implements the max batch size (see HttpStorageRpc.DefaultRpcBatch), so there's no need to implement it manually.
LGTM
@ywelsch, you're good with this one too? :)
AFAICS, what we would like to implement is the deleteBlobsIgnoringIfNotExists method. The List<Boolean> delete(Iterable<BlobId> blobIds) method in the Storage class of Google's SDK equates failure and missing file:

> @return an immutable list of booleans. If a blob has been deleted the corresponding item in the list is {@code true}. If a blob was not found, deletion failed or access to the resource was denied, the corresponding item is {@code false}.

We want the distinction between the two, and not to throw an exception if the file is just missing. This can be achieved by using the lower-level batch() method, building the batch yourself and then plugging in a different error-handling mapping (see the sketch below).
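A minimal sketch of that lower-level approach, assuming the StorageBatch/BatchResult API of the google-cloud-storage Java SDK (the same one exercised by the mocks later in this thread); the method and variable names are illustrative, not the PR's actual code:

```java
import java.net.HttpURLConnection;
import java.util.ArrayList;
import java.util.List;

import com.google.cloud.BatchResult;
import com.google.cloud.storage.BlobId;
import com.google.cloud.storage.Storage;
import com.google.cloud.storage.StorageBatch;
import com.google.cloud.storage.StorageException;

class BatchDeleteSketch {

    // Queue one delete per blob in a client-side batch. A 404 ("blob not found")
    // counts as success; every other per-blob error is collected instead of
    // aborting the whole batch, so callers can tell "missing" from "failed".
    static List<StorageException> deleteIgnoringIfNotExists(Storage storage, Iterable<BlobId> blobIds) {
        final List<StorageException> failures = new ArrayList<>();
        final StorageBatch batch = storage.batch();
        for (BlobId blobId : blobIds) {
            batch.delete(blobId).notify(new BatchResult.Callback<Boolean, StorageException>() {
                @Override
                public void success(Boolean deleted) {
                    // blob deleted, nothing to record
                }

                @Override
                public void error(StorageException exception) {
                    if (exception.getCode() != HttpURLConnection.HTTP_NOT_FOUND) {
                        failures.add(exception); // a real failure, not just a missing blob
                    }
                }
            });
        }
        batch.submit(); // executes all queued deletes and fires the callbacks
        return failures;
    }
}
```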
We should also have tests that check that this behavior is correctly implemented by any deleteBlobsIgnoringIfNotExists method.
On a related note, we should delete the "delete" method from the BlobStore interface and replace its single use by a BlobContainer.
Jenkins run elasticsearch-ci/1
> We should also have tests that check that this behavior is correctly implemented by any deleteBlobsIgnoringIfNotExists method.

Can you also address that one? In particular, I'm thinking about ESBlobStoreContainerTestCase, along the lines of the sketch below.
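A hypothetical shape for such a test in ESBlobStoreContainerTestCase: deleting a mix of present and absent blobs must not throw. The helpers newBlobStore() and writeBlob(...) are assumed from that test class's conventions, not quoted from this PR:

```java
public void testDeleteBlobsIgnoringIfNotExists() throws IOException {
    try (BlobStore store = newBlobStore()) {
        final BlobContainer container = store.blobContainer(new BlobPath());
        // write one blob so the list mixes an existing and a missing name
        byte[] data = randomByteArrayOfLength(randomIntBetween(10, 100));
        writeBlob(container, "foo", new BytesArray(data), randomBoolean());
        // "bar" was never written: the bulk delete must still succeed silently
        container.deleteBlobsIgnoringIfNotExists(Arrays.asList("foo", "bar"));
        assertFalse(container.blobExists("foo"));
    }
}
```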
```java
protected String buildKey(String blobName) {

@Override
public void deleteBlobsIgnoringIfNotExists(List<String> blobNames) throws IOException {
    blobStore.deleteBlobs(blobNames.stream().map(this::buildKey).collect(Collectors.toList()));
```
The method on BlobStore should now also be renamed to deleteBlobsIgnoringIfNotExists.
done
```java
@Override
public void error(StorageException exception) {
    if (exception.getCode() == 404) {
```
Let's use the constant here: HttpURLConnection.HTTP_NOT_FOUND.
sure :)
```java
if (exception.getCode() == 404) {
    results.add(Boolean.TRUE);
} else {
    results.add(Boolean.FALSE);
```
I would prefer to have this consistent with the S3 implementation, which currently collects the exceptions and adds them as suppressed.
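For reference, that pattern looks roughly like this (a sketch, not the actual S3 code; perBlobFailures stands in for whatever list of caught errors the implementation accumulates):

```java
// Fold all per-blob failures into one IOException, attaching each underlying
// StorageException as a suppressed exception so no failure detail is lost.
IOException ioe = null;
for (final StorageException failure : perBlobFailures) {
    if (ioe == null) {
        ioe = new IOException("Exception when deleting blobs");
    }
    ioe.addSuppressed(failure);
}
if (ioe != null) {
    throw ioe;
}
```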
Done
```diff
         failed = true;
     }
 }
 if (failed) {
-    throw new IOException("Failed to delete all [" + blobIdsToDelete.size() + "] blobs");
+    throw new IOException("Failed to delete the following blobs " + blobIdsToDelete);
```
Perhaps just "Failed to delete blobs " + blobIdsToDelete.
done
```java
    callback.success(true);
    return null;
}).when(result).notify(any(BatchResult.Callback.class));
when(batch.delete(any(BlobId.class), anyVararg())).thenAnswer(invocation -> {
```
Can we somehow check that no other method is called on this mock?
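With Mockito, which the snippet above already uses, one way to check that is a sketch like the following; batch and blobCount are placeholders:

```java
// Assert that only the stubbed interactions happened on the mocked batch:
// one delete per blob, one submit, and nothing else.
verify(batch, times(blobCount)).delete(any(BlobId.class), anyVararg());
verify(batch).submit();
verifyNoMoreInteractions(batch);
```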
Jup done :)
```java
    return ioe.get();
});
if (e != null) {
    throw e;
```
This now throws the StorageException, and not an IOException like S3's BlobContainer. Let's use one pattern (the one we had for S3), not two different ones.
Sure, fixed in 1d8d9c0 :)
Jenkins run elasticsearch-ci/packaging-sample
```java
    return ioe.get();
});
if (e != null) {
    throw new IOException(e);
```
Also add a message, just as for S3BlobContainer? I.e. "Exception when deleting blobs [" + blobNames + "]".
I consciously didn't do that here, since it's a little different from the AWS case: we always submit the full batch of deletes because the sub-batching is internal to the client (unlike S3, where we split the batch up ourselves).
So I expect us to always get an exception for every failed blob, making listing them again a little redundant (plus we're catching these exceptions upstream anyway and logging the blobs that we tried to delete).
Maybe rather remove the listing from S3 as well? (I didn't realize it at the time, but it seems completely redundant when we always log the list of blobs upstream anyway.)
It may be nicer to collect the files that failed deletion here rather than at the caller site. That allows filtering the list down to the deletions that actually failed (i.e. instead of just blobNames, we report only the blobs that experienced a failure). A similar thing can be done for S3.
Makes sense. Maybe keep it all in the exception, though, and simply make sure that the blob names are all in the exception?
I don't like manually collecting the list of failed deletes without exceptions, tbh; it doesn't really give us any information. We want to know why a delete failed. The fact that it failed we can already see by listing the stale blobs later on :P
Then we could simply remove the blobs list from all the upstream logging and be done.
So how about keeping things as they are here and adjusting the S3 implementation and upstream logging in a subsequent PR?
OK to adapt the call sites in a follow-up PR, but let's add the message with all the blob names at least in this PR.
Thanks @ywelsch, @andrershov!
Referenced in later commits:
* Implement Bulk Deletes for GCS Repository (#41368), backport of #41368
* Cleanup Bulk Delete Exception Logging (follow-up to #41368): collect all failed blob deletes and add them to the exception message; remove logging of the blob name list from caller exception logging