Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Async Snapshot Repository Deletes #40144

Merged

Conversation

original-brownbear
Copy link
Member

@original-brownbear original-brownbear commented Mar 18, 2019

Motivated by slow snapshot deletes reported in e.g. #39656 and the fact that these likely are a contributing factor to repositories accumulating stale files over time when deletes fail to finish in time and are interrupted before they can complete.

  • Makes snapshot deletion async and parallelizes some steps of the delete process that can be safely run concurrently via the snapshot thread poll
    • I did not take the biggest potential speedup step here and parallelize the shard file deletion because that's probably better handled by moving to bulk deletes where possible (and can still be parallelized via the snapshot pool where it isn't). Also, I wanted to keep the size of the PR manageable.
  • See Making snapshot deletes distributed across data nodes. #39656 (comment)
  • Also, as a side effect this gives the SnapshotResiliencyTests a little more coverage for master failover scenarios (since parallel access to a blob store repository during deletes is now possible since a delete isn't a single task anymore).
  • By adding a ThreadPool reference to the repository this also lays the groundwork to parallelizing shard snapshot uploads to improve the situation reported in Introducing a new snapshot segments threadpool to uploads segments of shards in parallel #39657

@original-brownbear original-brownbear added >enhancement :Distributed Coordination/Snapshot/Restore Anything directly related to the `_snapshot/*` APIs v8.0.0 labels Mar 18, 2019
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-distributed

Copy link
Contributor

@ywelsch ywelsch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In case where a snapshot deletion is running, it takes over the snapshot threadpool on the master, blocking "create snapshot" requests, which are also dispatched in TransportCreateSnapshotAction to the snapshot threadpool. I think we should have these requests come in on the generic threadpool and then go to the snapshot threadpool after actually checking on the cluster state update thread whether there is any ongoing deletes.

Also, while looking at SnapshotsService.deleteSnapshot I've noticed that we have callers of that method that just swallow the returned exception instead of calling the listener, grep e.g. for deleteSnapshot(snapshot.getRepository(), snapshot.getSnapshotId().getName(), listener, true);. Also worth fixing in a separate PR.

}
final AtomicInteger outstandingIndices = new AtomicInteger(indices.size());
for (String index : indices) {
threadPool.executor(ThreadPool.Names.SNAPSHOT).execute(() -> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ActionRunnable

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can further parallelize this to the shard level I think (can be a follow-up)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ywelsch yea I wanted to wait on #40250 for that, because we may want to push the specifics of that down to the blob container and handle the parallelization dependent on whether or not bulk deletes are supported.

"but failed to clean up its index folder due to the directory not being empty.", metadata.name(), indexId), dnee);
} catch (IOException ioe) {
} catch (Exception e) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

revert this change?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was a conscious choice, I'd actually rather log all exceptions here directly + it simplifies things since we don't need to use the ActionRunnable here if we catch everything. I adjusted the logic a little though and invoked the grouped listener in a finally to make things a little clearer (and avoid the potential odd block from a future assertion trippin somewhere).

@original-brownbear
Copy link
Member Author

@ywelsch all points addressed, left ~2 questions though.

Copy link
Contributor

@andrershov andrershov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems that we need to switch TransportRestoreSnapshotAction to generic thread pool as well and then, once cluster state is checked, switch it to "snapshot" thread pool. @ywelsh do you agree?

listener.onFailure(new RepositoryException(metadata.name(), "failed to delete snapshot [" + snapshotId + "]", ex));
return;
}
deleteSnapshotBlobs(snapshot, snapshotId, repositoryData, updatedRepositoryData, listener);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As far as I can see, there are three steps: deleteSnapshotBlobs, deleteIndices, deleteUnreferencedIndices. It would be nice to see this function calls in this method. Something like this:

deleteSnapshotBlobs(...,
                ActionListener.wrap(() -> deleteIndices(
                        ActionListener.wrap(() -> deleteUnreferencedIndices(..., listener))
                )));

I'm not sure I undestand your comment about "used in multiple places"

@original-brownbear
Copy link
Member Author

@andrershov comments addressed :)

@original-brownbear
Copy link
Member Author

Jenkins run elasticsearch-ci/bwc

Copy link
Contributor

@andrershov andrershov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@original-brownbear
Copy link
Member Author

@andrershov thanks!
@ywelsch are you good with this one too? :)

Copy link
Contributor

@ywelsch ywelsch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've left some smaller questions, looking good otherwise.

@@ -49,7 +49,7 @@ public TransportCreateSnapshotAction(TransportService transportService, ClusterS

@Override
protected String executor() {
return ThreadPool.Names.SNAPSHOT;
return ThreadPool.Names.GENERIC;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you add a comment as to why we use generic instead of snapshot here? (same for TransportRestoreSnapshotAction). Otherwise, someone in the future might just set this back to snapshot.

@@ -161,6 +166,10 @@

protected final RepositoryMetaData metadata;

protected final NamedXContentRegistry namedXContentRegistry;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this field is not used anywhere AFAICS

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

my bad ... already killed it in a previous PR but it snuck back in here from a merge mistake :)

blobContainer().deleteBlobsIgnoringIfNotExists(
Arrays.asList(snapshotFormat.blobName(snapshotId.getUUID()), globalMetaDataFormat.blobName(snapshotId.getUUID())));
} catch (IOException e) {
logger.warn(() -> new ParameterizedMessage("[{}] Unable to delete global metadata files", snapshotId), e);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

something we should not change in this PR, but I wonder if we should not be lenient here when it comes to IOException. What we're effectively doing here is creating the possibility to have a top-level entry that's referencing stuff that is deleted :(

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sort of, we have https://github.com/elastic/elasticsearch/pull/40144/files/36b2e3888b18745edf4bb4d6e4af4c5b3165f174#diff-83e9ae765eb2a80bbbedd251b686cc10R440 writing the latest global meta-data before this happens, so we don't have a chain of references from the root to these blobs anymore regardless. But I agree, this should be retried (but as I liked in my update, that's incoming shortly too :))


deleteIndexMetaDataBlobIgnoringErrors(snapshot, indexId);

deleteIndexMetaDataBlobIgnoringErrors(snapshotId, indexId);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

again, unrelated to the PR, but we could think in the future of moving this to a later phase, as deleting this here makes the rest unretriable.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yea, that's incoming in master...original-brownbear:delete-lock-via-cs shortly :)

snapshotId,
ActionListener.wrap(v -> {
try {
blobStore().blobContainer(basePath().add("indices")).deleteBlobsIgnoringIfNotExists(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we have tests that check that the indices folders are cleaned up for FSRepository? Might be good to add some if we don't, so that we don't inadvertently break this.

Copy link
Member Author

@original-brownbear original-brownbear Apr 4, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have a few that check that the path is gone, but I'll also open a PR that has more comprehensive tests for the state of the repository from https://github.com/elastic/elasticsearch/compare/master...original-brownbear:resilient-deletes-test?expand=1 (50%ish done) very shortly :)

.map(info -> info.indices().stream().map(repositoryData::resolveIndexId).collect(Collectors.toList()))
.orElse(Collections.emptyList()),
snapshotId,
ActionListener.wrap(v -> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this not equivalent to the simpler ActionListener.map(listener, v -> { ... })?

@original-brownbear
Copy link
Member Author

@ywelsch all points addressed now I think :)

Copy link
Contributor

@ywelsch ywelsch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@original-brownbear
Copy link
Member Author

Thanks @andrershov and @ywelsch !

@original-brownbear original-brownbear merged commit 8a07522 into elastic:master Apr 5, 2019
@original-brownbear original-brownbear deleted the async-repository-deletes branch April 5, 2019 04:56
jasontedor added a commit to jasontedor/elasticsearch that referenced this pull request Apr 5, 2019
* elastic/master: (36 commits)
  Remove unneded cluster config from test (elastic#40856)
  Make Fuzziness reject illegal values earlier (elastic#33511)
  Remove test-only customisation from TransReplAct (elastic#40863)
  Fix dense/sparse vector limit documentation (elastic#40852)
  Make -try xlint warning disabled by default. (elastic#40833)
  Async Snapshot Repository Deletes (elastic#40144)
  Revert "Replace usages RandomizedTestingTask with built-in Gradle Test (elastic#40564)"
  Init global checkpoint after copy commit in peer recovery (elastic#40823)
  Replace usages RandomizedTestingTask with built-in Gradle Test (elastic#40564)
  [DOCS] Removed redundant (not quite right) information about upgrades.
  Remove string usages of old transport settings (elastic#40818)
  Rollup ignores time_zone on date histogram (elastic#40844)
  HLRC: fix uri encode bug when url path starts with '/' (elastic#34436)
  Mutes GatewayIndexStateIT.testRecoverBrokenIndexMetadata
  Docs: Pin two IDs in the rest client (elastic#40785)
  Adds version 6.7.2
  [DOCS] Remind users to include @ symbol when applying license file (elastic#40688)
  HLRC: Convert xpack methods to client side objects (elastic#40705)
  Allow ILM to stop if indices have nonexistent policies (elastic#40820)
  Add an OpenID Connect authentication realm (elastic#40674)
  ...
original-brownbear added a commit to original-brownbear/elasticsearch that referenced this pull request Apr 26, 2019
Motivated by slow snapshot deletes reported in e.g. elastic#39656 and the fact that these likely are a contributing factor to repositories accumulating stale files over time when deletes fail to finish in time and are interrupted before they can complete.

* Makes snapshot deletion async and parallelizes some steps of the delete process that can be safely run concurrently via the snapshot thread poll
   * I did not take the biggest potential speedup step here and parallelize the shard file deletion because that's probably better handled by moving to bulk deletes where possible (and can still be parallelized via the snapshot pool where it isn't). Also, I wanted to keep the size of the PR manageable.
* See elastic#39656 (comment)
* Also, as a side effect this gives the `SnapshotResiliencyTests` a little more coverage for master failover scenarios (since parallel access to a blob store repository during deletes is now possible since a delete isn't a single task anymore).
* By adding a `ThreadPool` reference to the repository this also lays the groundwork to parallelizing shard snapshot uploads to improve the situation reported in elastic#39657
original-brownbear added a commit that referenced this pull request Apr 26, 2019
Motivated by slow snapshot deletes reported in e.g. #39656 and the fact that these likely are a contributing factor to repositories accumulating stale files over time when deletes fail to finish in time and are interrupted before they can complete.

* Makes snapshot deletion async and parallelizes some steps of the delete process that can be safely run concurrently via the snapshot thread poll
   * I did not take the biggest potential speedup step here and parallelize the shard file deletion because that's probably better handled by moving to bulk deletes where possible (and can still be parallelized via the snapshot pool where it isn't). Also, I wanted to keep the size of the PR manageable.
* See #39656 (comment)
* Also, as a side effect this gives the `SnapshotResiliencyTests` a little more coverage for master failover scenarios (since parallel access to a blob store repository during deletes is now possible since a delete isn't a single task anymore).
* By adding a `ThreadPool` reference to the repository this also lays the groundwork to parallelizing shard snapshot uploads to improve the situation reported in #39657
gurkankaymak pushed a commit to gurkankaymak/elasticsearch that referenced this pull request May 27, 2019
Motivated by slow snapshot deletes reported in e.g. elastic#39656 and the fact that these likely are a contributing factor to repositories accumulating stale files over time when deletes fail to finish in time and are interrupted before they can complete. 

* Makes snapshot deletion async and parallelizes some steps of the delete process that can be safely run concurrently via the snapshot thread poll
   * I did not take the biggest potential speedup step here and parallelize the shard file deletion because that's probably better handled by moving to bulk deletes where possible (and can still be parallelized via the snapshot pool where it isn't). Also, I wanted to keep the size of the PR manageable. 
* See elastic#39656 (comment)
* Also, as a side effect this gives the `SnapshotResiliencyTests` a little more coverage for master failover scenarios (since parallel access to a blob store repository during deletes is now possible since a delete isn't a single task anymore).
* By adding a `ThreadPool` reference to the repository this also lays the groundwork to parallelizing shard snapshot uploads to improve the situation reported in elastic#39657
mkleen added a commit to crate/crate that referenced this pull request May 28, 2020
mkleen added a commit to crate/crate that referenced this pull request May 28, 2020
mergify bot pushed a commit to crate/crate that referenced this pull request May 29, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants