Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rebuild version map when opening internal engine #43202

Merged
merged 11 commits into from
Jun 17, 2019

Conversation

dnhatn
Copy link
Member

@dnhatn dnhatn commented Jun 13, 2019

With this change, we will rebuild the live version map and local checkpoint using documents (including soft-deleted) from the safe commit when opening an internal engine. This allows us to safely prune away _id of all soft-deleted documents as the version map is always in-sync with Lucene index.

Relates #40741
Supersedes #42979

To minimize this PR, I will work on two follow-ups:

  1. Adjust testLookupSeqNoByIdInLucene
  2. Remove MapperService parameter from readHistoryOperations estimateNumberOfHistoryOperations since we have it in EngineConfig now.

@dnhatn dnhatn added >enhancement :Distributed Indexing/Engine Anything around managing Lucene and the Translog in an open shard. v8.0.0 v7.3.0 labels Jun 13, 2019
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-distributed

@@ -5856,4 +5888,37 @@ public void testPruneAwayDeletedButRetainedIds() throws Exception {
}
}
}

public void testRecoverFromLocalTranslog() throws Exception {
Copy link
Member Author

@dnhatn dnhatn Jun 13, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test fails on master and 7.x (1 of 5000) without this change.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cool can we try to make a test that fails more often?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test fails when we have out of order in the safe commit and a merge prunes _id. This test index/delete/flush/merge randomly then restarting an engine should yield the same documents as before. I can make it fail more happen but I prefer to leave as it is (hope we can catch other situations). Are you okay?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

are there other tests that are failing more often than 1 in 5000, to make sure that this change has enough coverage?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I can add that test in a follow-up.

Copy link
Contributor

@s1monw s1monw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice change I left some comments

this.lastRefreshedCheckpointListener = new LastRefreshedCheckpointListener(localCheckpointTracker.getCheckpoint());
this.internalSearcherManager.addListener(lastRefreshedCheckpointListener);
maxSeqNoOfUpdatesOrDeletes = new AtomicLong(SequenceNumbers.max(localCheckpointTracker.getMaxSeqNo(), translog.getMaxSeqNo()));
if (softDeleteEnabled && localCheckpointTracker.getCheckpoint() < localCheckpointTracker.getMaxSeqNo()) {
try (Searcher searcher = acquireSearcher("restore_version_map_and_checkpoint_tracker", SearcherScope.INTERNAL)) {
restoreVersionMapAndCheckpointTracker(Lucene.wrapAllDocsLive(searcher.getDirectoryReader()));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we actually call flush after that to make sure we commit and clear the version map again?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we actually skip this entire dance if we don't have any deletes in the index? just curious...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we actually skip this entire dance if we don't have any deletes in the index? just curious...

Yes, we can. If there's no delete, the version map will empty after this method. For the local checkpoint, I think we would like to have it in sync Lucene index. Yannick, WDYT?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we actually skip this entire dance if we don't have any deletes in the index? just curious...

I think we could even skip it if there are no soft-deletes in the range we're looking for (i.e. from local checkpoint of safe commit upwards). I would like to avoid special-casing this too much though, and think we should just always do the full thing here. This should be very fast, given that this is a subset of the range that we will have to replay from the translog, which will certainly take much longer.

@@ -5856,4 +5888,37 @@ public void testPruneAwayDeletedButRetainedIds() throws Exception {
}
}
}

public void testRecoverFromLocalTranslog() throws Exception {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cool can we try to make a test that fails more often?

Copy link
Contributor

@ywelsch ywelsch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great. I've left some comments. Did not look at the tests yet.

@dnhatn
Copy link
Member Author

dnhatn commented Jun 14, 2019

@s1monw @ywelsch Thanks for your input. This is ready for another round.

@dnhatn dnhatn requested review from s1monw and ywelsch June 14, 2019 03:32
}
}
// remove live entries in the version map
refresh("restore_version_map_and_checkpoint_tracker", SearcherScope.INTERNAL, true);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't this just calling maybePruneDeletes, which seems to only run if X amount of time has passed after the last pruning? Isn't this a problem in general if we want to mainly do pruning based on local checkpoint?

Copy link
Member Author

@dnhatn dnhatn Jun 14, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can manually call versionMap.pruneTombstones(Long.MAX_VALUE, getLocalCheckpoint()); after the refresh. However, I expect it won't prune anything since all delete tombstones should be above the local checkpoint.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The local checkpoint in the commit is a conservative approximation of the the true local checkpoint, but you're right, this should not lead to much pruning here. I'm generally concerned with the maybePruneDeletes method, which only seems to care about timestamps, and not local checkpoint advancement. Not something we need to solve in this PR, just a more general observation.

this.lastRefreshedCheckpointListener = new LastRefreshedCheckpointListener(localCheckpointTracker.getCheckpoint());
this.internalSearcherManager.addListener(lastRefreshedCheckpointListener);
maxSeqNoOfUpdatesOrDeletes = new AtomicLong(SequenceNumbers.max(localCheckpointTracker.getMaxSeqNo(), translog.getMaxSeqNo()));
if (softDeleteEnabled && localCheckpointTracker.getCheckpoint() < localCheckpointTracker.getMaxSeqNo()) {
try (Searcher searcher = acquireSearcher("restore_version_map_and_checkpoint_tracker", SearcherScope.INTERNAL)) {
restoreVersionMapAndCheckpointTracker(Lucene.wrapAllDocsLive(searcher.getDirectoryReader()));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we actually skip this entire dance if we don't have any deletes in the index? just curious...

I think we could even skip it if there are no soft-deletes in the range we're looking for (i.e. from local checkpoint of safe commit upwards). I would like to avoid special-casing this too much though, and think we should just always do the full thing here. This should be very fast, given that this is a subset of the range that we will have to replay from the translog, which will certainly take much longer.

@dnhatn
Copy link
Member Author

dnhatn commented Jun 14, 2019

@s1monw @ywelsch I pushed changes. Can you please take another look? Thank you!

@dnhatn dnhatn requested review from s1monw and ywelsch June 14, 2019 13:37
Copy link
Contributor

@s1monw s1monw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Contributor

@ywelsch ywelsch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

}
}
// remove live entries in the version map
refresh("restore_version_map_and_checkpoint_tracker", SearcherScope.INTERNAL, true);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The local checkpoint in the commit is a conservative approximation of the the true local checkpoint, but you're right, this should not lead to much pruning here. I'm generally concerned with the maybePruneDeletes method, which only seems to care about timestamps, and not local checkpoint advancement. Not something we need to solve in this PR, just a more general observation.

}

public String getId() {
return id;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should this throw an exception if visited == false?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

visited can be false for Noops which do not have _id.

@@ -5856,4 +5888,37 @@ public void testPruneAwayDeletedButRetainedIds() throws Exception {
}
}
}

public void testRecoverFromLocalTranslog() throws Exception {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

are there other tests that are failing more often than 1 in 5000, to make sure that this change has enough coverage?

Copy link
Contributor

@henningandersen henningandersen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

I left one optional comment.

}
}
// remove live entries in the version map
refresh("restore_version_map_and_checkpoint_tracker", SearcherScope.INTERNAL, true);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given that this runs in the constructor, I wonder if it was better to simply clear the live version maps rather than call refresh here? Calling refresh calls listeners and we thus call those while the engine is half constructed (especially when the actual engine is the FollowingEngine). I found no cases where this would be problematic currently, so will leave it up to you.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. I have prototyped a change for this but I think it requires a full review. I will work on a follow-up.

@dnhatn
Copy link
Member Author

dnhatn commented Jun 17, 2019

@s1monw @ywelsch @henningandersen Thanks a lot for your great feedback and proposing this idea.

@dnhatn dnhatn merged commit bbc29bb into elastic:master Jun 17, 2019
@dnhatn dnhatn deleted the rebuild-version-map branch June 17, 2019 21:50
dnhatn added a commit that referenced this pull request Jun 17, 2019
With this change, we will rebuild the live version map and local
checkpoint using documents (including soft-deleted) from the safe commit
when opening an internal engine. This allows us to safely prune away _id
of all soft-deleted documents as the version map is always in-sync with
the Lucene index.

Relates #40741
Supersedes #42979
dnhatn added a commit that referenced this pull request Jun 19, 2019
This PR reverts #35230.

Previously, we reply on soft-deletes to fill the mismatch between the
version map and the Lucene index. This is no longer needed after #43202
where we rebuild the version map when opening an engine. Moreover,
PrunePostingsMergePolicy can prune _id of soft-deleted documents out of
order; thus the lookup result including soft-deletes sometimes does not
return the latest version (although it's okay as we only use a valid
result in an engine).

With this change, we use only live documents in Lucene to resolve the
indexing strategy. This is perfectly safe since we keep all deleted
documents after the local checkpoint in the version map.

Closes #42979
dnhatn added a commit that referenced this pull request Jun 19, 2019
This PR reverts #35230.

Previously, we reply on soft-deletes to fill the mismatch between the
version map and the Lucene index. This is no longer needed after #43202
where we rebuild the version map when opening an engine. Moreover,
PrunePostingsMergePolicy can prune _id of soft-deleted documents out of
order; thus the lookup result including soft-deletes sometimes does not
return the latest version (although it's okay as we only use a valid
result in an engine).

With this change, we use only live documents in Lucene to resolve the
indexing strategy. This is perfectly safe since we keep all deleted
documents after the local checkpoint in the version map.

Closes #42979
marregui added a commit to crate/crate that referenced this pull request Nov 28, 2019
With this change, we will rebuild the live version map and local checkpoint using
documents (including soft-deleted) from the safe commit when opening an internal
engine. This allows us to safely prune away _id of all soft-deleted documents as
the version map is always in-sync with Lucene index.

Relates #40741 (elastic/elasticsearch#40741)
Supersedes #42979 (elastic/elasticsearch#42979)

Port of elastic/elasticsearch#43202
marregui added a commit to crate/crate that referenced this pull request Nov 29, 2019
With this change, we will rebuild the live version map and local checkpoint using
documents (including soft-deleted) from the safe commit when opening an internal
engine. This allows us to safely prune away _id of all soft-deleted documents as
the version map is always in-sync with Lucene index.

Relates #40741 (elastic/elasticsearch#40741)
Supersedes #42979 (elastic/elasticsearch#42979)

Port of elastic/elasticsearch#43202
marregui added a commit to crate/crate that referenced this pull request Dec 3, 2019
With this change, we will rebuild the live version map and local checkpoint using
documents (including soft-deleted) from the safe commit when opening an internal
engine. This allows us to safely prune away _id of all soft-deleted documents as
the version map is always in-sync with Lucene index.

Relates #40741 (elastic/elasticsearch#40741)
Supersedes #42979 (elastic/elasticsearch#42979)

Port of elastic/elasticsearch#43202
mergify bot pushed a commit to crate/crate that referenced this pull request Dec 4, 2019
With this change, we will rebuild the live version map and local checkpoint using
documents (including soft-deleted) from the safe commit when opening an internal
engine. This allows us to safely prune away _id of all soft-deleted documents as
the version map is always in-sync with Lucene index.

Relates #40741 (elastic/elasticsearch#40741)
Supersedes #42979 (elastic/elasticsearch#42979)

Port of elastic/elasticsearch#43202
mergify bot pushed a commit to crate/crate that referenced this pull request Dec 4, 2019
With this change, we will rebuild the live version map and local checkpoint using
documents (including soft-deleted) from the safe commit when opening an internal
engine. This allows us to safely prune away _id of all soft-deleted documents as
the version map is always in-sync with Lucene index.

Relates #40741 (elastic/elasticsearch#40741)
Supersedes #42979 (elastic/elasticsearch#42979)

Port of elastic/elasticsearch#43202

(cherry picked from commit bb23cb7)

# Conflicts:
#	es/es-server/src/main/java/org/elasticsearch/index/engine/InternalEngine.java
#	es/es-server/src/test/java/org/elasticsearch/index/engine/InternalEngineTests.java
marregui added a commit to crate/crate that referenced this pull request Dec 5, 2019
With this change, we will rebuild the live version map and local checkpoint using
documents (including soft-deleted) from the safe commit when opening an internal
engine. This allows us to safely prune away _id of all soft-deleted documents as
the version map is always in-sync with Lucene index.

Relates #40741 (elastic/elasticsearch#40741)
Supersedes #42979 (elastic/elasticsearch#42979)

Port of elastic/elasticsearch#43202

(cherry picked from commit bb23cb7)
mergify bot pushed a commit to crate/crate that referenced this pull request Dec 5, 2019
With this change, we will rebuild the live version map and local checkpoint using
documents (including soft-deleted) from the safe commit when opening an internal
engine. This allows us to safely prune away _id of all soft-deleted documents as
the version map is always in-sync with Lucene index.

Relates #40741 (elastic/elasticsearch#40741)
Supersedes #42979 (elastic/elasticsearch#42979)

Port of elastic/elasticsearch#43202

(cherry picked from commit bb23cb7)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Distributed Indexing/Engine Anything around managing Lucene and the Translog in an open shard. >enhancement v7.3.0 v8.0.0-alpha1
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants