-
Notifications
You must be signed in to change notification settings - Fork 25k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Rebuild version map when opening internal engine #43202
Conversation
Pinging @elastic/es-distributed |
@@ -5856,4 +5888,37 @@ public void testPruneAwayDeletedButRetainedIds() throws Exception { | |||
} | |||
} | |||
} | |||
|
|||
public void testRecoverFromLocalTranslog() throws Exception { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test fails on master and 7.x (1 of 5000) without this change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cool can we try to make a test that fails more often?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The test fails when we have out of order in the safe commit and a merge prunes _id. This test index/delete/flush/merge randomly then restarting an engine should yield the same documents as before. I can make it fail more happen but I prefer to leave as it is (hope we can catch other situations). Are you okay?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
are there other tests that are failing more often than 1 in 5000, to make sure that this change has enough coverage?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I can add that test in a follow-up.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nice change I left some comments
server/src/main/java/org/elasticsearch/index/engine/InternalEngine.java
Outdated
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/index/engine/InternalEngine.java
Outdated
Show resolved
Hide resolved
this.lastRefreshedCheckpointListener = new LastRefreshedCheckpointListener(localCheckpointTracker.getCheckpoint()); | ||
this.internalSearcherManager.addListener(lastRefreshedCheckpointListener); | ||
maxSeqNoOfUpdatesOrDeletes = new AtomicLong(SequenceNumbers.max(localCheckpointTracker.getMaxSeqNo(), translog.getMaxSeqNo())); | ||
if (softDeleteEnabled && localCheckpointTracker.getCheckpoint() < localCheckpointTracker.getMaxSeqNo()) { | ||
try (Searcher searcher = acquireSearcher("restore_version_map_and_checkpoint_tracker", SearcherScope.INTERNAL)) { | ||
restoreVersionMapAndCheckpointTracker(Lucene.wrapAllDocsLive(searcher.getDirectoryReader())); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we actually call flush after that to make sure we commit and clear the version map again?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we actually skip this entire dance if we don't have any deletes in the index? just curious...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we actually skip this entire dance if we don't have any deletes in the index? just curious...
Yes, we can. If there's no delete, the version map will empty after this method. For the local checkpoint, I think we would like to have it in sync Lucene index. Yannick, WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we actually skip this entire dance if we don't have any deletes in the index? just curious...
I think we could even skip it if there are no soft-deletes in the range we're looking for (i.e. from local checkpoint of safe commit upwards). I would like to avoid special-casing this too much though, and think we should just always do the full thing here. This should be very fast, given that this is a subset of the range that we will have to replay from the translog, which will certainly take much longer.
@@ -5856,4 +5888,37 @@ public void testPruneAwayDeletedButRetainedIds() throws Exception { | |||
} | |||
} | |||
} | |||
|
|||
public void testRecoverFromLocalTranslog() throws Exception { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cool can we try to make a test that fails more often?
server/src/main/java/org/elasticsearch/index/engine/EngineConfig.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great. I've left some comments. Did not look at the tests yet.
server/src/main/java/org/elasticsearch/index/engine/InternalEngine.java
Outdated
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/index/engine/InternalEngine.java
Outdated
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/index/engine/InternalEngine.java
Outdated
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/index/engine/InternalEngine.java
Outdated
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/index/engine/InternalEngine.java
Outdated
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/index/engine/InternalEngine.java
Outdated
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/index/engine/InternalEngine.java
Outdated
Show resolved
Hide resolved
} | ||
} | ||
// remove live entries in the version map | ||
refresh("restore_version_map_and_checkpoint_tracker", SearcherScope.INTERNAL, true); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't this just calling maybePruneDeletes
, which seems to only run if X amount of time has passed after the last pruning? Isn't this a problem in general if we want to mainly do pruning based on local checkpoint?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can manually call versionMap.pruneTombstones(Long.MAX_VALUE, getLocalCheckpoint());
after the refresh. However, I expect it won't prune anything since all delete tombstones should be above the local checkpoint.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The local checkpoint in the commit is a conservative approximation of the the true local checkpoint, but you're right, this should not lead to much pruning here. I'm generally concerned with the maybePruneDeletes
method, which only seems to care about timestamps, and not local checkpoint advancement. Not something we need to solve in this PR, just a more general observation.
this.lastRefreshedCheckpointListener = new LastRefreshedCheckpointListener(localCheckpointTracker.getCheckpoint()); | ||
this.internalSearcherManager.addListener(lastRefreshedCheckpointListener); | ||
maxSeqNoOfUpdatesOrDeletes = new AtomicLong(SequenceNumbers.max(localCheckpointTracker.getMaxSeqNo(), translog.getMaxSeqNo())); | ||
if (softDeleteEnabled && localCheckpointTracker.getCheckpoint() < localCheckpointTracker.getMaxSeqNo()) { | ||
try (Searcher searcher = acquireSearcher("restore_version_map_and_checkpoint_tracker", SearcherScope.INTERNAL)) { | ||
restoreVersionMapAndCheckpointTracker(Lucene.wrapAllDocsLive(searcher.getDirectoryReader())); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we actually skip this entire dance if we don't have any deletes in the index? just curious...
I think we could even skip it if there are no soft-deletes in the range we're looking for (i.e. from local checkpoint of safe commit upwards). I would like to avoid special-casing this too much though, and think we should just always do the full thing here. This should be very fast, given that this is a subset of the range that we will have to replay from the translog, which will certainly take much longer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
server/src/main/java/org/elasticsearch/index/engine/InternalEngine.java
Outdated
Show resolved
Hide resolved
} | ||
} | ||
// remove live entries in the version map | ||
refresh("restore_version_map_and_checkpoint_tracker", SearcherScope.INTERNAL, true); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The local checkpoint in the commit is a conservative approximation of the the true local checkpoint, but you're right, this should not lead to much pruning here. I'm generally concerned with the maybePruneDeletes
method, which only seems to care about timestamps, and not local checkpoint advancement. Not something we need to solve in this PR, just a more general observation.
server/src/main/java/org/elasticsearch/index/fieldvisitor/IdOnlyFieldVisitor.java
Outdated
Show resolved
Hide resolved
} | ||
|
||
public String getId() { | ||
return id; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should this throw an exception if visited == false
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
visited
can be false for Noops which do not have _id.
@@ -5856,4 +5888,37 @@ public void testPruneAwayDeletedButRetainedIds() throws Exception { | |||
} | |||
} | |||
} | |||
|
|||
public void testRecoverFromLocalTranslog() throws Exception { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
are there other tests that are failing more often than 1 in 5000, to make sure that this change has enough coverage?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
I left one optional comment.
} | ||
} | ||
// remove live entries in the version map | ||
refresh("restore_version_map_and_checkpoint_tracker", SearcherScope.INTERNAL, true); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given that this runs in the constructor, I wonder if it was better to simply clear the live version maps rather than call refresh
here? Calling refresh
calls listeners and we thus call those while the engine is half constructed (especially when the actual engine is the FollowingEngine
). I found no cases where this would be problematic currently, so will leave it up to you.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point. I have prototyped a change for this but I think it requires a full review. I will work on a follow-up.
@s1monw @ywelsch @henningandersen Thanks a lot for your great feedback and proposing this idea. |
With this change, we will rebuild the live version map and local checkpoint using documents (including soft-deleted) from the safe commit when opening an internal engine. This allows us to safely prune away _id of all soft-deleted documents as the version map is always in-sync with the Lucene index. Relates #40741 Supersedes #42979
This PR reverts #35230. Previously, we reply on soft-deletes to fill the mismatch between the version map and the Lucene index. This is no longer needed after #43202 where we rebuild the version map when opening an engine. Moreover, PrunePostingsMergePolicy can prune _id of soft-deleted documents out of order; thus the lookup result including soft-deletes sometimes does not return the latest version (although it's okay as we only use a valid result in an engine). With this change, we use only live documents in Lucene to resolve the indexing strategy. This is perfectly safe since we keep all deleted documents after the local checkpoint in the version map. Closes #42979
This PR reverts #35230. Previously, we reply on soft-deletes to fill the mismatch between the version map and the Lucene index. This is no longer needed after #43202 where we rebuild the version map when opening an engine. Moreover, PrunePostingsMergePolicy can prune _id of soft-deleted documents out of order; thus the lookup result including soft-deletes sometimes does not return the latest version (although it's okay as we only use a valid result in an engine). With this change, we use only live documents in Lucene to resolve the indexing strategy. This is perfectly safe since we keep all deleted documents after the local checkpoint in the version map. Closes #42979
With this change, we will rebuild the live version map and local checkpoint using documents (including soft-deleted) from the safe commit when opening an internal engine. This allows us to safely prune away _id of all soft-deleted documents as the version map is always in-sync with Lucene index. Relates #40741 (elastic/elasticsearch#40741) Supersedes #42979 (elastic/elasticsearch#42979) Port of elastic/elasticsearch#43202
With this change, we will rebuild the live version map and local checkpoint using documents (including soft-deleted) from the safe commit when opening an internal engine. This allows us to safely prune away _id of all soft-deleted documents as the version map is always in-sync with Lucene index. Relates #40741 (elastic/elasticsearch#40741) Supersedes #42979 (elastic/elasticsearch#42979) Port of elastic/elasticsearch#43202
With this change, we will rebuild the live version map and local checkpoint using documents (including soft-deleted) from the safe commit when opening an internal engine. This allows us to safely prune away _id of all soft-deleted documents as the version map is always in-sync with Lucene index. Relates #40741 (elastic/elasticsearch#40741) Supersedes #42979 (elastic/elasticsearch#42979) Port of elastic/elasticsearch#43202
With this change, we will rebuild the live version map and local checkpoint using documents (including soft-deleted) from the safe commit when opening an internal engine. This allows us to safely prune away _id of all soft-deleted documents as the version map is always in-sync with Lucene index. Relates #40741 (elastic/elasticsearch#40741) Supersedes #42979 (elastic/elasticsearch#42979) Port of elastic/elasticsearch#43202
With this change, we will rebuild the live version map and local checkpoint using documents (including soft-deleted) from the safe commit when opening an internal engine. This allows us to safely prune away _id of all soft-deleted documents as the version map is always in-sync with Lucene index. Relates #40741 (elastic/elasticsearch#40741) Supersedes #42979 (elastic/elasticsearch#42979) Port of elastic/elasticsearch#43202 (cherry picked from commit bb23cb7) # Conflicts: # es/es-server/src/main/java/org/elasticsearch/index/engine/InternalEngine.java # es/es-server/src/test/java/org/elasticsearch/index/engine/InternalEngineTests.java
With this change, we will rebuild the live version map and local checkpoint using documents (including soft-deleted) from the safe commit when opening an internal engine. This allows us to safely prune away _id of all soft-deleted documents as the version map is always in-sync with Lucene index. Relates #40741 (elastic/elasticsearch#40741) Supersedes #42979 (elastic/elasticsearch#42979) Port of elastic/elasticsearch#43202 (cherry picked from commit bb23cb7)
With this change, we will rebuild the live version map and local checkpoint using documents (including soft-deleted) from the safe commit when opening an internal engine. This allows us to safely prune away _id of all soft-deleted documents as the version map is always in-sync with Lucene index. Relates #40741 (elastic/elasticsearch#40741) Supersedes #42979 (elastic/elasticsearch#42979) Port of elastic/elasticsearch#43202 (cherry picked from commit bb23cb7)
With this change, we will rebuild the live version map and local checkpoint using documents (including soft-deleted) from the safe commit when opening an internal engine. This allows us to safely prune away _id of all soft-deleted documents as the version map is always in-sync with Lucene index.
Relates #40741
Supersedes #42979
To minimize this PR, I will work on two follow-ups:
readHistoryOperations
estimateNumberOfHistoryOperations
since we have it in EngineConfig now.