
Add soft-deletes upgrade tests #36286

Merged 3 commits into elastic:master on Dec 7, 2018
Conversation

dnhatn
Member

@dnhatn dnhatn commented Dec 5, 2018

This change adds rolling-upgrade and full-cluster-restart tests with soft-deletes enabled.

Note that these tests are currently broken because we do not allow updating DocValues of segments written by older codecs (see https://issues.apache.org/jira/browse/LUCENE-8461). I think we need to remove this restriction.
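For context, soft deletes are enabled per index via the `index.soft_deletes.enabled` setting at index-creation time. A minimal sketch of how the test index from the failure below might be created (the shard count is illustrative; `testsoftdeletes` is the index name that appears in the stack trace):

```
PUT /testsoftdeletes
{
  "settings": {
    "index.number_of_shards": 1,
    "index.soft_deletes.enabled": true
  }
}
```

With soft deletes enabled, deletes are recorded as DocValues updates on a soft-deletes field, which is why the upgrade tests exercise DocValues updates against segments written by the old version.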

@dnhatn dnhatn added >test Issues or PRs that are addressing/adding tests :Distributed Indexing/Recovery Anything around constructing a new shard, either from a local or a remote source. v7.0.0 labels Dec 5, 2018
@dnhatn dnhatn requested review from jpountz, s1monw and bleskes December 5, 2018 23:29
@elasticmachine
Collaborator

Pinging @elastic/es-distributed

@dnhatn
Member Author

dnhatn commented Dec 5, 2018

Below is the stack trace from the full-cluster-restart test.

|    [2018-12-06T00:26:32,420][WARN ][o.e.c.r.a.AllocationService] [node-1] failing shard [failed shard, shard [testsoftdeletes][0], node[RgRgfC1OQeCSThD4uugJwg], [P], recovery_source[existing store recovery; bootstrap_history_uuid=false], s[INITIALIZING], a[id=zZzX_JD4SmSW8qUgd2Beqg], unassigned_info[[reason=ALLOCATION_FAILED], at[2018-12-05T23:26:32.330Z], failed_attempts[4], delayed=false, details[failed shard on node [7XyQZMeFSYmCqmCUl62NBQ]: shard failure, reason [lucene commit failed], failure IllegalStateException[This codec should only be used for reading, not writing]], allocation_status[fetching_shard_data]], message [shard failure, reason [lucene commit failed]], failure [IllegalStateException[This codec should only be used for reading, not writing]], markAsStale [true]]
|    java.lang.IllegalStateException: This codec should only be used for reading, not writing
|       at org.apache.lucene.codecs.lucene70.Lucene70Codec$2.getDocValuesFormatForField(Lucene70Codec.java:69) ~[lucene-backward-codecs-8.0.0-snapshot-c78429a554.jar:8.0.0-snapshot-c78429a554 c78429a554d28611dacd90c388e6c34039b228d1 - romseygeek - 2018-12-04 10:17:44]
|       at org.apache.lucene.codecs.perfield.PerFieldDocValuesFormat$FieldsWriter.getInstance(PerFieldDocValuesFormat.java:168) ~[lucene-core-8.0.0-snapshot-c78429a554.jar:8.0.0-snapshot-c78429a554 c78429a554d28611dacd90c388e6c34039b228d1 - romseygeek - 2018-12-04 10:17:44]
|       at org.apache.lucene.codecs.perfield.PerFieldDocValuesFormat$FieldsWriter.addNumericField(PerFieldDocValuesFormat.java:109) ~[lucene-core-8.0.0-snapshot-c78429a554.jar:8.0.0-snapshot-c78429a554 c78429a554d28611dacd90c388e6c34039b228d1 - romseygeek - 2018-12-04 10:17:44]
|       at org.apache.lucene.index.ReadersAndUpdates.handleDVUpdates(ReadersAndUpdates.java:368) ~[lucene-core-8.0.0-snapshot-c78429a554.jar:8.0.0-snapshot-c78429a554 c78429a554d28611dacd90c388e6c34039b228d1 - romseygeek - 2018-12-04 10:17:44]
|       at org.apache.lucene.index.ReadersAndUpdates.writeFieldUpdates(ReadersAndUpdates.java:570) ~[lucene-core-8.0.0-snapshot-c78429a554.jar:8.0.0-snapshot-c78429a554 c78429a554d28611dacd90c388e6c34039b228d1 - romseygeek - 2018-12-04 10:17:44]
|       at org.apache.lucene.index.ReaderPool.commit(ReaderPool.java:325) ~[lucene-core-8.0.0-snapshot-c78429a554.jar:8.0.0-snapshot-c78429a554 c78429a554d28611dacd90c388e6c34039b228d1 - romseygeek - 2018-12-04 10:17:44]
|       at org.apache.lucene.index.IndexWriter.writeReaderPool(IndexWriter.java:3328) ~[lucene-core-8.0.0-snapshot-c78429a554.jar:8.0.0-snapshot-c78429a554 c78429a554d28611dacd90c388e6c34039b228d1 - romseygeek - 2018-12-04 10:17:44]
|       at org.apache.lucene.index.IndexWriter.prepareCommitInternal(IndexWriter.java:3239) ~[lucene-core-8.0.0-snapshot-c78429a554.jar:8.0.0-snapshot-c78429a554 c78429a554d28611dacd90c388e6c34039b228d1 - romseygeek - 2018-12-04 10:17:44]
|       at org.apache.lucene.index.IndexWriter.commitInternal(IndexWriter.java:3466) ~[lucene-core-8.0.0-snapshot-c78429a554.jar:8.0.0-snapshot-c78429a554 c78429a554d28611dacd90c388e6c34039b228d1 - romseygeek - 2018-12-04 10:17:44]
|       at org.apache.lucene.index.IndexWriter.commit(IndexWriter.java:3431) ~[lucene-core-8.0.0-snapshot-c78429a554.jar:8.0.0-snapshot-c78429a554 c78429a554d28611dacd90c388e6c34039b228d1 - romseygeek - 2018-12-04 10:17:44]
|       at org.elasticsearch.index.engine.InternalEngine.commitIndexWriter(InternalEngine.java:2324) ~[elasticsearch-7.0.0-SNAPSHOT.jar:7.0.0-SNAPSHOT]
|       at org.elasticsearch.index.engine.InternalEngine.recoverFromTranslogInternal(InternalEngine.java:451) ~[elasticsearch-7.0.0-SNAPSHOT.jar:7.0.0-SNAPSHOT]
|       at org.elasticsearch.index.engine.InternalEngine.recoverFromTranslog(InternalEngine.java:411) ~[elasticsearch-7.0.0-SNAPSHOT.jar:7.0.0-SNAPSHOT]
|       at org.elasticsearch.index.engine.InternalEngine.recoverFromTranslog(InternalEngine.java:109) ~[elasticsearch-7.0.0-SNAPSHOT.jar:7.0.0-SNAPSHOT]
|       at org.elasticsearch.index.shard.IndexShard.openEngineAndRecoverFromTranslog(IndexShard.java:1348) ~[elasticsearch-7.0.0-SNAPSHOT.jar:7.0.0-SNAPSHOT]
|       at org.elasticsearch.index.shard.StoreRecovery.internalRecoverFromStore(StoreRecovery.java:424) ~[elasticsearch-7.0.0-SNAPSHOT.jar:7.0.0-SNAPSHOT]
|       at org.elasticsearch.index.shard.StoreRecovery.lambda$recoverFromStore$0(StoreRecovery.java:95) ~[elasticsearch-7.0.0-SNAPSHOT.jar:7.0.0-SNAPSHOT]
|       at org.elasticsearch.index.shard.StoreRecovery.executeRecovery(StoreRecovery.java:302) ~[elasticsearch-7.0.0-SNAPSHOT.jar:7.0.0-SNAPSHOT]
|       at org.elasticsearch.index.shard.StoreRecovery.recoverFromStore(StoreRecovery.java:93) ~[elasticsearch-7.0.0-SNAPSHOT.jar:7.0.0-SNAPSHOT]
|       at org.elasticsearch.index.shard.IndexShard.recoverFromStore(IndexShard.java:1622) ~[elasticsearch-7.0.0-SNAPSHOT.jar:7.0.0-SNAPSHOT]
|       at org.elasticsearch.index.shard.IndexShard.lambda$startRecovery$6(IndexShard.java:2115) ~[elasticsearch-7.0.0-SNAPSHOT.jar:7.0.0-SNAPSHOT]
|       at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:660) ~[elasticsearch-7.0.0-SNAPSHOT.jar:7.0.0-SNAPSHOT]
|       at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
|       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
|       at java.lang.Thread.run(Thread.java:834) [?:?]

@s1monw
Contributor

s1monw commented Dec 6, 2018

Test LGTM - we need to fix this in Lucene; updatable DVs need this behavior too. @jpountz any objections?

@jpountz
Contributor

jpountz commented Dec 6, 2018

Agreed, this needs fixing. Simon and I just had a discussion; I was surprised that this fails given that we have tests verifying DV updates work on old indices, but it turns out we only test updates to pre-existing fields.

@s1monw
Contributor

s1monw commented Dec 6, 2018

@dnhatn I fixed the issue in lucene. Can you add a new snapshot build?
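Picking up a new Lucene snapshot in Elasticsearch is typically a matter of bumping the Lucene version in `buildSrc/version.properties` (a sketch; the old snapshot id below is the one visible in the stack trace above, and the new id is whatever the fixed snapshot build produces):

```
# buildSrc/version.properties (sketch)
# before:
lucene = 8.0.0-snapshot-c78429a554
# after: replace with the id of the new snapshot containing the Lucene fix
```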

Contributor

@s1monw s1monw left a comment


LGTM once the test passes

dnhatn added a commit that referenced this pull request Dec 7, 2018
Includes:

LUCENE-8594: DV update are broken for updates on new field
LUCENE-8590: Optimize DocValues update datastructures
LUCENE-8593: Specialize single value numeric DV updates

Relates #36286
@dnhatn
Member Author

dnhatn commented Dec 7, 2018

Thanks @s1monw and @jpountz.

@dnhatn dnhatn merged commit 968b0b1 into elastic:master Dec 7, 2018
@dnhatn dnhatn deleted the soft-deletes-upgrade-test branch December 7, 2018 08:01
dnhatn added a commit that referenced this pull request Dec 9, 2018
This change adds a rolling-upgrade and full-cluster-restart test with
jasontedor added a commit to liketic/elasticsearch that referenced this pull request Dec 9, 2018
* elastic/6.x: (37 commits)
  [HLRC] Added support for Follow Stats API (elastic#36253)
  Exposed engine must have all ops below gcp during rollback (elastic#36159)
  TEST: Always enable soft-deletes in ShardChangesTests
  Use delCount of SegmentInfos to calculate numDocs (elastic#36323)
  Add soft-deletes upgrade tests (elastic#36286)
  Remove LocalCheckpointTracker#resetCheckpoint (elastic#34667)
  Option to use endpoints starting with _security (elastic#36379)
  [CCR] Restructured QA modules (elastic#36404)
  RestClient: on retry timeout add root exception (elastic#25576)
  [HLRC] Add support for put privileges API (elastic#35679)
  HLRC: Add rollup search (elastic#36334)
  Explicitly recommend to forceMerge before freezing (elastic#36376)
  Rename internal repository actions to be internal (elastic#36377)
  Core: Remove parseDefaulting from DateFormatter (elastic#36386)
  [ML] Prevent stack overflow while copying ML jobs and datafeeds (elastic#36370)
  Docs: Fix Jackson reference (elastic#36366)
  [ILM] Fix issue where index may not yet be in 'hot' phase (elastic#35716)
  Undeprecate /_watcher endpoints (elastic#36269)
  Docs: Fix typo in bool query (elastic#36350)
  HLRC: Add delete template API (elastic#36320)
  ...
Labels
:Distributed Indexing/Recovery Anything around constructing a new shard, either from a local or a remote source. >test Issues or PRs that are addressing/adding tests v6.6.0 v7.0.0-beta1
5 participants