-
Notifications
You must be signed in to change notification settings - Fork 25k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fail start of non-data node if node has data #37347
Fail start of non-data node if node has data #37347
Conversation
Check that nodes started with node.data=false cannot start if they have shard data to avoid (old) indexes being resurrected into the cluster in red status. Issue elastic#27073
Pinging @elastic/es-distributed |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for picking this up @henningandersen. I've left some comments.
server/src/main/java/org/elasticsearch/env/NodeEnvironment.java
Outdated
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/env/NodeEnvironment.java
Outdated
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/env/NodeEnvironment.java
Outdated
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/env/NodeEnvironment.java
Outdated
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/env/NodeEnvironment.java
Outdated
Show resolved
Hide resolved
server/src/test/java/org/elasticsearch/env/NodeEnvironmentIT.java
Outdated
Show resolved
Hide resolved
server/src/test/java/org/elasticsearch/env/NodeEnvironmentIT.java
Outdated
Show resolved
Hide resolved
server/src/test/java/org/elasticsearch/env/NodeEnvironmentIT.java
Outdated
Show resolved
Hide resolved
}); | ||
fail("Starting a node with node.data=false that has index/shard data must fail"); | ||
} catch (IllegalArgumentException e) { | ||
// EXPECTED |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
also assert something on the exception message?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I ended up only asserting that the message contains the index UUID, would be nice to assert the entire path if possible, but I will need help if you think that is worth pursuing?
throw new IllegalArgumentException("Node is started with " | ||
+ Node.NODE_DATA_SETTING.getKey() | ||
+ "=false, but found shard data: " | ||
+ badDataPaths); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this the absolute path that we're outputting here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This would be the relative path if path.node is relative. Do you prefer the absolute path?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed it to absolute path.
Improved message and exception type plus cleaner tests.
Thanks for your comments @ywelsch, I have addressed mostt of them in the latest commit and added comments to the rest. Will you please take another look? |
@elasticmachine run gradle build tests 1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've left a few more nits, but looks good o.w.
The reason I explicitly build settings here is to be able to reuse the path.node setting for the next newNodeEnvironment call. Is there a better way?
I missed that one. Thanks for clarifying. Can you perhaps add a comment where you create the second NodeEnvironment to make this more explicit? I don't see a better way btw.
|
||
final String node_1 = internalCluster().startNodes(1).get(0); | ||
logger.info("--> starting one node"); | ||
internalCluster().startNodes(1); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: just startNode()
final String indexUUID = resolveIndex(indexName).getUUID(); | ||
|
||
logger.info("--> indexing a simple document"); | ||
client().prepareIndex(indexName, "type1", "1").setSource("field1", "value1").setRefreshPolicy(IMMEDIATE).get(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no need to set .setRefreshPolicy(IMMEDIATE)
. The indexed changes are guaranteed to be made durable before the index request returns here. Visibility for searches does not matter as we don't search here.
return Settings.builder().put(Node.NODE_DATA_SETTING.getKey(), false).build(); | ||
} | ||
})); | ||
assertThat(ex.getMessage(), containsString(indexUUID)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
assert that message starts with "Node is started with " + Node.NODE_DATA_SETTING.getKey() + "=false, but has shard data"
nodeAndClient.startNode(); | ||
success = true; | ||
} finally { | ||
if (!success) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
success == false
() -> newNodeEnvironment(noDataSettings).close()); | ||
|
||
assertThat(ex.getMessage(), | ||
containsString(env.indexPaths(index)[0].resolve(shardDir).toAbsolutePath().toString())); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same as above, assert that message starts with "Node is started with " + Node.NODE_DATA_SETTING.getKey() + "=false, but has shard data"
Files.createDirectories(path.resolve(MetaDataStateFormat.STATE_DIR_NAME)); | ||
} | ||
|
||
env.close(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that you can also use try-with-resources
here, e.g.:
try (NodeEnvironment env = newNodeEnvironment(settings)) {
for (Path path : env.indexPaths(index)) {
Files.createDirectories(path.resolve(MetaDataStateFormat.STATE_DIR_NAME));
}
}
This helps in cleaning up resources even when tests fail, and avoids having secondary failures that might distract when analyzing a test failure.
Avoid impacting subsequent tests on failures and added assert for more of the message.
@elasticmachine run gradle build tests 2 |
@elasticmachine please run the Gradle build tests 1 |
@elasticmachine run elasticsearch-ci-1 |
@elasticmachine run gradle build tests 1 |
4 similar comments
@elasticmachine run gradle build tests 1 |
@elasticmachine run gradle build tests 1 |
@elasticmachine run gradle build tests 1 |
@elasticmachine run gradle build tests 1 |
@elasticmachine run gradle build tests 2 |
…data_when_node_data_false
* elastic/master: (43 commits) Remove remaining occurances of "include_type_name=true" in docs (elastic#37646) SQL: Return Intervals in SQL format for CLI (elastic#37602) Publish to masters first (elastic#37673) Un-assign persistent tasks as nodes exit the cluster (elastic#37656) Fail start of non-data node if node has data (elastic#37347) Use cancel instead of timeout for aborting publications (elastic#37670) Follow stats api should return a 404 when requesting stats for a non existing index (elastic#37220) Remove deprecated FieldNamesFieldMapper.Builder#index (elastic#37305) Document that date math is locale independent Bootstrap a Zen2 cluster once quorum is discovered (elastic#37463) Upgrade to lucene-8.0.0-snapshot-83f9835. (elastic#37668) Mute failing test Fix java time formatters that round up (elastic#37604) Removes awaits fix as the fix is in. (elastic#37676) Mute failing test Ensure that max seq # is equal to the global checkpoint when creating ReadOnlyEngines (elastic#37426) Mute failing discovery disruption tests Add note about how the body is referenced (elastic#33935) Define constants for REST requests endpoints in tests (elastic#37610) Make prepare engine step of recovery source non-blocking (elastic#37573) ...
When a node is repurposed to master/no-data or no-master/no-data, v7.x will not start (see elastic#37748 and elastic#37347). The `elasticsearch repurpose` tool can fix this by cleaning up the problematic data.
Check that nodes started with node.data=false cannot start if they have
shard data to avoid (old) indexes being resurrected into the cluster in red status.
Issue #27073
Only fixes node.data=false part of issue, not coordinating only.
I were not sure if this should go into docs too?