-
Notifications
You must be signed in to change notification settings - Fork 25k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ILM: add support for rolling over data streams #57295
Conversation
As the datastream information is stored in the `ClusterState.Metadata` we exposed the `Metadata` to the `AsyncWaitStep#evaluateCondition` method in order for the steps to be able to identify when a managed index is part of a DataStream. If a managed index is part of a DataStream the rollover target is the DataStream name and the highest generation index is the write index (ie. the rolled index).
Pinging @elastic/es-core-features (:Core/Features/ILM+SLM) |
DataStream dataStream = indexAbstraction.getParentDataStream().getDataStream(); | ||
rolledIndexName = DataStream.getBackingIndexName(dataStream.getName(), dataStream.getGeneration()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor: this correctly identifies the data stream's write index under the current implementation of data streams. The IndexAbstraction::getWriteIndex
method will always return the correct write index in the event that we change the logic around backing index names, generations, etc., and would slightly simplify the code above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the suggestion, Dan. Pushed a fix to change this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I left some comments on this, thanks Andrei!
@@ -29,7 +30,7 @@ protected Client getClient() { | |||
return client; | |||
} | |||
|
|||
public abstract void evaluateCondition(IndexMetadata indexMetadata, Listener listener, TimeValue masterTimeout); | |||
public abstract void evaluateCondition(Metadata metadata, IndexMetadata indexMetadata, Listener listener, TimeValue masterTimeout); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It does seem strange to pass both the Metadata
and the IndexMetadata
, should we instead pass Metadata
and Index
so it's easy to look up the index metadata?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, fair enough, it felt a bit odd to me too (it equally felt wasteful, in terms of CPU cycles, to re-do the lookup every time though, but given ILM is not so much about low latency I agree it makes sense to change it)
listener.onResponse(true); | ||
return; | ||
} | ||
IndexAbstraction indexAbstraction = currentClusterState.metadata().getIndicesLookup().get(indexMetadata.getIndex().getName()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Super minor, but can you add an
assert indexAbstraction != null : "expected the index " + indexName + " to exist in the lookup but it didn't";
after this line? I don't think it's going to happen, but we should check it regardless. Optionally, we could make it a real error also (throw an IllegalStateException
)
IndexAbstraction indexAbstraction = currentClusterState.metadata().getIndicesLookup().get(indexMetadata.getIndex().getName()); | ||
final String rolloverTarget; | ||
if (indexAbstraction.getParentDataStream() != null) { | ||
rolloverTarget = indexAbstraction.getParentDataStream().getDataStream().getName(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just to be super paranoid, I think we should handle the case where getDataStream()
returns null
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that can be changed to:
rolloverTarget = indexAbstraction.getParentDataStream().getName();
to eliminate the need for another null check.
boolean indexingComplete = LifecycleSettings.LIFECYCLE_INDEXING_COMPLETE_SETTING.get(indexMetadata.getSettings()); | ||
if (indexingComplete) { | ||
logger.trace(indexMetadata.getIndex() + " has lifecycle complete set, skipping " + RolloverStep.NAME); | ||
listener.onResponse(true); | ||
return; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we allow this indexing complete setting regardless of whether the parent data stream exists or not? (in otherwords, moving it before the if (indexAbstraction.getParentDataStream() != null) {
check)
} | ||
if (Strings.isNullOrEmpty(rolloverAlias)) { | ||
listener.onFailure(new IllegalArgumentException(String.format(Locale.ROOT, | ||
"setting [%s] for index [%s] is empty or not defined", RolloverAction.LIFECYCLE_ROLLOVER_ALIAS, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is a chance we can improve this error message (if you agree), maybe something like:
setting [index.lifecycle.rollover_alias] for index [foo-1] is not defined, it must be set to the name of the alias pointing to the group of indices being rolled over
I'm not stuck on the wording, maybe you have a better idea?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great point, I think the wording sounds good
IndexAbstraction indexAbstraction = currentState.metadata().getIndicesLookup().get(index.getName()); | ||
final String rolloverTarget; | ||
if (indexAbstraction.getParentDataStream() != null) { | ||
rolloverTarget = indexAbstraction.getParentDataStream().getDataStream().getName(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same comment here about null
check for the getDataStream()
, I also wonder if maybe we should make this a nice static helper like IndexAbstraction.streamNameOrNull(indexAbstraction)
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The underlying dataStream
is non-nullable though. The getName
method that Dan recommended emphasis this so replaced it here as well. Happy to discuss if there are any situations where this could be null though (in which case I believe more null checks are needed in IndexAbstraction.DataStream
)
IndexMetadata aliasWriteIndex = aliasAbstraction.getWriteIndex(); | ||
if (aliasWriteIndex != null) { | ||
rolledIndexName = aliasWriteIndex.getIndex().getName(); | ||
waitForActiveShardsSettingValue = aliasWriteIndex.getSettings().get("index.write.wait_for_active_shards"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Super minor, but can we use IndexMetadata.SETTING_WAIT_FOR_ACTIVE_SHARDS
instead of hardcoding here?
return getErrorResultOnNullMetadata(index); | ||
} | ||
rolledIndexName = rolledIndexMeta.getIndex().getName(); | ||
waitForActiveShardsSettingValue = rolledIndexMeta.getSettings().get("index.write.wait_for_active_shards"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here about using IndexMetadata.SETTING_WAIT_FOR_ACTIVE_SHARDS
@@ -114,6 +124,16 @@ public Result isConditionMet(Index index, ClusterState clusterState) { | |||
return new Result(enoughShardsActive, new ActiveShardsInfo(currentActiveShards, activeShardCount.toString(), enoughShardsActive)); | |||
} | |||
|
|||
private Result getErrorResultOnNullMetadata(Index originalIndex) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This can be static
I think
assert indexAbstraction != null : "invalid cluster metadata. index [" + indexMetadata.getIndex().getName() + "] was not found"; | ||
final String rolloverTarget; | ||
if (indexAbstraction.getParentDataStream() != null) { | ||
rolloverTarget = indexAbstraction.getParentDataStream().getDataStream().getName(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here about null check or a helper for retrieving the name :)
Dan, Lee, thanks for the review. I believe I've addressed all your suggestions |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Thanks, @andreidan!
Just wanted to note that the docs also need to be updated: https://www.elastic.co/guide/en/elasticsearch/reference/current/getting-started-index-lifecycle-management.html. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks Andrei!
@cjcenizal thanks for the heads up, we'll update the docs in an upcoming PR |
As the datastream information is stored in the `ClusterState.Metadata` we exposed the `Metadata` to the `AsyncWaitStep#evaluateCondition` method in order for the steps to be able to identify when a managed index is part of a DataStream. If a managed index is part of a DataStream the rollover target is the DataStream name and the highest generation index is the write index (ie. the rolled index). (cherry picked from commit 6b410df) Signed-off-by: Andrei Dan <[email protected]>
As the datastream information is stored in the `ClusterState.Metadata` we exposed the `Metadata` to the `AsyncWaitStep#evaluateCondition` method in order for the steps to be able to identify when a managed index is part of a DataStream. If a managed index is part of a DataStream the rollover target is the DataStream name and the highest generation index is the write index (ie. the rolled index). (cherry picked from commit 6b410df) Signed-off-by: Andrei Dan <[email protected]>
As the datastream information is stored in the
ClusterState.Metadata
we exposedthe
Metadata
to theAsyncWaitStep#evaluateCondition
method in order forthe steps to be able to identify when a managed index is part of a DataStream.
If a managed index is part of a DataStream the rollover target is the DataStream
name and the highest generation index is the write index (ie. the rolled index).
Relates to #53488
Relates to #53100