-
Notifications
You must be signed in to change notification settings - Fork 25k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merge mappings for composable index templates #58521
Merge mappings for composable index templates #58521
Conversation
c30d286
to
6f5e142
Compare
The differences from standard mapping merge: * For metadata fields like _source, the whole field definition is replaced instead of merged. * Dynamic templates are appended to the end of the existing list. Any duplicate definitions overwrite existing ones. * The _meta object is deep-merged, so any existing keys are replaced.
6f5e142
to
5a37903
Compare
Pinging @elastic/es-core-features (:Core/Features/Indices APIs) |
Pinging @elastic/es-search (:Search/Mapping) |
server/src/main/java/org/elasticsearch/index/mapper/RootObjectMapper.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for working on this Julie! I left a couple of comments with (small!) substance and then a bunch of minor nits. I am not an expert in our mapping code so it would be good to have a second opinion also
server/src/main/java/org/elasticsearch/cluster/metadata/DataStream.java
Outdated
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/cluster/metadata/DataStream.java
Outdated
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/cluster/metadata/MetadataIndexTemplateService.java
Outdated
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/index/mapper/MapperService.java
Outdated
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/index/mapper/ObjectMapper.java
Outdated
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/index/mapper/ObjectMapper.java
Outdated
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/index/mapper/ObjectMapper.java
Outdated
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/index/mapper/ObjectMapper.java
Outdated
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/index/mapper/ObjectMapper.java
Outdated
Show resolved
Hide resolved
x-pack/plugin/ml/src/main/java/org/elasticsearch/xpack/ml/job/JobManager.java
Outdated
Show resolved
Hide resolved
|
||
if (request.dataStreamName() != null) { | ||
DataStream dataStream = currentState.metadata().dataStreams().get(request.dataStreamName()); | ||
if (dataStream != null) { | ||
dataStream.getTimeStampField().insertTimestampFieldMapping(mappings); | ||
mappings.add(dataStream.getTimeStampField().getTimestampFieldMapping()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After chatting with the core-features team, I now realize that we should apply request.mappings()
after this datastream mapping. I'll fix this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually @dakrone is that right? If we applied request.mappings()
after the datastream timestamp mapping, it could result in an index with an inconsistent timestamp mapping. So perhaps we always need to apply the datastream timestamp mapping last.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the request.mappings()
part last, but we need to validate that the composite mappings are valid for a data stream, that way a user could override part of the mappings (say for instance a custom format for the date), but couldn't do anything bad (like change it to a keyword
for instance)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm going to keep it as-is for this PR, since it just tries to replicate the existing behavior. We can discuss in a follow-up if we want to change the order.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The changes looks great. I like how it is integrated into the mapping merge, it makes the logic very easy to follow. I left one question and it would be nice to have a single place where the full merging logic is described.
} | ||
} else if (isEnabled() != mergeWith.isEnabled()) { | ||
throw new MapperException("the [enabled] parameter can't be updated for the object mapping [" + name() + "]"); | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if we should nullify the underlying mappers if enabled
is false
? It's not related to this pr specifically but we don't need to keep children when the parent is disabled ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As you mentioned, I think this is general question. Even without this PR you can define an object mapper like this:
"object": {
"type": "object",
"enabled": false,
"properties": {
"field": {
"type": "keyword"
}
}
}
For this PR I plan to stick with the existing behavior, but we could have a separate discussion on whether we want to remove this flexibility.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed, I opened #58702 to discuss this specifically.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks Julie!
@elasticmachine elasticsearch-ci/1 |
These were added in elastic#58521 but were incorrect -- at this point the mapping is not necessarily wrapped in the '_doc' type. The assertion likely didn't trip before because it's common for mappings to have only a top-level 'properties' key.
These were added in #58521 but were incorrect -- at this point the mapping is not necessarily wrapped in the '_doc' type. The assertion likely didn't trip before because it's common for mappings to have only a top-level 'properties' key.
This commit adds an integration test that component templates used to form a composite template can still be updated after a cluster restart. In elastic#58643 an issue arose where mappings were causing problems because of the way we unwrap `_doc` in template mappings. This was also related to the mappings being merged manually rather than using the `MapperService` to do the merging. elastic#58643 was fixed in 7.9 and master with the elastic#58521 change, since mappings now are read and digested by the actual mapper service. This test passes for 7.x and master, and I intend to open a separate PR including this test for 7.8.1 along with a bug fix for elastic#58643. This test is to ensure we don't have any regression in the future.
This commit adds an integration test that component templates used to form a composite template can still be updated after a cluster restart. In #58643 an issue arose where mappings were causing problems because of the way we unwrap `_doc` in template mappings. This was also related to the mappings being merged manually rather than using the `MapperService` to do the merging. #58643 was fixed in 7.9 and master with the #58521 change, since mappings now are read and digested by the actual mapper service. This test passes for 7.x and master, and I intend to open a separate PR including this test for 7.8.1 along with a bug fix for #58643. This test is to ensure we don't have any regression in the future.
…ic#58883) This commit adds an integration test that component templates used to form a composite template can still be updated after a cluster restart. In elastic#58643 an issue arose where mappings were causing problems because of the way we unwrap `_doc` in template mappings. This was also related to the mappings being merged manually rather than using the `MapperService` to do the merging. elastic#58643 was fixed in 7.9 and master with the elastic#58521 change, since mappings now are read and digested by the actual mapper service. This test passes for 7.x and master, and I intend to open a separate PR including this test for 7.8.1 along with a bug fix for elastic#58643. This test is to ensure we don't have any regression in the future.
…58883) (#58914) This commit adds an integration test that component templates used to form a composite template can still be updated after a cluster restart. In #58643 an issue arose where mappings were causing problems because of the way we unwrap `_doc` in template mappings. This was also related to the mappings being merged manually rather than using the `MapperService` to do the merging. #58643 was fixed in 7.9 and master with the #58521 change, since mappings now are read and digested by the actual mapper service. This test passes for 7.x and master, and I intend to open a separate PR including this test for 7.8.1 along with a bug fix for #58643. This test is to ensure we don't have any regression in the future.
Due to the way that our `toXContent` of mappings works, we reduce the type to remove `_doc` from template mappings when serializing them to disk as part of cluster state. This causes problems however, as we always expected there to be a `_doc` field in the mappings. This was resolved in later versions of ES (7.9+) by elastic#58521 which uses a real mapper to handle the mappings. For 7.8.1, however, we need to ensure that we can still read the state after it has been persisted to disk. Resolves elastic#58643 Relates to elastic#58883 Relates to elastic#58521
Due to the way that our `toXContent` of mappings works, we reduce the type to remove `_doc` from template mappings when serializing them to disk as part of cluster state. This causes problems however, as we always expected there to be a `_doc` field in the mappings. This was resolved in later versions of ES (7.9+) by #58521 which uses a real mapper to handle the mappings. For 7.8.1, however, we need to ensure that we can still read the state after it has been persisted to disk. Resolves #58643 Relates to #58883 Relates to #58521
This PR implements recursive mapping merging for composable index templates.
When creating an index, we perform the following:
last.
Some principles:
object mapper can never be changed between
type: object
andtype: nested
. Afield mapper can never be changed to an object mapper, and vice versa.
object
mappings, as well as root options like
dynamic_templates
andmeta
. Once wereach 'leaf components' like field definitions, they always overwrite an
existing one instead of being merged.
The PR is pretty large, but each commit can be reviewed individually.
Relates to #53101.