-
Notifications
You must be signed in to change notification settings - Fork 25k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Disallow defining the same field twice in a single mapping call. #37437
Comments
Pinging @elastic/es-search |
This issue has been open for more than a year. Does it still need fixing? |
@zacharymorn yes, I still think it'd be valuable to fix. Let me know if you're interested in looking into it. As a heads up, there are some details that still need settling like whether we need a deprecation period where we warn about the duplicate entries before we outright disallow them. |
@jtibshirani I've created a draft PR for this. For the time being I chose to emit deprecation warning when the issue in question is detected. Please let me know if this looks good to you. |
Thanks @zacharymorn for the contribution. I'll look over the PR. I've also marked this issue for discussion so we can double-check the deprecation strategy. |
I thought we were already checking this here: elasticsearch/server/src/main/java/org/elasticsearch/index/mapper/MapperMergeValidator.java Lines 61 to 62 in c94ebef
|
I looked into this a bit more -- the issue is that when building the root To me this feels like confusing enough behavior, and also uncommon enough that we could treat it as a bug instead of a behavior change. We've seen it come up before in a user's mappings (#51286), but it actively caused issues for that user. Another reason I'm hesitant to treat it as a normal behavior change is that in addition to a deprecation period, we'd want to consider API versioning, which seems too heavy-handed. Maybe a compromise would be to only make the fix in 8.0, and not backport it? For major upgrades, many users already give a close look at their mappings and are ready to make changes. |
I agree with your general thinking, but would like to better understand what we'd be breaking by changing this, e.g. I wonder whether this behavior is relied upon to make dynamic mappings work with a document that would look like this:
|
I did some testing and found that if we changed this behavior, we could keep these related behaviors the same:
A note that at a high level, we rely on this merging behavior for a mapping like the following to work:
The change should make sure to preserve the ability to define mappings like this. For this reason, I think the fix will be a bit more complex. |
Thanks for digging. I agree we should preserve the ability to define mappings like this. |
I'm trying to follow the discussion but feel I might have some context/knowledge gap. From my quick reading of dynamic mapping, it seems to be trigged by index document update. But I was previously under the impression that the fix to this issue is to detect the nested field definition duplication for explicit index creation or mapping updating. So they seems to be on different code path. On the other hand, I’m also not clear that if we disallow nested field definition for explicit mapping update, but keep that for dynamic mapping, wouldn’t that create inconsistencies in field definitions from different methods? Am I missing something? Is my current understanding totally on the wrong direction :D ? |
@zacharymorn sorry for the late response. Your overall understanding is correct, I'll just add some clarifications. This issue is indeed focused on duplicate field definitions in index creation/ mapping updates. As you say, dynamic mappings are triggered by document updates. Dynamic mappings do have their own code path, but they share quite a bit of logic with normal mapping updates. So @jpountz's question was whether changing this behavior for normal mapping updates could also affect dynamic updates.
This is a good question, and one I had after reading the dynamic update example. I think it could also use discussion internally (in addition to the deprecation question). I'll give some preliminary comments on your PR as we work to sort out these open questions. |
We met to discuss and have decided the following:
Some implementation notes:
Lastly for the team's context, here are the two issues I've seen where users ran into this confusing behavior: #51286, #37141. |
@jtibshirani is this issue still open? |
@Razi007 there is already an open PR from @zacharymorn (#54481), I just totally dropped the ball on reviewing it. Since there's already pre-existing work, it'd be best to check out some other issues with the |
Pinging @elastic/es-search (Team:Search) |
Pinging @elastic/es-search-foundations (Team:Search Foundations) |
In #37141 we discovered that it is possible to define the same field twice in a single request to add mappings, so long as the field mapping definitions are compatible and can be merged together. For example, the following mapping is valid, and the first definition happens to 'win out':
To me, a request like this is not likely to be intentional, and could indicate a bug in the calling code. For this reason it may make sense to detect this case and disallow it.
As a note, the above mapping update would be rejected if the duplicate field were specified at the same level. For example, the following mapping fails with a JSON parsing exception:
The text was updated successfully, but these errors were encountered: