Add inner hits to nested and parent/child queries #8153
Merged
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,246 @@ | ||
[[search-request-inner-hits]] | ||
=== Inner hits | ||
|
||
The <<mapping-parent-field, parent/child>> and <<mapping-nested-type, nested>> features allow the return of documents that | ||
have matches in a different scope. In the parent/child case, parent documents are returned based on matches in child | ||
documents, or child documents are returned based on matches in parent documents. In the nested case, documents are returned | ||
based on matches in nested inner objects. | ||
|
||
In both cases, the actual matches in the different scopes that caused a document to be returned are hidden. In many cases, | ||
it is very useful to know which inner nested objects (in the case of nested) or which child or parent documents (in the case | ||
of parent/child) caused certain information to be returned. The inner hits feature can be used for this. For each search hit | ||
in the search response, this feature returns the additional nested hits that caused the search hit to match in a different scope. | ||
|
||
The following snippet explains the basic structure of inner hits: | ||
|
||
[source,js] | ||
-------------------------------------------------- | ||
"inner_hits" : { | ||
"<inner_hits_name>" : { | ||
"<path|type>" : { | ||
"<path-to-nested-object-field|child-or-parent-type>" : { | ||
<inner_hits_body> | ||
[,"inner_hits" : { [<sub_inner_hits>]+ } ]? | ||
} | ||
} | ||
} | ||
[,"<inner_hits_name_2>" : { ... } ]* | ||
} | ||
-------------------------------------------------- | ||
|
||
Inside the `inner_hits` definition, first the name of the inner hit is defined, then whether the inner hit | ||
is nested (by defining `path`) or parent/child based (by defining `type`). The next object layer contains | ||
the name of the nested object field if the inner hit is nested, or the parent or child type if the inner hit definition | ||
is parent/child based. | ||
|
||
Multiple inner hit definitions can be defined in a single request. In the `<inner_hits_body>`, any option or feature | ||
that `inner_hits` supports can be defined. Optionally, another `inner_hits` definition can be defined in the `<inner_hits_body>`. | ||
|
||
If `inner_hits` is defined, each search hit will contain an `inner_hits` JSON object with the following structure: | ||
|
||
[source,js] | ||
-------------------------------------------------- | ||
"hits": [ | ||
{ | ||
"_index": ..., | ||
"_type": ..., | ||
"_id": ..., | ||
"inner_hits": { | ||
"<inner_hits_name>": { | ||
"hits": { | ||
"total": ..., | ||
"hits": [ | ||
{ | ||
"_type": ..., | ||
"_id": ..., | ||
... | ||
}, | ||
... | ||
] | ||
} | ||
} | ||
}, | ||
... | ||
}, | ||
... | ||
] | ||
-------------------------------------------------- | ||
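As a rough client-side illustration of the structure above, the following Python sketch walks a parsed search response and yields every inner hit together with the top-level hit it belongs to. The function name `iter_inner_hits` and the sample response are hypothetical; the sample is hand-written to mirror the documented shape and is not output captured from a cluster.

```python
from typing import Any, Dict, Iterator, Tuple


def iter_inner_hits(response: Dict[str, Any]) -> Iterator[Tuple[str, str, Dict[str, Any]]]:
    """Yield (top_level_id, inner_hits_name, inner_hit) for every inner hit."""
    for hit in response.get("hits", {}).get("hits", []):
        # "inner_hits" is only present when the request defined inner hits.
        for name, group in hit.get("inner_hits", {}).items():
            for inner in group["hits"]["hits"]:
                yield hit["_id"], name, inner


# Hand-written sample that follows the documented response structure.
sample_response = {
    "hits": {
        "hits": [
            {
                "_index": "my-index",
                "_type": "question",
                "_id": "1",
                "inner_hits": {
                    "comment": {
                        "hits": {
                            "total": 2,
                            "hits": [
                                {"_type": "comment", "_id": "5"},
                                {"_type": "comment", "_id": "7"},
                            ],
                        }
                    }
                },
            }
        ]
    }
}

pairs = [
    (parent_id, name, inner["_id"])
    for parent_id, name, inner in iter_inner_hits(sample_response)
]
print(pairs)
```

Top-level hits without an `inner_hits` key are simply skipped, so the same loop works for requests that do not define inner hits.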
|
||
==== Options | ||
|
||
Inner hits support the following options: | ||
|
||
[horizontal] | ||
`path`:: Defines the nested scope where hits will be collected from. | ||
`type`:: Defines the parent or child type scope where hits will be collected from. | ||
`query`:: Defines the query that will run in the defined nested, parent or child scope to collect and score hits. By default all documents in the scope will be matched. | ||
`from`:: The offset of the first hit to fetch for each `inner_hits` in the returned regular search hits. | ||
`size`:: The maximum number of hits to return per `inner_hits`. By default the top three matching hits are returned. | ||
`sort`:: How the inner hits should be sorted per `inner_hits`. By default the hits are sorted by the score. | ||
|
||
Either `path` or `type` must be defined. The `path` or `type` defines the scope from where hits are fetched and | ||
used as inner hits. | ||
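The options above can be assembled into a definition programmatically. The sketch below is one possible helper, not part of any client library: the name `inner_hits_definition` and its parameters are assumptions made for illustration. It forces a choice between `path` and `type` through a single `nested` flag, which is one way to honor the rule that exactly one of them is defined.

```python
from typing import Any, Dict, List, Optional


def inner_hits_definition(
    name: str,
    scope: str,
    *,
    nested: bool,
    query: Optional[Dict[str, Any]] = None,
    from_: Optional[int] = None,
    size: Optional[int] = None,
    sort: Optional[List[Any]] = None,
) -> Dict[str, Any]:
    """Build one named inner hits definition (hypothetical helper).

    `scope` is the nested object field when `nested` is True, otherwise
    the parent or child type. Options left as None are omitted so the
    server-side defaults described above apply.
    """
    body: Dict[str, Any] = {}
    if query is not None:
        body["query"] = query
    if from_ is not None:
        body["from"] = from_
    if size is not None:
        body["size"] = size
    if sort is not None:
        body["sort"] = sort
    kind = "path" if nested else "type"
    return {name: {kind: {scope: body}}}


nested_def = inner_hits_definition("comment", "comments", nested=True, size=5)
child_def = inner_hits_definition("comment", "comment", nested=False, from_=3)
print(nested_def)
print(child_def)
```

Several definitions can be merged into one `inner_hits` object, for example with `{**nested_def, **other_def}`, as long as their names differ.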
|
||
Inner hits also support the following per-document features: | ||
|
||
* <<search-request-highlighting,Highlighting>> | ||
* <<search-request-explain,Explain>> | ||
* <<search-request-source-filtering,Source filtering>> | ||
* <<search-request-script-fields,Script fields>> | ||
* <<search-request-fielddata-fields,Fielddata fields>> | ||
* <<search-request-version,Include versions>> | ||
|
||
[[nested-inner-hits]] | ||
==== Nested inner hits | ||
|
||
The nested `inner_hits` can be used to include nested inner objects as inner hits to a search hit. | ||
|
||
The example below assumes that there is a nested object field defined with the name `comments`: | ||
|
||
[source,js] | ||
-------------------------------------------------- | ||
{ | ||
"query" : { | ||
"nested" : { | ||
"path" : "comments", | ||
"query" : { | ||
"match" : {"comments.message" : "[actual query]"} | ||
} | ||
} | ||
}, | ||
"inner_hits" : { | ||
"comment" : { | ||
"path" : { <1> | ||
"comments" : { <2> | ||
"query" : { | ||
"match" : {"comments.message" : "[actual query]"} | ||
} | ||
} | ||
} | ||
} | ||
} | ||
} | ||
-------------------------------------------------- | ||
|
||
<1> The inner hit definition is nested and requires the `path` option. | ||
<2> The path option refers to the nested object field `comments` | ||
|
||
In the above example, the query is repeated in both the `query` and the `comment` inner hit definition. At the moment there is | ||
no query referencing support, so to make sure that only the inner nested objects that contributed to the matching of | ||
the regular hits are returned, the inner query in the `nested` query also needs to be defined on the inner hits definition. | ||
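Because the query has to be stated twice, a client can build the request from a single query object to keep both copies in sync. The following Python sketch shows one way to do that; the helper name `nested_search_with_inner_hits` and the search text `"elasticsearch"` are assumptions standing in for the `[actual query]` placeholder.

```python
import json
from typing import Any, Dict


def nested_search_with_inner_hits(
    path: str, inner_query: Dict[str, Any], inner_hits_name: str
) -> Dict[str, Any]:
    """Build a search body that uses `inner_query` both in the nested
    query and in the inner hits definition (hypothetical helper)."""
    return {
        "query": {"nested": {"path": path, "query": inner_query}},
        "inner_hits": {
            inner_hits_name: {"path": {path: {"query": inner_query}}}
        },
    }


# The search text is a stand-in for the "[actual query]" placeholder.
request = nested_search_with_inner_hits(
    path="comments",
    inner_query={"match": {"comments.message": "elasticsearch"}},
    inner_hits_name="comment",
)
print(json.dumps(request, indent=2))
```

Serializing the result produces the same shape as the request above, with the `match` query appearing once under `query.nested` and once under the `comment` inner hit definition.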
|
||
An example of a response snippet that could be generated from the above search request: | ||
|
||
[source,js] | ||
-------------------------------------------------- | ||
... | ||
"hits": { | ||
... | ||
"hits": [ | ||
{ | ||
"_index": "my-index", | ||
"_type": "question", | ||
"_id": "1", | ||
"_source": ..., | ||
"inner_hits": { | ||
"comment": { | ||
"hits": { | ||
"total": ..., | ||
"hits": [ | ||
{ | ||
"_type": "question", | ||
"_id": "1", | ||
"_nested": { | ||
"field": "comments", | ||
"offset": 2 | ||
}, | ||
"_source": ... | ||
}, | ||
... | ||
] | ||
} | ||
} | ||
} | ||
}, | ||
... | ||
-------------------------------------------------- | ||
|
||
The `_nested` metadata is crucial in the above example, because it defines which inner nested object this inner hit | ||
came from. The `field` defines the object array field the nested hit is from, and the `offset` defines its position | ||
within that array in the `_source`. Due to sorting and scoring, the actual location of the hit objects in the `inner_hits` | ||
is usually different from the location at which a nested inner object was defined. | ||
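To illustrate how `field` and `offset` can be used on the client side, the Python sketch below maps an inner hit back to the object it came from in the top-level `_source`. It assumes that the top-level `_source` was returned, that `field` names a top-level array (not a multi-level path), and that `offset` is a zero-based index; the helper name `resolve_nested_object` and the sample documents are hypothetical.

```python
from typing import Any, Dict


def resolve_nested_object(
    top_level_hit: Dict[str, Any], inner_hit: Dict[str, Any]
) -> Dict[str, Any]:
    """Return the nested object in the parent's _source that the inner
    hit points at, using the `_nested` metadata (hypothetical helper)."""
    nested = inner_hit["_nested"]
    # Assumption: `field` is a top-level array and `offset` is zero-based.
    return top_level_hit["_source"][nested["field"]][nested["offset"]]


# Hand-written sample data, not output captured from a real cluster.
top_level_hit = {
    "_id": "1",
    "_source": {
        "title": "a question",
        "comments": [
            {"message": "first"},
            {"message": "second"},
            {"message": "third"},
        ],
    },
}
inner_hit = {
    "_type": "question",
    "_id": "1",
    "_nested": {"field": "comments", "offset": 2},
}

resolved = resolve_nested_object(top_level_hit, inner_hit)
print(resolved)
```

The position of `inner_hit` inside the `inner_hits` list plays no role here; only the `_nested` metadata identifies the original object.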
|
||
By default, the `_source` is also returned for the hit objects in `inner_hits`, but this can be changed. Via the | ||
`_source` filtering feature, either only part of the source can be returned or the source can be disabled. If stored fields | ||
are defined on the nested level, these can also be returned via the `fields` feature. | ||
|
||
An important default is that the `_source` returned in hits inside `inner_hits` is relative to the `_nested` metadata. | ||
So in the above example, only the comment part is returned per nested hit and not the entire source of the top-level | ||
document that contained the comment. | ||
|
||
[[parent-child-inner-hits]] | ||
==== Parent/child inner hits | ||
|
||
The parent/child `inner_hits` can be used to include parent or child documents as inner hits to a search hit. | ||
|
||
The example below assumes that there is a `_parent` field mapping in the `comment` type: | ||
|
||
[source,js] | ||
-------------------------------------------------- | ||
{ | ||
"query" : { | ||
"has_child" : { | ||
"type" : "comment", | ||
"query" : { | ||
"match" : {"message" : "[actual query]"} | ||
} | ||
} | ||
}, | ||
"inner_hits" : { | ||
"comment" : { | ||
"type" : { <1> | ||
"comment" : { <2> | ||
"query" : { | ||
"match" : {"message" : "[actual query]"} | ||
} | ||
} | ||
} | ||
} | ||
} | ||
} | ||
-------------------------------------------------- | ||
|
||
<1> This is a parent/child inner hit definition and requires the `type` option. | ||
<2> Refers to the document type `comment` | ||
|
||
An example of a response snippet that could be generated from the above search request: | ||
|
||
[source,js] | ||
-------------------------------------------------- | ||
... | ||
"hits": { | ||
... | ||
"hits": [ | ||
{ | ||
"_index": "my-index", | ||
"_type": "question", | ||
"_id": "1", | ||
"_source": ..., | ||
"inner_hits": { | ||
"comment": { | ||
"hits": { | ||
"total": ..., | ||
"hits": [ | ||
{ | ||
"_type": "comment", | ||
"_id": "5", | ||
"_source": ... | ||
}, | ||
... | ||
] | ||
} | ||
} | ||
} | ||
}, | ||
... | ||
-------------------------------------------------- |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -165,11 +165,13 @@ public Query parse(QueryParseContext parseContext) throws IOException, QueryPars | |
} | ||
} | ||
|
||
static ThreadLocal<LateBindingParentFilter> parentFilterContext = new ThreadLocal<>(); | ||
// TODO: Change this mechanism in favour of how parent nested object type is resolved in nested and reverse_nested agg | ||
// with this also proper validation can be performed on what is a valid nested child nested object type to be used | ||
public static ThreadLocal<LateBindingParentFilter> parentFilterContext = new ThreadLocal<>(); | ||
|
||
static class LateBindingParentFilter extends BitDocIdSetFilter { | ||
public static class LateBindingParentFilter extends BitDocIdSetFilter { | ||
|
||
BitDocIdSetFilter filter; | ||
public BitDocIdSetFilter filter; | ||
|
||
@Override | ||
public int hashCode() { | ||
|
@@ -178,7 +180,8 @@ public int hashCode() { | |
|
||
@Override | ||
public boolean equals(Object obj) { | ||
return filter.equals(obj); | ||
if (!(obj instanceof LateBindingParentFilter)) return false; | ||
return filter.equals(((LateBindingParentFilter) obj).filter); | ||
thanks for fixing this |
||
} | ||
|
||
@Override | ||
|
thanks for adding this TODO