Add inner hits to nested and parent/child queries #8153

Merged: 1 commit, Dec 2, 2014
2 changes: 2 additions & 0 deletions docs/reference/search/request-body.asciidoc
Expand Up @@ -127,3 +127,5 @@ include::request/index-boost.asciidoc[]
include::request/min-score.asciidoc[]

include::request/named-queries-and-filters.asciidoc[]

include::request/inner-hits.asciidoc[]
246 changes: 246 additions & 0 deletions docs/reference/search/request/inner-hits.asciidoc
@@ -0,0 +1,246 @@
[[search-request-inner-hits]]
=== Inner hits

The <<mapping-parent-field, parent/child>> and <<mapping-nested-type, nested>> features allow documents to be returned
that have matches in a different scope. In the parent/child case, parent documents are returned based on matches in child
documents, or child documents are returned based on matches in parent documents. In the nested case, documents are returned
based on matches in nested inner objects.

In both cases, the actual matches in the different scopes that caused a document to be returned are hidden. In many cases
it is very useful to know which nested inner objects (in the nested case) or which child or parent documents (in the
parent/child case) caused certain information to be returned. The inner hits feature can be used for this. For each search
hit in the search response, it returns the additional nested hits that caused that search hit to match in a different scope.

The following snippet explains the basic structure of inner hits:

[source,js]
--------------------------------------------------
"inner_hits" : {
    "<inner_hits_name>" : {
        "<path|type>" : {
            "<path-to-nested-object-field|child-or-parent-type>" : {
                <inner_hits_body>
                [,"inner_hits" : { [<sub_inner_hits>]+ } ]?
            }
        }
    }
    [,"<inner_hits_name_2>" : { ... } ]*
}
--------------------------------------------------

Inside the `inner_hits` definition, the name of the inner hit is defined first, followed by whether the inner hit is
nested (by defining `path`) or parent/child based (by defining `type`). The next object layer contains the name of the
nested object field if the inner hit is nested, or the parent or child type if the inner hit definition is
parent/child based.

Multiple inner hit definitions can be defined in a single request. The `<inner_hits_body>` accepts any option or feature
that `inner_hits` supports. Optionally, another `inner_hits` definition can be nested inside the `<inner_hits_body>`.
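
The layering described above can be sketched with plain Java maps. This is only an illustration of the JSON shape, not the Java API added by this change: the class `InnerHitsDefinition` and its `define` method are hypothetical names, and the `size` value in `main` is an arbitrary example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper that mirrors the JSON layering; not part of Elasticsearch. */
final class InnerHitsDefinition {

    /**
     * Wraps an inner hits body in the three documented layers:
     * name -> "path" or "type" -> nested field or parent/child type -> body.
     */
    static Map<String, Object> define(String name, String scopeKind, String scope,
                                      Map<String, Object> body) {
        if (!"path".equals(scopeKind) && !"type".equals(scopeKind)) {
            throw new IllegalArgumentException("scopeKind must be \"path\" or \"type\"");
        }
        Map<String, Object> scopeLayer = new LinkedHashMap<>();
        scopeLayer.put(scope, body);
        Map<String, Object> kindLayer = new LinkedHashMap<>();
        kindLayer.put(scopeKind, scopeLayer);
        Map<String, Object> named = new LinkedHashMap<>();
        named.put(name, kindLayer);
        return named;
    }

    public static void main(String[] args) {
        Map<String, Object> body = new LinkedHashMap<>();
        body.put("size", 5); // arbitrary example option

        // Two definitions in one request: one nested, one parent/child based.
        Map<String, Object> innerHits = new LinkedHashMap<>();
        innerHits.putAll(define("comment", "path", "comments", body));
        innerHits.putAll(define("answer", "type", "answer", new LinkedHashMap<String, Object>()));
        System.out.println(innerHits);
    }
}
```

A sub inner hits definition would simply be another map stored under the `inner_hits` key of a body.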

If `inner_hits` is defined, each search hit will contain an `inner_hits` JSON object with the following structure:

[source,js]
--------------------------------------------------
"hits": [
    {
        "_index": ...,
        "_type": ...,
        "_id": ...,
        "inner_hits": {
            "<inner_hits_name>": {
                "hits": {
                    "total": ...,
                    "hits": [
                        {
                            "_type": ...,
                            "_id": ...,
                            ...
                        },
                        ...
                    ]
                }
            }
        },
        ...
    },
    ...
]
--------------------------------------------------
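
A client that has parsed the response JSON into generic maps could read the inner hits of one search hit as in the sketch below. It assumes the response shape shown above and nothing more; `InnerHitsResponse` and `innerHits` are hypothetical names, and the typed `SearchHit#getInnerHits()` accessor added in this change is not used here.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical reader for a search hit that was parsed from JSON into maps and lists. */
final class InnerHitsResponse {

    /** Returns hit["inner_hits"][name]["hits"]["hits"], or an empty list if any layer is absent. */
    @SuppressWarnings("unchecked")
    static List<Map<String, Object>> innerHits(Map<String, Object> hit, String name) {
        Object innerHits = hit.get("inner_hits");
        if (!(innerHits instanceof Map)) return Collections.emptyList();
        Object named = ((Map<String, Object>) innerHits).get(name);
        if (!(named instanceof Map)) return Collections.emptyList();
        Object hitsObject = ((Map<String, Object>) named).get("hits");
        if (!(hitsObject instanceof Map)) return Collections.emptyList();
        Object hitList = ((Map<String, Object>) hitsObject).get("hits");
        if (!(hitList instanceof List)) return Collections.emptyList();
        return (List<Map<String, Object>>) hitList;
    }

    public static void main(String[] args) {
        // A hit without an "inner_hits" object yields an empty list rather than an exception.
        Map<String, Object> hit = new HashMap<>();
        hit.put("_id", "1");
        System.out.println(innerHits(hit, "comment").isEmpty());
    }
}
```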

==== Options

Inner hits support the following options:

[horizontal]
`path`:: Defines the nested scope where hits will be collected from.
`type`:: Defines the parent or child type scope where hits will be collected from.
`query`:: Defines the query that will run in the defined nested, parent or child scope to collect and score hits. By default all documents in the scope will be matched.
`from`:: The offset of the first hit to fetch for each `inner_hits` in the returned regular search hits.
`size`:: The maximum number of hits to return per `inner_hits`. By default the top three matching hits are returned.
`sort`:: How the inner hits should be sorted per `inner_hits`. By default the hits are sorted by the score.

Either `path` or `type` must be defined. The `path` or `type` defines the scope from where hits are fetched and
used as inner hits.
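
The combined effect of `sort`, `from` and `size` can be pictured with a small client-side simulation. This is a sketch of the documented semantics only (score-descending order by default, three hits by default), not of how the fetch phase implements them; `InnerHitsPaging` and `page` are hypothetical names and the scores are made-up values.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Hypothetical illustration of the from/size/sort defaults for one `inner_hits` definition. */
final class InnerHitsPaging {

    /** Documented default: the top three matching hits are returned. */
    static final int DEFAULT_SIZE = 3;

    /** Sorts scores descending (the default sort), then skips `from` and keeps at most `size`. */
    static List<Double> page(List<Double> scores, int from, int size) {
        List<Double> sorted = new ArrayList<>(scores);
        Collections.sort(sorted, Collections.reverseOrder());
        int start = Math.min(Math.max(from, 0), sorted.size());
        int count = Math.min(Math.max(size, 0), sorted.size() - start);
        return new ArrayList<>(sorted.subList(start, start + count));
    }

    public static void main(String[] args) {
        List<Double> scores = Arrays.asList(0.2, 0.9, 0.5, 0.7, 0.1); // made-up scores
        System.out.println(page(scores, 0, DEFAULT_SIZE));
        System.out.println(page(scores, 3, DEFAULT_SIZE));
    }
}
```

The paging applies per `inner_hits` definition and per regular search hit, so each returned hit gets its own window over its own inner matches.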

Inner hits also support the following per-document features:

* <<search-request-highlighting,Highlighting>>
* <<search-request-explain,Explain>>
* <<search-request-source-filtering,Source filtering>>
* <<search-request-script-fields,Script fields>>
* <<search-request-fielddata-fields,Fielddata fields>>
* <<search-request-version,Include versions>>

[[nested-inner-hits]]
==== Nested inner hits

The nested `inner_hits` can be used to include nested inner objects as inner hits to a search hit.

The example below assumes that there is a nested object field defined with the name `comments`:

[source,js]
--------------------------------------------------
{
    "query" : {
        "nested" : {
            "path" : "comments",
            "query" : {
                "match" : {"comments.message" : "[actual query]"}
            }
        }
    },
    "inner_hits" : {
        "comment" : {
            "path" : { <1>
                "comments" : { <2>
                    "query" : {
                        "match" : {"comments.message" : "[actual query]"}
                    }
                }
            }
        }
    }
}
--------------------------------------------------

<1> The inner hit definition is nested and requires the `path` option.
<2> The `path` option refers to the nested object field `comments`.

In the above example, the query is repeated in both the `query` and the `comment` inner hit definition. At the moment
there is no query referencing support, so to make sure that only the nested inner objects that contributed to the matching
of the regular hits are returned, the inner query of the `nested` query must also be defined in the inner hits definition.
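
Because the query has to be stated twice, a client may want to build both places from one object so they cannot drift apart. The sketch below does that with plain maps; `NestedInnerHitsRequest`, `obj` and `build` are hypothetical names, and the search text in `main` is a placeholder.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical request builder that reuses one inner query in both required places. */
final class NestedInnerHitsRequest {

    /** Small helper: a single-entry JSON object. */
    static Map<String, Object> obj(String key, Object value) {
        Map<String, Object> map = new LinkedHashMap<>();
        map.put(key, value);
        return map;
    }

    /** Builds the request body shown above, using `innerQuery` for both the nested query and the inner hits. */
    static Map<String, Object> build(String path, String innerHitsName, Map<String, Object> innerQuery) {
        Map<String, Object> nested = new LinkedHashMap<>();
        nested.put("path", path);
        nested.put("query", innerQuery);

        Map<String, Object> request = new LinkedHashMap<>();
        request.put("query", obj("nested", nested));
        request.put("inner_hits", obj(innerHitsName, obj("path", obj(path, obj("query", innerQuery)))));
        return request;
    }

    public static void main(String[] args) {
        // "some text" is a placeholder for the actual query string.
        Map<String, Object> match = obj("match", obj("comments.message", "some text"));
        System.out.println(build("comments", "comment", match));
    }
}
```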

An example of a response snippet that could be generated from the above search request:

[source,js]
--------------------------------------------------
...
"hits": {
    ...
    "hits": [
        {
            "_index": "my-index",
            "_type": "question",
            "_id": "1",
            "_source": ...,
            "inner_hits": {
                "comment": {
                    "hits": {
                        "total": ...,
                        "hits": [
                            {
                                "_type": "question",
                                "_id": "1",
                                "_nested": {
                                    "field": "comments",
                                    "offset": 2
                                },
                                "_source": ...
                            },
                            ...
                        ]
                    }
                }
            }
        },
        ...
--------------------------------------------------

The `_nested` metadata is crucial in the above example, because it defines which nested inner object this inner hit
came from. The `field` defines the object array field the nested hit is from, and the `offset` is its position within
that array in the `_source`. Due to sorting and scoring, the position of the hit objects in `inner_hits` usually
differs from the position at which the nested inner object was defined.
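
As an illustration of how `field` and `offset` point back into the parent document, the sketch below looks up the originating object in a parsed `_source`. It assumes a single-level nested field stored as a top-level array and a zero-based offset; `NestedHitResolver` and `resolve` are hypothetical names and the comment values are made up.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical lookup of the nested object that an inner hit's `_nested` metadata refers to. */
final class NestedHitResolver {

    /** Returns parentSource[field][offset], or null if the metadata does not resolve. */
    static Object resolve(Map<String, Object> parentSource, Map<String, Object> nestedMeta) {
        Object field = nestedMeta.get("field");
        Object offset = nestedMeta.get("offset");
        if (!(field instanceof String) || !(offset instanceof Number)) return null;
        Object array = parentSource.get((String) field);
        if (!(array instanceof List)) return null;
        List<?> objects = (List<?>) array;
        int index = ((Number) offset).intValue();
        return (index >= 0 && index < objects.size()) ? objects.get(index) : null;
    }

    public static void main(String[] args) {
        Map<String, Object> source = new HashMap<>();
        source.put("comments", Arrays.asList("first", "second", "third")); // made-up comments
        Map<String, Object> nested = new HashMap<>();
        nested.put("field", "comments");
        nested.put("offset", 2);
        System.out.println(resolve(source, nested));
    }
}
```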

By default the `_source` is also returned for the hit objects in `inner_hits`, but this can be changed. Via the
`_source` filtering feature, only part of the source can be returned, or the source can be disabled entirely. If stored
fields are defined on the nested level, these can also be returned via the `fields` feature.

An important default is that the `_source` returned in hits inside `inner_hits` is relative to the `_nested` metadata.
So in the above example only the comment part is returned per nested hit, and not the entire source of the top level
document that contained the comment.

[[parent-child-inner-hits]]
==== Parent/child inner hits

The parent/child `inner_hits` can be used to include parent or child documents as inner hits to a search hit.

The example below assumes that there is a `_parent` field mapping in the `comment` type:

[source,js]
--------------------------------------------------
{
    "query" : {
        "has_child" : {
            "type" : "comment",
            "query" : {
                "match" : {"message" : "[actual query]"}
            }
        }
    },
    "inner_hits" : {
        "comment" : {
            "type" : { <1>
                "comment" : { <2>
                    "query" : {
                        "match" : {"message" : "[actual query]"}
                    }
                }
            }
        }
    }
}
--------------------------------------------------

<1> This is a parent/child inner hit definition and requires the `type` option.
<2> Refers to the document type `comment`.

An example of a response snippet that could be generated from the above search request:

[source,js]
--------------------------------------------------
...
"hits": {
    ...
    "hits": [
        {
            "_index": "my-index",
            "_type": "question",
            "_id": "1",
            "_source": ...,
            "inner_hits": {
                "comment": {
                    "hits": {
                        "total": ...,
                        "hits": [
                            {
                                "_type": "comment",
                                "_id": "5",
                                "_source": ...
                            },
                            ...
                        ]
                    }
                }
            }
        },
        ...
--------------------------------------------------
Expand Up @@ -34,6 +34,7 @@
import org.elasticsearch.search.Scroll;
import org.elasticsearch.search.aggregations.AbstractAggregationBuilder;
import org.elasticsearch.search.builder.SearchSourceBuilder;
import org.elasticsearch.search.fetch.innerhits.InnerHitsBuilder;
import org.elasticsearch.search.highlight.HighlightBuilder;
import org.elasticsearch.search.rescore.RescoreBuilder;
import org.elasticsearch.search.sort.SortBuilder;
Expand Down Expand Up @@ -741,6 +742,11 @@ public SearchRequestBuilder setHighlighterExplicitFieldOrder(boolean explicitFie
return this;
}

public SearchRequestBuilder addInnerHit(String name, InnerHitsBuilder.InnerHit innerHit) {
innerHitsBuilder().addInnerHit(name, innerHit);
return this;
}

/**
* Delegates to {@link org.elasticsearch.search.suggest.SuggestBuilder#setText(String)}.
*/
Expand Down Expand Up @@ -1032,6 +1038,10 @@ private HighlightBuilder highlightBuilder() {
return sourceBuilder().highlighter();
}

private InnerHitsBuilder innerHitsBuilder() {
return sourceBuilder().innerHitsBuilder();
}

private SuggestBuilder suggestBuilder() {
return sourceBuilder().suggest();
}
Expand Down
Expand Up @@ -165,11 +165,13 @@ public Query parse(QueryParseContext parseContext) throws IOException, QueryPars
}
}

static ThreadLocal<LateBindingParentFilter> parentFilterContext = new ThreadLocal<>();
// TODO: Change this mechanism in favour of how parent nested object type is resolved in nested and reverse_nested agg
// with this also proper validation can be performed on what is a valid nested child nested object type to be used
Review comment (Contributor): thanks for adding this TODO

public static ThreadLocal<LateBindingParentFilter> parentFilterContext = new ThreadLocal<>();

static class LateBindingParentFilter extends BitDocIdSetFilter {
public static class LateBindingParentFilter extends BitDocIdSetFilter {

BitDocIdSetFilter filter;
public BitDocIdSetFilter filter;

@Override
public int hashCode() {
Expand All @@ -178,7 +180,8 @@ public int hashCode() {

@Override
public boolean equals(Object obj) {
return filter.equals(obj);
if (!(obj instanceof LateBindingParentFilter)) return false;
return filter.equals(((LateBindingParentFilter) obj).filter);
Review comment (Contributor): thanks for fixing this

}

@Override
Expand Down
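
The `equals` change in this hunk can be illustrated in isolation. In the earlier form, the wrapped filter was compared against the other object directly, so two wrappers around equal filters would generally not compare equal. The sketch below is a stand-alone stand-in, not the Elasticsearch class: `DelegatingFilter` is a hypothetical name, the `hashCode` body is an assumption (the real one is elided in the diff), and `Objects.equals` is used for null safety where the diff calls `filter.equals` directly.

```java
import java.util.Objects;

/** Hypothetical stand-in for a late-binding wrapper such as LateBindingParentFilter. */
final class DelegatingFilter {

    Object delegate; // set late, like `filter` in the diff

    DelegatingFilter(Object delegate) {
        this.delegate = delegate;
    }

    @Override
    public int hashCode() {
        // Assumption: hash by the delegate so that equal wrappers share a hash code.
        return Objects.hashCode(delegate);
    }

    @Override
    public boolean equals(Object obj) {
        // Only another wrapper can be equal; then compare the wrapped delegates.
        if (!(obj instanceof DelegatingFilter)) return false;
        return Objects.equals(delegate, ((DelegatingFilter) obj).delegate);
    }

    public static void main(String[] args) {
        DelegatingFilter a = new DelegatingFilter("parent-filter");
        DelegatingFilter b = new DelegatingFilter("parent-filter");
        System.out.println(a.equals(b));               // wrapper vs wrapper
        System.out.println(a.equals("parent-filter")); // wrapper vs bare delegate
    }
}
```

Comparing wrapper to wrapper keeps the relation symmetric, which matters when such objects are used as cache keys.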
11 changes: 11 additions & 0 deletions src/main/java/org/elasticsearch/percolator/PercolateContext.java
Expand Up @@ -55,6 +55,7 @@
import org.elasticsearch.search.fetch.FetchSearchResult;
import org.elasticsearch.search.fetch.FetchSubPhase;
import org.elasticsearch.search.fetch.fielddata.FieldDataFieldsContext;
import org.elasticsearch.search.fetch.innerhits.InnerHitsContext;
import org.elasticsearch.search.fetch.script.ScriptFieldsContext;
import org.elasticsearch.search.fetch.source.FetchSourceContext;
import org.elasticsearch.search.highlight.SearchContextHighlight;
Expand Down Expand Up @@ -681,4 +682,14 @@ public SearchContext useSlowScroll(boolean useSlowScroll) {
public Counter timeEstimateCounter() {
throw new UnsupportedOperationException();
}

@Override
public void innerHits(InnerHitsContext innerHitsContext) {
throw new UnsupportedOperationException();
}

@Override
public InnerHitsContext innerHits() {
throw new UnsupportedOperationException();
}
}
5 changes: 5 additions & 0 deletions src/main/java/org/elasticsearch/search/SearchHit.java
Expand Up @@ -199,6 +199,11 @@ public interface SearchHit extends Streamable, ToXContent, Iterable<SearchHitFie
*/
SearchShardTarget getShard();

/**
* @return Inner hits or <code>null</code> if there are none
*/
Map<String, SearchHits> getInnerHits();

/**
* Encapsulates the nested identity of a hit.
*/
Expand Down
2 changes: 2 additions & 0 deletions src/main/java/org/elasticsearch/search/SearchModule.java
Expand Up @@ -32,6 +32,7 @@
import org.elasticsearch.search.fetch.FetchPhase;
import org.elasticsearch.search.fetch.explain.ExplainFetchSubPhase;
import org.elasticsearch.search.fetch.fielddata.FieldDataFieldsFetchSubPhase;
import org.elasticsearch.search.fetch.innerhits.InnerHitsFetchSubPhase;
import org.elasticsearch.search.fetch.matchedqueries.MatchedQueriesFetchSubPhase;
import org.elasticsearch.search.fetch.script.ScriptFieldsFetchSubPhase;
import org.elasticsearch.search.fetch.source.FetchSourceSubPhase;
Expand Down Expand Up @@ -66,6 +67,7 @@ protected void configure() {
bind(VersionFetchSubPhase.class).asEagerSingleton();
bind(MatchedQueriesFetchSubPhase.class).asEagerSingleton();
bind(HighlightPhase.class).asEagerSingleton();
bind(InnerHitsFetchSubPhase.class).asEagerSingleton();

bind(SearchServiceTransportAction.class).asEagerSingleton();
bind(MoreLikeThisFetchService.class).asEagerSingleton();
Expand Down