-
Notifications
You must be signed in to change notification settings - Fork 25k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Track total hits up to 10,000 by default #37466
Conversation
This commit changes the default for the `track_total_hits` option of the search request to `10,000`. This means that by default search requests will accurately track the total hit count up to `10,000` documents, requests that match more than this value will set the `"total.relation"` to `"gte"` (e.g. greater than or equals) and the `"total.value"` to `10,000` in the search response. Scroll queries are not impacted, they will continue to count the total hits accurately. The default is set back to `true` (accurate hit count) if `rest_total_hits_as_int` is set in the search request. I choose `10,000` as the default because that's also the number we use to limit pagination. This means that users will be able to know how far they can jump (up to 10,000) even if the total number of hits is not accurate. Closes elastic#33028
Pinging @elastic/es-search |
…re an accurate count internally
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The change looks good. We will probably need to talk with other teams before merging this change so that they set track_total_hits=true
and can deal with disabling exact counts later?
server/src/main/java/org/elasticsearch/common/io/stream/StreamInput.java
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/common/io/stream/StreamOutput.java
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/search/query/EarlyTerminatingCollector.java
Show resolved
Hide resolved
run the gradle build tests 1 |
@elasticmachine run gradle build tests 1 |
@elasticmachine run elasticsearch-ci/2 |
// no matter what the value of track_total_hits is | ||
return SearchContext.TRACK_TOTAL_HITS_ACCURATE; | ||
} | ||
return request.source() == null ? SearchContext.DEFAULT_TRACK_TOTAL_HITS_UP_TO : request.source().trackTotalHitsUpTo() == null ? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is the request.source() == null?
check necessary on line 704? Seems like it's redundant given the check on line 700
p.s: I'm a curious engineer looking to learn more about elasticsearch's internals. I'm reading the diffs in my spare time to learn how experts in the field are building real-world distributed systems. Thanks for taking the time to answer my questions!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The check is on request.scroll
which is different than request.source
so it is necessary.
This commit changes the default for the
track_total_hits
option of the search requestto
10,000
. This means that by default search requests will accurately track the total hit countup to
10,000
documents, requests that match more than this value will set the"total.relation"
to
"gte"
(e.g. greater than or equals) and the"total.value"
to10,000
in the search response.Scroll queries are not impacted, they will continue to count the total hits accurately.
The default is set back to
true
(accurate hit count) ifrest_total_hits_as_int
is set in the search request.I choose
10,000
as the default because that's also the number we use to limit pagination. This means thatusers will be able to know how far they can jump (up to 10,000) even if the total number of hits is not accurate.
Closes #33028