Note that if your index contains more than 10000 documents and you need an exact count, you need to include "track_total_hits": true as shown below (note that depending on your index size, this can be costly).
ES: "Do not compute hit counts by default #33028" elastic/elasticsearch#33028
is a VOLUMINOUS thread, starting with:
Lucene 8 introduces optimizations that allow to compute top hits more efficiently by skipping documents that do not produce competitive scores. We would like to enable this behavior by default so that users can opt in if they need accurate total hit counts, which are costly, rather than the other way around.
Kibana: "ES will eventually disable hit counts - affects APM UI #25862" elastic/kibana#25862
Ugh. That's ridiculous (from a database user perspective). This is critical data we need to show users on every search result. Could we:
a. sum the attention-over-time data and use that to provide an estimated total? (Django front-end server could do this, or mediacloud-news-search library used by mc-providers)
b. turn on the expensive solution in staging and measure true impact on our data?
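For option (a), a rough sketch of what the summing could look like. This assumes the attention-over-time data arrives as Elasticsearch date_histogram-style buckets, each carrying a `doc_count`; the function name `total_from_buckets` is made up for illustration and is not part of mediacloud-news-search or mc-providers. Documents lacking the date field would not land in any bucket, so treat the result as an estimate until checked against real data.

```python
from typing import Any, Dict, Iterable


def total_from_buckets(buckets: Iterable[Dict[str, Any]]) -> int:
    """Sum per-bucket document counts to approximate the total hit count.

    Assumes each bucket is a dict with a "doc_count" key, as in an
    Elasticsearch date_histogram aggregation response. Buckets without
    that key are counted as zero.
    """
    return sum(int(bucket.get("doc_count", 0)) for bucket in buckets)


# Hypothetical buckets, shaped like a date_histogram response.
example_buckets = [
    {"key_as_string": "2023-01-01", "doc_count": 7000},
    {"key_as_string": "2023-01-02", "doc_count": 6500},
]
estimated_total = total_from_buckets(example_buckets)
```

Whether this is cheaper than turning on exact counting would still need the measurement suggested in (b).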
It seems "total hits" on a search is capped at 10K.
It seems like there was quite a cluster-truck around this topic, and there was at least one breaking change in the ES API regarding this.
https://opster.com/guides/elasticsearch/search-apis/elasticsearch-count-query/
says:
Current API documentation on track_total_hits (it's an int; no, it's a bool, NO! IT'S BOTH!!): https://www.elastic.co/guide/en/elasticsearch/reference/current/search-your-data.html#track-total-hits
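To make the int-or-bool business concrete, here is a minimal sketch of building a search body and reading the total back. The helper names `build_search_body` and `parse_total` are invented for illustration. The sketch relies on the response carrying `hits.total` either as a plain integer (older responses, or when `rest_total_hits_as_int` is requested) or as an object with `value` and `relation`; it does not handle the case where `track_total_hits` is false and the total is omitted.

```python
from typing import Any, Dict, Tuple, Union


def build_search_body(
    query: Dict[str, Any],
    track_total_hits: Union[bool, int] = True,
) -> Dict[str, Any]:
    """Build a search request body.

    track_total_hits may be True (exact count, potentially costly)
    or an integer (count accurately up to that threshold).
    """
    return {"query": query, "track_total_hits": track_total_hits}


def parse_total(response: Dict[str, Any]) -> Tuple[int, bool]:
    """Return (count, is_exact) from a search response dict.

    Handles both shapes of hits.total: a bare integer, or an object
    like {"value": 10000, "relation": "gte"} where "gte" marks the
    value as a lower bound rather than an exact count.
    """
    total = response["hits"]["total"]
    if isinstance(total, dict):
        return int(total["value"]), total.get("relation", "eq") == "eq"
    return int(total), True


# A capped response: the 10K value is only a lower bound.
capped = {"hits": {"total": {"value": 10000, "relation": "gte"}}}
count, is_exact = parse_total(capped)
```

With something like this, the front end could at least show "10,000+" when `is_exact` is false instead of presenting the cap as the real total.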
A number of related github issues:
ES: "Do not compute hit counts by default #33028" elastic/elasticsearch#33028
is a VOLUMINOUS thread, starting with:
Kibana: "ES will eventually disable hit counts - affects APM UI #25862" elastic/kibana#25862
And ES PRs:
"Add rest_total_hits_as_int in the search APIs #35848" (7.x branch?) elastic/elasticsearch#35848
"Make hits.total an object in the search response #35849" (6.x branch?) elastic/elasticsearch#35849
P.S.
The IA web_collection_search Dockerfile https://github.com/internetarchive/web_collection_search/blob/main/Dockerfile is wired to use >=7.0,<8.0 client code; I wonder if it's related?