-
Notifications
You must be signed in to change notification settings - Fork 25k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Avg aggregation over 0 documents/values yields Double.NaN if using transport client, but Double.POSITIVE_INFINITY if using REST client #43820
Comments
Pinging @elastic/es-core-features |
I would like to work on this issue. |
Our general consensus is to replicate what the transport client returns. But I dont know these classes well enough to know what the "correct" answer is here. I will notify some team members to see if they can answer. |
based on #24162, it would probably make sense for @cbuescher to comment on what the value should be, and whether he intended on the difference here. |
@hub-cap does it make sense to use TransportClient as a reference taking into account that
In master branch there is no such class as TransportClient already. |
The problem in replicating the exact NaN/+/-Inf values as in the transport client the we encountered when writing the new REST client classes was that on the client side we can only use what we can parse from the REST response. In case of the Avg agg this e.g. means we no longer know what |
@9deniz you are correct! This effort we did to get the HLRC up to speed was for 7, which is where the transport client code still lives. We tried to reach API parity w/ the transport client so users could migrate and not see weird things. That being said, we can change APIs in 8 to what we want, its just a breaking (potentially) change. But I still think that it is worth reiterating what @cbuescher mentioned which is to know what the difference is between NaN/+-inf values, to know if it matters if we return a different value here in the HLRC. |
Im closing this issue as something we wont fix. Simply knowing that the value is not real in any way should be sufficient, regardless of the NAN/POSITIVE_INFINTY status. |
Describe the feature: Running an
avg
aggregation when 0 documents/values match yields different values depending on connection type (TransportClient vs RestClient).Elasticsearch version (
bin/elasticsearch --version
): 7.2.0 (and 6.x as well)Plugins installed: []
JVM version (
java -version
): 11.0.3OS version (
uname -a
if on a Unix-like system): Linux sascha 4.15.0-54-generic #58-Ubuntu SMP Mon Jun 24 10:55:24 UTC 2019 x86_64 x86_64 x86_64 GNU/LinuxDescription of the problem including expected versus actual behavior: Calculation of
avg
metric over 0 documents yieldsNaN
if connected to Elasticsearch using transport client, butDouble.POSITIVE_INFINITY
if using REST client. IMHO the result of an operation should not depend on the connection type to the Elasticsearch cluster.This seems to be related to the fact that internally, Elasticsearch takes two different routes to create the result of an
avg
aggregation:When using the TransportClient, this is done in
InternalAvg
assum / count
(https://github.com/elastic/elasticsearch/blob/master/server/src/main/java/org/elasticsearch/search/aggregations/metrics/InternalAvg.java#L69), where in case ofcount == 0
Java returnsDouble.NaN
Whereas when using REST, this is done in
ParsedAvg
, explicitly settingDouble.POSITIVE_INFINITY
as the default (https://github.com/elastic/elasticsearch/blob/master/server/src/main/java/org/elasticsearch/search/aggregations/metrics/ParsedAvg.java#L55)Steps to reproduce: (see also attached elastic-avg.zip)
Run Elasticsearch via Docker as
run -it --rm -p 127.0.0.1:9200:9200 -p 127.0.0.1:9300:9300 -e "discovery.type=single-node" -e "cluster.name=elasticsearch" docker.elastic.co/elasticsearch/elasticsearch:7.2.0
Index a sample document:
0
:Provide logs (if relevant):
The text was updated successfully, but these errors were encountered: