Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SQL: SUM of multiple 0 values returns NULL #45251

Closed
astefan opened this issue Aug 6, 2019 · 21 comments · Fixed by #65796
Closed

SQL: SUM of multiple 0 values returns NULL #45251

astefan opened this issue Aug 6, 2019 · 21 comments · Fixed by #65796
Labels
:Analytics/Aggregations Aggregations :Analytics/SQL SQL querying >bug Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) Team:QL (Deprecated) Meta label for query languages team

Comments

@astefan
Copy link
Contributor

astefan commented Aug 6, 2019

Given:

  • SELECT SUM(val) FROM test WHERE val IS NOT NULL GROUP BY name
  • all val values are 0 for a certain name value

The result of the above query will return NULL, whereas the result should be 0. This boils down to AggregationInspectionHelper.hasValue(InternalSum) that will return true/false if the sum is 0: https://github.com/elastic/elasticsearch/blob/master/server/src/main/java/org/elasticsearch/search/aggregations/support/AggregationInspectionHelper.java#L160
We use these methods to decide if the result of aggregations has a value or not.

Relates to #36020. CC @polyfractal @costin

@astefan astefan added >bug :Analytics/SQL SQL querying labels Aug 6, 2019
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-search

@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-analytics-geo

@polyfractal
Copy link
Contributor

polyfractal commented Aug 6, 2019

I think this is only fixable if we start serializing counts (or at least an isEmpty flag) with all the internal aggregation responses. Sum is the worst offender, but other aggs rely on things like positive or negative infinity to determine if it's "empty", which isn't strictly true since the agg could be negative infinity due to the data.

@costin
Copy link
Member

costin commented Aug 6, 2019

indeed this looks like it. The issue was raised since folks in Canvas bumped into this one and it took a while for all involved to figure out what's happening since the WHERE clause should have filtered nulls (it did however the values were all 0) yet the agg/SUM is null (actually it's zero but we treat 0 as being null).
To make things worse, SUM is a common agg..

@polyfractal
Copy link
Contributor

As a stop-gap solution in the mean time, could SQL embed a value_count agg next to the sum (or really any aggregation that needs to know if it's null or not) and use that value as the indicator? E.g. if value_count == 0 there were definitely no values in that bucket.

Not saying we shouldn't improve the internal agg representations, just that it'd be a 7.x+ improvement and older versions would still be stuck. :(

@Skeeve
Copy link

Skeeve commented Dec 9, 2019

Is there anything one can do to work around this?

@Skeeve
Copy link

Skeeve commented Dec 9, 2019

Is there anything one can do to work around this?

Answering my own question. I think this should work:

AVG(field)*COUNT(0) == SUM(field)

A bit overhead, but for my use case it's not so much of a change.

I had

     (sum(content_user_filter)+sum(content_filter))/sum(total_volume)*100

Since COUNT(0)/COUNT(0) cancel itself out, I have now:

     (avg(content_user_filter)+avg(content_filter))/avg(total_volume)*100

@costin
Copy link
Member

costin commented Dec 9, 2019

Neither solution is ideal as it requires either two aggs or an approximation.
Until things get sorted out, a workaround is to require sum to be a stats agg and thus require its promotion and extraction from said agg.

@matriv
Copy link
Contributor

matriv commented Mar 30, 2020

@elastic/es-ql

@rjernst rjernst added Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) Team:QL (Deprecated) Meta label for query languages team labels May 4, 2020
@astefan
Copy link
Contributor Author

astefan commented Jun 4, 2020

This is how PostgreSQL is behaving regarding SUM, for the following data:

INSERT INTO public.test (val, gender) VALUES (0, 'M');
INSERT INTO public.test (val, gender) VALUES (0, 'F');
INSERT INTO public.test (val, gender) VALUES (0, 'M');
INSERT INTO public.test (val, gender) VALUES (0, 'M');
INSERT INTO public.test (val, gender) VALUES (0, 'M');
INSERT INTO public.test (val, gender) VALUES (0, 'F');
INSERT INTO public.test (val, gender) VALUES (10000, 'F');
INSERT INTO public.test (val, gender) VALUES (10000, 'M');
INSERT INTO public.test (val, gender) VALUES (10000, 'F');
INSERT INTO public.test (val, gender) VALUES (10000, NULL);

For the query SELECT SUM(val) FROM test WHERE val=123;, the result is:

postgres=# SELECT SUM(val) FROM test WHERE val=123;
  sum
--------
 (null)
(1 row)

Obviously, for the query that is explicitly looking at 0 values SELECT SUM(val) FROM test WHERE val=0;:

postgres=# SELECT SUM(val) FROM test WHERE val=0;
 sum
-----
   0
(1 row)

@nik9000 @polyfractal

@nik9000
Copy link
Member

nik9000 commented Jun 12, 2020

I'm sold. I think it'd be pretty breaking for us to return null for summing no data now but I'd be behind returning a "no data" flag. SQL could switch to null and folks using REST could look at the flag if they wanted to know.

@Skeeve
Copy link

Skeeve commented Jun 12, 2020

@astefan and @nik9000 The given queries have (I think) nothing to do with the problem.

SELECT SUM(val) FROM test WHERE val=123;

summing no row of course should give null

SELECT SUM(val) FROM test WHERE val=0;

summing 0s of course should give 0

But try this:

INSERT INTO public.test (val, gender) VALUES (NULL, 'M');
SELECT SUM(val) FROM test WHERE gender='M';

In elastic search this'll give NULL as one row contains NULL as a val while postgres will give 10000.

@costin
Copy link
Member

costin commented Jun 12, 2020

@Skeeve the ticket is about how Elasticsearch SUM aggregation works - the Postgres queries are used to validate the behavior in the wild.

In elastic search this'll give NULL as one row contains NULL as a val while postgres will give 10000.

Not sure where you got that 10000 from.
The query below on Postgres 10 returns:

CREATE TABLE test (
    val        int,
    gender     varchar(40)
);
INSERT INTO test (val, gender) VALUES (NULL, 'M');
SELECT * FROM test;
SELECT val FROM test WHERE gender='M';
SELECT SUM(val) FROM test WHERE gender='M';

returns:

val gender
null M
val
null
sum
null

@Skeeve
Copy link

Skeeve commented Jun 12, 2020

@costin Of course I was working with the previous example. You misse:

INSERT INTO public.test (val, gender) VALUES (0, 'M');
INSERT INTO public.test (val, gender) VALUES (0, 'F');
INSERT INTO public.test (val, gender) VALUES (0, 'M');
INSERT INTO public.test (val, gender) VALUES (0, 'M');
INSERT INTO public.test (val, gender) VALUES (0, 'M');
INSERT INTO public.test (val, gender) VALUES (0, 'F');
INSERT INTO public.test (val, gender) VALUES (10000, 'F');
INSERT INTO public.test (val, gender) VALUES (10000, 'M');
INSERT INTO public.test (val, gender) VALUES (10000, 'F');
INSERT INTO public.test (val, gender) VALUES (10000, NULL);

Issue is, afair: elasticsearch wrongly gets "null" as a sum when at leas one row has null.

UPDATE: I think I mixed it up a bit… I seem to remember: A sum of Zeros in ES gave null… Sorry.

@costin
Copy link
Member

costin commented Jun 12, 2020

I can't reproduce this - the issue is the opposite, SUM is always returning a not-null value (0) even not matching any documents. Essentially the opposite of what you mention.
If you encounter this behavior it is a bug so please do report it.

@botasky11
Copy link

mark

palesz pushed a commit to palesz/elasticsearch that referenced this issue Dec 3, 2020
Previously the SUM(all zeroes) was `NULL`, but after this change the SUM
SQL function call is automatically upgraded into a `stats` aggregation
instead of a `sum` aggregation. The `stats` aggregation only results in
`NULL` if the there were no rows, no values to aggregate, which is the
expected behaviour across different SQL implementations.

This is a workaround for elastic#45251 .
palesz pushed a commit that referenced this issue Dec 9, 2020
Previously the SUM(all zeroes) was `NULL`, but after this change the SUM
SQL function call is automatically upgraded into a `stats` aggregation
instead of a `sum` aggregation. The `stats` aggregation only results in
`NULL` if the there were no rows, no values (all nulls) to aggregate, which 
is the expected behaviour across different SQL implementations.

This is a workaround for the issue #45251 . Once the results of the `sum` 
aggregation can differentiate between `SUM(all nulls)` and 
`SUM(all zeroes`) the optimizer rule introduced in this commit needs to be
removed.
palesz pushed a commit to palesz/elasticsearch that referenced this issue Dec 9, 2020
Previously the SUM(all zeroes) was `NULL`, but after this change the SUM
SQL function call is automatically upgraded into a `stats` aggregation
instead of a `sum` aggregation. The `stats` aggregation only results in
`NULL` if the there were no rows, no values (all nulls) to aggregate, which
is the expected behaviour across different SQL implementations.

This is a workaround for the issue elastic#45251 . Once the results of the `sum`
aggregation can differentiate between `SUM(all nulls)` and
`SUM(all zeroes`) the optimizer rule introduced in this commit needs to be
removed.

(cherry-picked from b74792a)
palesz pushed a commit that referenced this issue Dec 9, 2020
Previously the SUM(all zeroes) was `NULL`, but after this change the SUM
SQL function call is automatically upgraded into a `stats` aggregation
instead of a `sum` aggregation. The `stats` aggregation only results in
`NULL` if the there were no rows, no values (all nulls) to aggregate, which
is the expected behaviour across different SQL implementations.

This is a workaround for the issue #45251 . Once the results of the `sum`
aggregation can differentiate between `SUM(all nulls)` and
`SUM(all zeroes`) the optimizer rule introduced in this commit needs to be
removed.

(cherry-picked from b74792a)
fixmebot bot referenced this issue in VectorXz/elasticsearch Apr 22, 2021
fixmebot bot referenced this issue in VectorXz/elasticsearch May 28, 2021
@Luegg
Copy link
Contributor

Luegg commented Jun 21, 2021

I think this has been addressed by #65796 and is fixed since 7.11.

Given the data:

POST http://localhost:9201/nullaggs/_bulk
Content-Type: application/json

{"index": {}}
{"name": "null", "val": null}
{"index": {}}
{"name": "null", "val": null}
{"index": {}}
{"name": "zero", "val": 0}
{"index": {}}
{"name": "zero", "val": 0}

The query select name, count(*), sum(val) from nullaggs group by name returns

     name      |   count(*)    |   sum(val)    
---------------+---------------+---------------
null           |2              |null           
zero           |2              |0          

Also, the queries with individual filters for name = 'null' and name = 'zero' return the expected values (tested on 7.13.2).

@matriv
Copy link
Contributor

matriv commented Jun 21, 2021

@Luegg That's true but the issue is still open because it's currently addressed with a workaround on SQL side: https://github.com/elastic/elasticsearch/pull/65796/files#diff-f89f100e88759827c9f9cbceed83a4efb475521a3e42d489584917807df11190R987 but the ideal solution would be that this is fixed in the Aggs.

@costin
Copy link
Member

costin commented Jun 21, 2021

To avoid confusion in the future, let's open an issue for the aggregation team (if one doesn't already exist), link it to this ticket for future reference and close this ticket down.
Right now it's not obvious that this issue has been fixed (even if through a workaround) on the SQL side. Since that is the case, the issue (and the other issues related to it) should be closed.

@matriv
Copy link
Contributor

matriv commented Jun 21, 2021

+1
But maybe it's worth updating those comments in the code introduced with #65796 to point to the new issue that will be opened.

@Luegg
Copy link
Contributor

Luegg commented Jun 22, 2021

I've found the pretty recent #71582 covering the issue on the aggregation side. Since the change requires a migration path it seems to not be of high priority.

Another possible alternative besides changing the sum behavior or adding an isEmpty flag in the response might be to introduce a "null safe" version of sum, something like a sum_or_null aggregation.

Luegg pushed a commit to Luegg/elasticsearch that referenced this issue Jun 22, 2021
Since the SQL `SUM` function behaves as expected, elastic#45251 can be closed.
As soon as elastic#71582 is resolved, we can go back to using the
`sum` aggregation instead of `stats`.
Luegg pushed a commit that referenced this issue Jun 23, 2021
Since the SQL `SUM` function behaves as expected, #45251 can be closed.
As soon as #71582 is resolved, we can go back to using the
`sum` aggregation instead of `stats`.
@Luegg Luegg linked a pull request Jun 23, 2021 that will close this issue
@Luegg Luegg closed this as completed Jun 23, 2021
fixmebot bot referenced this issue in VectorXz/elasticsearch Aug 4, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Analytics/Aggregations Aggregations :Analytics/SQL SQL querying >bug Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) Team:QL (Deprecated) Meta label for query languages team
Projects
None yet
Development

Successfully merging a pull request may close this issue.