Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Similar to the "count-bug" case that produces incorrect results #15032

Open
suibianwanwank opened this issue Mar 5, 2025 · 1 comment
Open
Assignees
Labels
bug Something isn't working

Comments

@suibianwanwank
Copy link

Describe the bug

For cases similar to the well-known "count-bug", the correlated subquery returns the incorrect result.

To Reproduce

> select * from 'data.csv';
+------+---+
| a    | b |
+------+---+
| 1    | 2 |
| NULL | 0 |
+------+---+
2 row(s) fetched. 
Elapsed 0.003 seconds.

> select e.b ,(select case when max(e2.a) > 10 then 'a' else 'b' end from 'data.csv' e2 where e2.b = e.b+1 ) from  'data.csv' e;
+---+-------------------------------------------------------------------+
| b | CASE WHEN max(e2.a) > Int64(10) THEN Utf8("a") ELSE Utf8("b") END |
+---+-------------------------------------------------------------------+
| 0 | NULL                                                              |
| 2 | NULL                                                              |
+---+-------------------------------------------------------------------+
2 row(s) fetched. 
Elapsed 0.004 seconds.

Expected behavior

Subqueries with no matches should return 'b' instead of null.

Additional context

What is count-bug: Optimization of Nested SQL Queries Revisited
A paper describing this issue: Parameterized ueries and Nesting Equivalences

The well-known "count-bug" is not specific to the count aggregate, and outer-join does not solve it.
The anomaly can occur on any aggregate function; aggregates need modification to distiguish empty set from null values; and optimizing out the outerjoin depends on utilization context

@suibianwanwank suibianwanwank added the bug Something isn't working label Mar 5, 2025
@suibianwanwank
Copy link
Author

take

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

1 participant