-
Notifications
You must be signed in to change notification settings - Fork 117
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
hll_add_agg() should return an hll_empty() when aggregating over an empty set #2
Comments
I think the return value for |
@dimitri After looking at the other PG aggregate functions, all of them return |
I think it's worth it, and explained why in the reference blog article. Basically when you union an existing value over the aggregate on an empty set, you lose your current HLL value. That can't be what you want to happen. |
@TimonK I don't think we'd be breaking consistency here. The PostgreSQL manual says that "except for count, aggregate functions return a null value when no rows are selected." This is consistent with count(expression)'s definition where the return value is "the number of input rows for which the value of expression is not null." The higher level implications for us is, when we run count(Distinct null_returning_expression), we get 0 in PostgreSQL. When we use postgresql-hll functions, we get back Null. |
@ozgune hll is estimating counts, so i think it should follow "count" semantics and return |
I'd like to submit a pull request for this if everyone thinks it makes sense. Here's what I had in mind. First, I thought I'd modify The proper way to do this is to modify The proposed change modifies |
… when aggregating over empty sets. This makes hll_add_agg() conform to PostgreSQL's count() semantics over empty sets, and closes issue citusdata#2. Also added two small regression tests.
This is fixed in the newest version, 2.8.0. |
Thanks! |
hll_add_agg
currently returnsNULL
when it aggregates over an empty data set, which can have the undesirable effect of NULLing out the target when the aggregate is used to update anotherhll
column (withhll_union
).The issue is avoidable with some
NULL
detection (as is shown in this blog post) but the post is right in that the real fix should be inpostgresql-hll
.The text was updated successfully, but these errors were encountered: