Remove special casing of Min / Max built in AggregateFunctions #11151
Comments
Do you think we should implement a special is_min/is_max, since these are pervasive and also used for statistics?
It might make sense to special case What we did in the case of ScalarUDFs was to go with functions that described what the property in question was rather than what the function was. So for example, for I was thinking we would do something similar in this case. However, if it turns out that all such properties are only relevant for min/max maybe having a
@alamb I think this has been solved?
This is not solved yet; we still need to address the specialization check in aggregate_statistic
Specifically I think these checks need to be removed: datafusion/datafusion/physical-optimizer/src/aggregate_statistics.rs Lines 269 to 292 in 4838cfb
Here is one specific suggestion: #12296 (comment)
Is your feature request related to a problem or challenge?
Part of #10943
While trying to port min and max to UDFs in #11013, @edmondop found several places where Min and Max (the existing built in aggregate functions) are special cased.
Here is Max:
datafusion/datafusion/physical-expr/src/aggregate/min_max.rs
Lines 77 to 82 in c2ea6b3
The problem with relying on Min and Max directly is that it is hard/impossible to switch Min/Max to be user defined aggregates, as seen in #11013.
Also, relying on Min/Max directly means that certain behavior available to built in aggregate functions is not available to UDAFs, which isn't ideal.
Describe the solution you'd like
Remove all explicit references to Min / Max
Specifically, code that looks like this should be removed
and
Describe alternatives you've considered
Ideally, the alternative is to use a function on AggregateExpr which we can then add/implement for UDAFs when we port min/max to be a UDAF
Task List
Min / Max references from AggregateExec::get_minmax_descr #11152
Min / Max references from AggregateStatistics #11153
Additional context
No response