Implement `unnest` function #6555
Comments
The return type of Unnest might be Vec<ArrayRef> or Vec<ColumnarValue>: a set of rows (Vec) with an array of columns (ArrayRef). https://github.com/apache/arrow-datafusion/blob/9b419b19a66bdd35e9e5c0bca259786f8f3c3965/datafusion/expr/src/columnar_value.rs#L32-L38 Maybe we can add another variant like RowsOfArray(Vec<ArrayRef>) for the rows-based return type? Are there alternative return types for Unnest? I think the data layout is quite similar to ValuesExec, so we might need a Vec of something to represent the set of rows.
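To make the proposal above concrete, here is a minimal, dependency-free sketch of what a rows-based variant could look like. Everything in it is an assumption for illustration: `ColumnarValueSketch` and `flatten` are hypothetical names, `ArrayRef` is replaced by a plain integer vector behind an `Arc`, and `RowsOfArray` is only the variant suggested in the comment, not an existing DataFusion API.

```rust
use std::sync::Arc;

// Stand-in for arrow's `ArrayRef` (assumption: the real type is a
// reference-counted, dynamically typed array; a plain integer vector
// keeps this sketch free of external crates).
type ArrayRef = Arc<Vec<i64>>;

// Hypothetical enum mirroring the array/scalar split of the linked
// `ColumnarValue`, plus the rows-based variant proposed in the comment.
#[derive(Debug, Clone, PartialEq)]
enum ColumnarValueSketch {
    // One array holding a whole column.
    Array(ArrayRef),
    // A single value (simplified to i64 here).
    Scalar(i64),
    // Proposed: one array per input row.
    RowsOfArray(Vec<ArrayRef>),
}

// Expand a value into the flat sequence of output values an unnest
// would produce: each per-row array contributes its elements in order.
fn flatten(value: &ColumnarValueSketch) -> Vec<i64> {
    match value {
        ColumnarValueSketch::Array(array) => array.to_vec(),
        ColumnarValueSketch::Scalar(scalar) => vec![*scalar],
        ColumnarValueSketch::RowsOfArray(rows) => {
            rows.iter().flat_map(|row| row.iter().copied()).collect()
        }
    }
}
```

Under these assumptions, a `RowsOfArray` built from the arrays `[1, 2]` and `[3]` would flatten to three output values, which is the kind of row-count change that the existing two variants cannot express directly.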
I found that we can apply
Yes @jayzhan211, I know about this feature (see PR #5106). Also, if we want to follow the standard behavior of the function (MySQL, PostgreSQL, and others), I think we should implement the scalar function as well. @jackwener @alamb @tustvold What do you think about it?
I don't understand what a Thus I wonder if the SQL planner (or maybe some optimizer pass) could replace all instances of
I have a question about unnesting multiple columns.
Expected result
For those Unnest, apply some kind of
Maybe you could use
take
I plan to implement this feature.
Is your feature request related to a problem or challenge?
Follow on to #6384
It would be nice to implement an `unnest` function (with properties like its analog in PostgreSQL) in `arrow-datafusion`.
Describe the solution you'd like
Main benefits of adding this feature:
- With the `unnest` function, we can use aggregate functions on arrays.
- The `unnest` function serves as an exchange between arrays and columns. It has two cases of behavior:
  - `unnest` with a single argument
  - `unnest` with multiple arguments (more than one); this form is only allowed in a query's FROM clause
Examples:
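As a rough illustration of the two cases, the following Rust sketch models them on plain vectors rather than Arrow arrays. It assumes the behavior documented for PostgreSQL: the single-argument form expands an array into one row per element, and the multi-argument form expands several arrays side by side, padding shorter arrays with NULL (modelled here as `None`). The names `unnest_single` and `unnest_multi` are hypothetical and are not DataFusion functions; the sketch also simplifies by requiring all arrays to share one element type.

```rust
// Single-argument form: each element of the array becomes one output row.
fn unnest_single<T: Clone>(array: &[T]) -> Vec<T> {
    array.to_vec()
}

// Multi-argument form: row `i` holds the `i`-th element of every array.
// Arrays shorter than the longest one are padded with `None`, which
// stands in for SQL NULL.
fn unnest_multi<T: Clone>(arrays: &[Vec<T>]) -> Vec<Vec<Option<T>>> {
    // The number of output rows is the length of the longest input array.
    let row_count = arrays.iter().map(Vec::len).max().unwrap_or(0);
    (0..row_count)
        .map(|i| arrays.iter().map(|array| array.get(i).cloned()).collect())
        .collect()
}

// Aggregating over an array then needs no dedicated function: unnest the
// array and apply an ordinary aggregate (here, a sum) to the rows.
fn sum_of_array(array: &[i64]) -> i64 {
    unnest_single(array).into_iter().sum()
}
```

With these definitions, `unnest_multi(&[vec![1, 2, 3], vec![10, 20]])` yields three rows, the last of which pairs `Some(3)` with `None`, and `sum_of_array(&[1, 2, 3])` evaluates to 6.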
Describe alternatives you've considered
For aggregates, we could create many individual array functions (like `array_sum`), but I think that approach would be too redundant.
Additional context
Similar Issues:
#6119
Similar PR:
#6384
#5106
Links to sources:
https://www.postgresql.org/docs/current/functions-array.html