-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[DISCUSS] organization of functions #9100
Comments
FYI @Weijun-H and @jayzhan211 -- do you have any thoughts about moving the array functions into their own crate? |
I think it is a good idea to move array functions into their crate given the size of it. It will become more complex since there are still tons of array functions not yet implemented. For other functions like string, if they are large enough, we should move them into subcrate as well. I think we should keep the core as simple as possible. Core functions crate like the standard library in rust, others go to sub-crate like third-party crates |
It is timely to address the issues with array_experssion.rs, as @jayzhan211 stated that a lot of work still needs to be done on the array. I strongly support this ticket. |
I whipped up an example of how an |
Minor: can we match the feature flag prefix with the source code location? Just seems cleaner to me. string_expressions unicode_expressions regexp_expressions .. etc |
@Omega359 asked a good question on #8705 (comment) that I wanted to not get lost in comments on a merged pull request
I think they should go in the Following the model of the existing
I think we should do this if/when possible -- benchmarks that depend on |
For anyone following along, we just merged #9113 to break out the array functions crate What I plan to do this week is to make a few other PRs moving some functions so we have some more examples, and then file a bunch of "help wanted / good first issue tickets" to port the remaining functions per the list on this ticket |
Note for myself: I tried to pull MakeArray into |
I filed #9285 to track the actual work, let's continue the discussion there |
please add table functions, eg. generate_series |
Seems like this request was made in #9323 ✅ -- thank you |
Is your feature request related to a problem or challenge?
As we break the function library out of the DataFusion core as part of #8045, we need to organize it somehow
As pointed out by @viirya on #8705 (comment) there are many potential ways to organize the data fusion function library. Thus I would like to get some consensus on how we want the organization to look before creating tickets and starting to crank it out
Describe the solution you'd like
Here is a proposal for how the functions are organized.
math functions
math_expressions
datafusion/functions/src/math
Isnan
: is the value NaNNanvl
: return the first non-NaN valuearray functions
Given the size and specialization of these functions I propose putting them into their own subcrate
array_expressions
datafusion/functions-array/src/math
MakeArray
: construct an array from columns (union/except depends on this)Core functions
These functions are always available as they are used for internal purposes (like implementing
[1,2,3]
syntax in SQL or so commonly used that it is not worth having a feature flag)datafusion/functions/src/core
or similarCoalesce
: return the first non-null valueStruct
: Create a structNullIf
: return null if the two values are equalRandom
: return a random numberArrowTypeOf
: return the arrow type of a valueCrypto functions
crypto_expressions
datafusion/functions/src/crypto
String functions
string_expressions
datafusion/functions/src/string
Unicode string functions
These expressions need an additioanl dependency, which is why they have a different flag)
unicode_expressions
datafusion/functions/src/string/unicode
regex functions
regex_expressions
datafusion/functions/src/regexp
date time function
datetime_expressions
datafusion/functions/src/datetime
Describe alternatives you've considered
We can have more fine grained crates, or different organizations, etc
For example, perhaps we should pull the string functions into
datafusion/functions-string
crateAdditional context
No response
The text was updated successfully, but these errors were encountered: