Hash for Array #4802

Open
jayzhan211 opened this issue Sep 9, 2023 · 1 comment
Labels
enhancement: Any new improvement worthy of an entry in the changelog

Comments

Contributor

jayzhan211 commented Sep 9, 2023

Is your feature request related to a problem or challenge? Please describe what you are trying to do.

While implementing apache/datafusion#7353, I found that we might need hash() for ArrayRef.
If no such function exists yet, adding a hash function for ArrayRef to arrow-rs would be a good idea.

Reference code where we need hash() for ScalarValue::List(ArrayRef):
https://github.com/apache/arrow-datafusion/blob/495c25f7d8ac2e9c7c82306f2c0967a766342c8b/datafusion/common/src/scalar.rs#L604C14-L607

Describe the solution you'd like

Describe alternatives you've considered

Additional context

jayzhan211 added the enhancement label Sep 9, 2023
Contributor

tustvold commented Sep 9, 2023

Just spitballing here, but perhaps we could remove Hash from ScalarValue? Collecting ScalarValue in this way will be terrible from a performance standpoint?

Edit: I had a brief play at doing this, and I think it will be hard to remove from DF (DataFusion). We don't currently provide Hash utilities for arrays in arrow-rs, but it should be possible to build something in DF using create_hashes: https://docs.rs/datafusion/latest/datafusion/physical_expr/hash_utils/fn.create_hashes.html
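
To make the idea concrete, here is a rough, std-only sketch of implementing Hash for a list-valued scalar by first computing one hash per row and then feeding those row hashes into the caller's hasher. Everything in it is an assumption for illustration: `Int64Column` and `ListScalar` are hypothetical stand-ins for an Arrow array and ScalarValue::List, and `create_row_hashes` only mimics the general shape of a create_hashes-style helper, not DataFusion's actual signature.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical stand-in for a nullable Int64 array (not an arrow-rs type).
#[derive(Debug, Clone)]
pub struct Int64Column {
    pub values: Vec<Option<i64>>,
}

/// Hypothetical helper in the spirit of a `create_hashes`-style function:
/// fills `buffer` with one hash per row of `column`.
pub fn create_row_hashes(column: &Int64Column, buffer: &mut Vec<u64>) {
    buffer.clear();
    for value in &column.values {
        // DefaultHasher::new() uses fixed keys, so equal rows get equal hashes.
        let mut hasher = DefaultHasher::new();
        // Option<i64> hashes its discriminant first, so a null row and
        // Some(0) feed different input to the hasher.
        value.hash(&mut hasher);
        buffer.push(hasher.finish());
    }
}

/// Hypothetical stand-in for a list-valued scalar such as ScalarValue::List.
#[derive(Debug, Clone)]
pub struct ListScalar(pub Int64Column);

impl PartialEq for ListScalar {
    fn eq(&self, other: &Self) -> bool {
        self.0.values == other.0.values
    }
}

impl Eq for ListScalar {}

impl Hash for ListScalar {
    fn hash<H: Hasher>(&self, state: &mut H) {
        let mut row_hashes = Vec::with_capacity(self.0.values.len());
        create_row_hashes(&self.0, &mut row_hashes);
        // Hashing the Vec writes its length followed by every row hash,
        // so equal lists always produce equal hashes (consistent with Eq).
        row_hashes.hash(state);
    }
}
```

With Hash and Eq in place, such a value can be used as a HashSet or HashMap key. Note that this hashes every row on each call, which is the performance concern raised above.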
