-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make ScalarValue an ArrayRef Wrapper #7353
Comments
FWIW using an ArrayRef for a single scalar value may be substantially more overhead than todays implementation, though the fact they can be copied cheaply might make it a wash In general however, I think the basic idea is a great one and will simplify many things. Perhaps we can start at the edges: change code that today operates on |
ScalarValue::List(Option<Vec> will change to something like ScalarValue::List(ArrayRef)? It would be really nice if we have ArrayRef for ScalarValue for #7242. The reason why I need to introduce ColumnValue::Array for |
#7352 could be a first step but eventually it would be good to just use an ArrayRef of length 1 for all scalars, inline with the design in arrow-rs |
@tustvold It seems there is no prost implementation for ArrayRef. Should it be introduced in arrow-rs? |
I think the idea would be to use arrow-ipc |
I personally think there is great value in using My rationale is that the overhead for converting |
Actually, I am working on List only, which is this link, I paste the incorrect one. #7352 |
arrow-ipc converts data to I think there are two approaches If we remain from_proto / to_proto for ScalarValue::List , we can get the ipc-message from arrow-ipc and store those Vec in prost::Message. Not sure whether we can do this but since they are all Vec, so probably doable. The other way is we introduce from_ipc / to_ipc for ScalarValue::List and no prost for ScalarValue::List, and use @tustvold How do you think about these two approaches? |
https://github.com/apache/arrow-datafusion/actions/runs/6362768232/job/17278040087?pr=7629 Anyone know how to fix this kind of CI?
|
Maybe I could just convert ArrayRef to Vec, probably the easiest fix. |
I am not sure that would work for all array types (like List, for example) |
Fail to handle nested list. We should directly handle Array to proto/ipc-like things. |
I found this quite easy :) |
Is your feature request related to a problem or challenge?
We have recently standardised a scalar abstraction upstream, based around the existing Array abstractions.
The rationale behind this is expanded upon here apache/arrow-rs#4393 (comment)
Describe the solution you'd like
I would like ScalarValue to just be a wrapper around an
ArrayRef
. This would avoid a huge amount of duplicated and/or boilerplate code, whilst also discouraging implementing non-vectorizable kernels based around ScalarValue.Describe alternatives you've considered
No response
Additional context
No response
The text was updated successfully, but these errors were encountered: