-
Notifications
You must be signed in to change notification settings - Fork 839
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(arrow-array): add BinaryArrayType
similar to StringArrayType
#6924
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Question. What does this provide which is missing with the current IntoIterator implementations for &'a GenericByteArray, &'a FixedSizeBinaryArray, and &'a BinaryViewArray?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My question above is because I think this may not be needed? But I wasn't sure if I was missing something. This introduced trait converts a referenced array to an ArrayIter, and it does so by abstracting over these three generic types (GenericBinaryArray, BinaryViewArray, FixedSizeBinaryArray) calling the When I looked at the existing code, we have a &'a GenericByteArray which also takes a reference and returns ArrayIter. So I wasn't sure why this was needed -- and what it added by making the |
I am not sure -- that is why I was asking for an example of this trait being used -- then we can have a concrete discussion if this API is duplicative or not |
I think the idea is that |
IMO this looks great. I was just trying to figure out what this provided that fn use_binary_array<'a>(array: impl arrow_array::ArrayAccessor<Item = &'a [u8]>) {
// Have to manually iterate over the array
for i in 0..array.len() {
dbg!(String::from_utf8(array.value(i).to_vec()).unwrap());
}
}
#[test]
fn binary_array() {
let mut builder = BinaryViewBuilder::new();
builder.append_value(b"foo");
builder.append_value(b"bar");
builder.append_value(b"baz");
use_binary_array(&builder.finish());
let mut builder = FixedSizeBinaryBuilder::new(3);
builder.append_value(b"foo").unwrap();
builder.append_value(b"bar").unwrap();
builder.append_value(b"baz").unwrap();
use_binary_array(&builder.finish());
} |
The above example use case (kindly provided by @kylebarron) -- is already passing on main. 🤔 |
So perhaps the conclusion is that What can we do to make it clearer to find? Maybe we can add this example on |
Well, on the contrary I was saying that |
Marking as draft as I don't think this PR is waiting on feedback anymore -- it needs some test / use in the crate otherwise it risks bitrotting / maybe not working as designed. Please mark it ready for review when it is ready for another look |
Rationale for this change
This trait helps to abstract over the different types of binary arrays
so that we don't need to duplicate the implementation for each type.
Same as
StringArrayType
What changes are included in this PR?
Added
BinaryArrayType
and implemented forBinaryArray
,LargeBinaryArray
,BinaryViewArray
andFixedSizeBinaryArray
Are there any user-facing changes?
Yes, there is a new public trait