-
Notifications
You must be signed in to change notification settings - Fork 839
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(arrow-array): add from_iter
for DictionaryArray
for building BinaryDictionaryArray
#6922
base: main
Are you sure you want to change the base?
Conversation
… BinaryDictionaryArray
it.for_each(|i| { | ||
builder | ||
.append(i) | ||
.expect("Unable to append a value to a dictionary array."); | ||
}); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Extend<Option<V>>
uses append_option. Hence the two FromIterator impls here. 👍🏼
impl<'a, T: ArrowDictionaryKeyType> FromIterator<Option<&'a [u8]>> for DictionaryArray<T> { | ||
fn from_iter<I: IntoIterator<Item = Option<&'a [u8]>>>(iter: I) -> Self { | ||
let it = iter.into_iter(); | ||
let (lower, _) = it.size_hint(); | ||
let mut builder = BinaryDictionaryBuilder::with_capacity(lower, 256, 1024); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is another FromIterator<Option<&'a str>> for DictionaryArray<T>
for strings, which at first glance looks possible to combine. (Both use the GenericByteDictionaryBuilder with either strings, or bytes). Have you tried that, and do you think it's worthwhile?
TLDR is that the idea is a good one -- thank you @rluvaton and @wiedld for the review In general, I think the most important thing is to make sure this isn't a breaking change, otherwise we can't merge it until Feb due to the current release schedule: https://github.com/apache/arrow-rs?tab=readme-ov-file#release-versioning-and-schedule |
Marking as draft as I think this PR is no longer waiting on feedback. Please mark it as ready for review when it is ready for another look |
Which issue does this PR close?
make it easier to write tests for #6923
Rationale for this change
You can't build
BinaryDictionaryArray
from iterator of bytesWhat changes are included in this PR?
added
from_iter
from iterator of&[u8]
and from iterator ofOption<&[u8]>
Are there any user-facing changes?
Yes.
There are breaking changes, see the failing example below with the fix:
I'll fix the broken tests when I know this change is desired