Invalid Value Offsets Slice For Empty Variable Length Arrays #1824

Closed
tustvold opened this issue Jun 9, 2022 · 0 comments · Fixed by #2836

tustvold commented Jun 9, 2022

Describe the bug

I don't know whether this was a change or it has always been this way, but we permit a zero-length ListArray to have an empty offsets buffer. In #1620 we explicitly decided against requiring the offsets buffer to contain a single value in this case.

Unfortunately, the logic for value_offsets, which is duplicated in ListArray, StringArray, BinaryArray, etc., is:

unsafe {
    std::slice::from_raw_parts(
        self.value_offsets.as_ptr().add(self.data.offset()),
        self.len() + 1,
    )
}

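This does not take an empty offsets buffer into account: for a zero-length array it still builds a slice of length 1, even when the buffer contains no offsets.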
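To make the failure mode concrete, here is a minimal standalone sketch that uses only the standard library. `checked_value_offsets` is a hypothetical helper, not part of arrow-rs; it requests the same range as the snippet above (`len + 1` values starting at `offset`), but through a bounds-checked lookup, so the out-of-range read shows up as `None` instead of an invalid slice.

```rust
/// Hypothetical stand-in for the offsets lookup; NOT the arrow-rs API.
/// `offsets` plays the role of the typed offsets buffer, `offset` the
/// array's starting position and `len` its logical length.
fn checked_value_offsets(offsets: &[i32], offset: usize, len: usize) -> Option<&[i32]> {
    // Same range the unsafe code reads: `len + 1` values starting at `offset`.
    // `get` returns `None` when the range runs past the end of the buffer.
    offsets.get(offset..offset + len + 1)
}
```

With a single offset, `checked_value_offsets(&[0], 0, 0)` yields a one-element slice. With an empty buffer, `checked_value_offsets(&[], 0, 0)` yields `None`, because the range `0..1` does not exist. The `from_raw_parts` version performs no such check and hands back a length-1 slice regardless.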

To Reproduce

#[test]
fn test_string_array_empty_offsets() {
    let data = ArrayDataBuilder::new(DataType::Utf8)
        .len(0)
        .add_buffer(Buffer::from([]))
        .add_buffer(Buffer::from([]))
        .build()
        .unwrap();
    let array = GenericStringArray::<i32>::from(data);
    assert_eq!(array.value_offsets().len(), 0); // FAILS with 1 != 0
}

Expected behavior

Given the decision to support empty offsets, I think value_offsets should always return an empty slice for zero-length arrays of these types, so that the behaviour is consistent regardless of whether the offsets buffer is empty or holds a single value.
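As a rough illustration of the behaviour proposed here, the sketch below models the offsets buffer as a plain slice. The function and its parameters are assumptions made for the example and are not the arrow-rs implementation, nor necessarily the shape of the eventual fix.

```rust
/// Hypothetical model of the proposed behaviour; NOT the arrow-rs API.
/// Returns the `len + 1` offsets for a non-empty array and an empty slice
/// for any zero-length array, whether or not the buffer holds an offset.
fn value_offsets(offsets: &[i32], offset: usize, len: usize) -> &[i32] {
    if len == 0 {
        // Zero-length array: always empty, so callers see one consistent result.
        return &[];
    }
    // Non-empty array: bounds-checked view of `len + 1` offsets.
    &offsets[offset..offset + len + 1]
}
```

Under this sketch both `value_offsets(&[], 0, 0)` and `value_offsets(&[0], 0, 0)` have length 0, which is what the reproduction test above asserts. The trade-off is that callers can no longer assume the returned slice has `len + 1` elements when `len` is 0.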

Additional context

Discovered whilst working on #1811, so already getting returns from that investment 😁

tustvold added the bug label Jun 9, 2022
tustvold self-assigned this Jun 9, 2022
tustvold changed the title from "value_offsets Returns Invalid Slice For Empty Offsets Buffer" to "Invalid Value Offsets Slice For Empty Variable Length Arrays" Jun 9, 2022
tustvold added a commit to tustvold/arrow-rs that referenced this issue Jun 9, 2022
Fix value_offsets for empty variable length arrays (apache#1824)
tustvold added a commit to tustvold/arrow-rs that referenced this issue Oct 6, 2022
tustvold added a commit to tustvold/arrow-rs that referenced this issue Oct 6, 2022
tustvold added a commit that referenced this issue Oct 13, 2022
* Handle empty offsets buffer (#1824)

* Review feedback