Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MutableArrayData::extend_nulls doesn't update null bitmask #1230

Closed
tustvold opened this issue Jan 23, 2022 · 2 comments · Fixed by #4343
Closed

MutableArrayData::extend_nulls doesn't update null bitmask #1230

tustvold opened this issue Jan 23, 2022 · 2 comments · Fixed by #4343
Assignees
Labels

Comments

@tustvold
Copy link
Contributor

tustvold commented Jan 23, 2022

Describe the bug

Noticed whilst working on #1225, calling MutableArrayData::extend_nulls doesn't update the null bitmask. This means the final array may have a shifted or completely invalid bitmask.

#1225 changed it so that if MutableArrayData is created with nulls disabled, it panics if you try to append nulls. Unfortunately it would appear this is being exploited by ArrowArrayReader to avoid computing a bitmask.

To Reproduce

#[test]
    fn test_nulls() {
        let ints: ArrayRef = Arc::new(Int32Array::from(vec![1]));
        let mut data = MutableArrayData::new(vec![ints.data()], true, 3);
        data.extend_nulls(9);
        let data = data.freeze();
        
        assert_eq!(data.len(), 9);
        assert_eq!(data.null_buffer().unwrap().len(), 2);
    }

Expected behavior

Appending nulls should update the null bitmask, and it should panic if MutableArrayData is created without null support. Fixing this is likely blocked on #1197 as it relies on this behaviour

@HaoYang670
Copy link
Contributor

HaoYang670 commented Feb 9, 2022

Hi, I am curious about what is index used to do here:

    /// Extends this [MutableArrayData] with elements from the bounded [ArrayData] at `start`
    /// and for a size of `len`.
    /// # Panic
    /// This function panics if the range is out of bounds, i.e. if `start + len >= array.len()`.
    pub fn extend(&mut self, index: usize, start: usize, end: usize) {
        let len = end - start;
        (self.extend_null_bits[index])(&mut self.data, start, len);
        (self.extend_values[index])(&mut self.data, index, start, len);
        self.data.len += len;
    }

@tustvold
Copy link
Contributor Author

tustvold commented Feb 9, 2022

It's for when MutableArrayData is created from multiple source arrays, and is used to select which one to copy values from. The concat kernel makes use of this functionality at the moment

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants