Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove undefined behavior in value method of boolean and primitive arrays #644

Merged
merged 2 commits into from
Aug 3, 2021

Conversation

Dandandan
Copy link
Contributor

@Dandandan Dandandan commented Jul 31, 2021

Which issue does this PR close?

Closes #645

Rationale for this change

The value methods only had debug_assert! in them, which made that the check is removed in release mode.

What changes are included in this PR?

Are there any user-facing changes?

@github-actions github-actions bot added the arrow Changes to the arrow crate label Jul 31, 2021
@@ -100,7 +100,7 @@ impl BooleanArray {
///
/// Note this doesn't do any bound checking, for performance reason.
pub fn value(&self, i: usize) -> bool {
debug_assert!(i < self.len());
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

UB 1

#[inline]
pub fn value(&self, i: usize) -> T::Native {
debug_assert!(i < self.len());
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

UB 2

pub fn value(&self, i: usize) -> &str {
assert!(i < self.data.len(), "StringArray out of bounds access");
//Soundness: length checked above, offset buffer length is 1 larger than logical array length
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cleanup

let comparison = (0..$left.len()).map(|i| $op($left.value(i), $right.value(i)));
// Safety:
// `i < $left.len()` and $left.len() == $right.len()
let comparison = (0..$left.len())
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Avoid perf. penalty by using unsafe version here

@@ -121,8 +124,10 @@ macro_rules! compare_op_primitive {
macro_rules! compare_op_scalar {
($left: expr, $right:expr, $op:expr) => {{
let null_bit_buffer = $left.data().null_buffer().cloned();

let comparison = (0..$left.len()).map(|i| $op($left.value(i), $right));
// Safety:
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Avoid perf. penalty by using unsafe version here

@Dandandan Dandandan changed the title Remove UB in value method Remove undefined behavior in value method of boolean and primitive arrays Jul 31, 2021
@codecov-commenter
Copy link

Codecov Report

Merging #644 (37a5594) into master (a08b939) will increase coverage by 0.02%.
The diff coverage is 100.00%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #644      +/-   ##
==========================================
+ Coverage   82.48%   82.51%   +0.02%     
==========================================
  Files         167      168       +1     
  Lines       46450    47227     +777     
==========================================
+ Hits        38315    38968     +653     
- Misses       8135     8259     +124     
Impacted Files Coverage Δ
arrow/src/array/array_boolean.rs 94.01% <100.00%> (ø)
arrow/src/array/array_primitive.rs 94.60% <100.00%> (ø)
arrow/src/array/array_string.rs 97.81% <100.00%> (-0.04%) ⬇️
arrow/src/compute/kernels/comparison.rs 95.84% <100.00%> (ø)
arrow/src/array/transform/boolean.rs 76.92% <0.00%> (-7.70%) ⬇️
arrow/src/array/equal_json.rs 88.69% <0.00%> (-2.52%) ⬇️
parquet/src/arrow/schema.rs 87.35% <0.00%> (-1.18%) ⬇️
arrow/src/compute/kernels/filter.rs 90.82% <0.00%> (-1.17%) ⬇️
arrow/src/datatypes/datatype.rs 66.08% <0.00%> (-0.74%) ⬇️
arrow/src/array/equal/mod.rs 93.45% <0.00%> (-0.48%) ⬇️
... and 18 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update a08b939...37a5594. Read the comment docs.

Copy link
Contributor

@paddyhoran paddyhoran left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks @Dandandan

Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like a nice cleanup to me. @jorgecarleitao would you like to review this PR as well?

Copy link
Member

@jorgecarleitao jorgecarleitao left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeap, good improvements. Thanks!

@nevi-me nevi-me merged commit 6bf1988 into apache:master Aug 3, 2021
alamb pushed a commit that referenced this pull request Aug 5, 2021
…arrays (#644)

* Remove UB in `value`

* Add safety note
alamb added a commit that referenced this pull request Aug 8, 2021
…arrays (#644) (#668)

* Remove UB in `value`

* Add safety note

Co-authored-by: Daniël Heres <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
arrow Changes to the arrow crate
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Remove undefined behavior in value method of boolean and primitive arrays
6 participants