-
Notifications
You must be signed in to change notification settings - Fork 839
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Do not concatenate identical dictionaries #1219
Conversation
Looks good to me. I initially thought |
@@ -1155,6 +1155,41 @@ impl ArrayData { | |||
Ok(()) | |||
}) | |||
} | |||
|
|||
/// Returns true if this `ArrayData` is equal to `other`, using pointer comparisons |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks really nice -- thank you @tustvold
arrow/src/array/data.rs
Outdated
|
||
/// Returns true if this `ArrayData` is equal to `other`, using pointer comparisons | ||
/// to determine buffer equality. This is cheaper than `PartialEq::eq` but may | ||
/// return false negatives |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in what case would this return a false negative (to the "are these two pointers the same" question)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It may say two things are not equal when logically they are?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I updated the wording
assert!( | ||
array_copy.data().child_data()[0].ptr_eq(&combined.data().child_data()[0]) | ||
); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we also add a test of concatenating three dictionaries -- where 2 use the same dictionary and one is a different dictionary?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added
Codecov Report
@@ Coverage Diff @@
## master #1219 +/- ##
=======================================
Coverage 82.70% 82.71%
=======================================
Files 175 175
Lines 51711 51787 +76
=======================================
+ Hits 42769 42837 +68
- Misses 8942 8950 +8
Continue to review full report at Codecov.
|
|
||
/// Returns true if this `ArrayData` is equal to `other`, using pointer comparisons | ||
/// to determine buffer equality. This is cheaper than `PartialEq::eq` but may | ||
/// return false when the arrays are logically equal |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
Which issue does this PR close?
Closes #504
Rationale for this change
See ticket
What changes are included in this PR?
Adds
ArrayData::ptr_eq
and then uses this to elide dictionary concatenation in MutableArrayDataAre there any user-facing changes?
This changes the behaviour of concat, to no longer concatenate the dictionaries if they are the same. Not sure if this counts