-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Temporal datatype support for interval arithmetic #5971
Temporal datatype support for interval arithmetic #5971
Conversation
add tests macro will replace matches inside evaluate ready for review
You can add a millisecond array as well, but I used Nano.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This PR passed from our internal review process. It is LGTM!
I plan to merge this PR, If there is no additional review from the community. if there is anyone want to review, please let me know. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you @berkaysynnada and @mustafasrepo -- I didn't review this PR fully but I think the structure looks good to me.
My biggest concern in the entire exercise is that the arithmetic that is supported for ScalarValue and what is supported for Arrays (and what is supported upstream in arrow-rs) will diverge. However I think we are all agreed on the desired end state (kernels in arrow-rs and the same math / support for scalars in Datafusion)
let secs = ts_ms / 1000; | ||
let nsecs = ((ts_ms % 1000) * 1_000_000) as u32; | ||
do_date_time_math(secs, nsecs, scalar, sign).map(|dt| dt.timestamp_millis()) | ||
let secs = ts_ms.div_euclid(1000); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TIL div_euclid
Thank you for the review. Once you review and approve the structure, I will move the other arithmetic codes that are for Arrays, and finally, I believe we can unify all arithmetics in kernels_arrow.rs |
Which issue does this PR close?
Closes #5844.
Rationale for this change
Currently, interval arithmetic operations are not supported for temporal datatypes. It is especially necessary to be able to apply symmetric hash join over ordered timestamp columns, probably with some interval differences.
With this PR, we can make use of interval arithmetic with Interval-Interval and Timestamp-Timestamp intervals. Also, evalute_array function (where array vs scalar operations go to in datetime.rs) had a limited evaluation ability. It has a similar behavior now with that feature .
What changes are included in this PR?
evaluate_array is extended to make timestamp vs timestamp, timestamp vs interval, and interval vs interval calculations, and dangerous unwraps are removed. min/max of different interval types can be found now.
evaluate_bounds
andpropagate_constraints
are implemented forDateTimeIntervalExpr
.Are these changes tested?
Yes, symmetric hash join tests are extended with temporal datatypes.
Are there any user-facing changes?