Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Temporal datatype support for interval arithmetic #5971

Merged
merged 67 commits into from
Apr 13, 2023
Merged

Temporal datatype support for interval arithmetic #5971

merged 67 commits into from
Apr 13, 2023

Conversation

berkaysynnada
Copy link
Contributor

Which issue does this PR close?

Closes #5844.

Rationale for this change

Currently, interval arithmetic operations are not supported for temporal datatypes. It is especially necessary to be able to apply symmetric hash join over ordered timestamp columns, probably with some interval differences.

With this PR, we can make use of interval arithmetic with Interval-Interval and Timestamp-Timestamp intervals. Also, evalute_array function (where array vs scalar operations go to in datetime.rs) had a limited evaluation ability. It has a similar behavior now with that feature .

What changes are included in this PR?

evaluate_array is extended to make timestamp vs timestamp, timestamp vs interval, and interval vs interval calculations, and dangerous unwraps are removed. min/max of different interval types can be found now. evaluate_bounds and propagate_constraints are implemented for DateTimeIntervalExpr.

Are these changes tested?

Yes, symmetric hash join tests are extended with temporal datatypes.

Are there any user-facing changes?

@github-actions github-actions bot added core Core DataFusion crate physical-expr Physical Expressions sqllogictest SQL Logic Tests (.slt) labels Apr 12, 2023
Copy link
Contributor

@mustafasrepo mustafasrepo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR passed from our internal review process. It is LGTM!

@mustafasrepo
Copy link
Contributor

I plan to merge this PR, If there is no additional review from the community. if there is anyone want to review, please let me know.

Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @berkaysynnada and @mustafasrepo -- I didn't review this PR fully but I think the structure looks good to me.

My biggest concern in the entire exercise is that the arithmetic that is supported for ScalarValue and what is supported for Arrays (and what is supported upstream in arrow-rs) will diverge. However I think we are all agreed on the desired end state (kernels in arrow-rs and the same math / support for scalars in Datafusion)

let secs = ts_ms / 1000;
let nsecs = ((ts_ms % 1000) * 1_000_000) as u32;
do_date_time_math(secs, nsecs, scalar, sign).map(|dt| dt.timestamp_millis())
let secs = ts_ms.div_euclid(1000);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TIL div_euclid

@mustafasrepo mustafasrepo merged commit bd705fe into apache:main Apr 13, 2023
@berkaysynnada
Copy link
Contributor Author

Thank you @berkaysynnada and @mustafasrepo -- I didn't review this PR fully but I think the structure looks good to me.

My biggest concern in the entire exercise is that the arithmetic that is supported for ScalarValue and what is supported for Arrays (and what is supported upstream in arrow-rs) will diverge. However I think we are all agreed on the desired end state (kernels in arrow-rs and the same math / support for scalars in Datafusion)

Thank you for the review. Once you review and approve the structure, I will move the other arithmetic codes that are for Arrays, and finally, I believe we can unify all arithmetics in kernels_arrow.rs

@berkaysynnada berkaysynnada deleted the feature/sym-join-temporal-columns branch April 17, 2023 11:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
core Core DataFusion crate physical-expr Physical Expressions sqllogictest SQL Logic Tests (.slt)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Temporal datatype support for interval arithmetic
5 participants