Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: add round function #322

Merged
merged 8 commits into from
Feb 9, 2023
Merged

Conversation

vibhatha
Copy link
Contributor

@vibhatha vibhatha commented Sep 8, 2022

This PR adds round function.

@vibhatha vibhatha marked this pull request as ready for review September 8, 2022 14:59
@jvanstraten jvanstraten changed the title feat: Adding round function feat: add round function Sep 8, 2022
@jvanstraten
Copy link
Contributor

Where did this come from? Based on #307 I thought we weren't going to do rounding to multiple. Rounding to some number of digits is exactly the same thing, just restricted to powers of 10.

That rounding argument is also really weird, there's no such thing as an optional value argument and using an integer for this is really weird. If we are going to do a round-to-multiple I feel like we should just reuse the existing rounding mode option to determine how to round.

Based on #307 I would expect simply round(f32) -> f32 and trunc(f32) -> f32 (and likewise for f64).

@ianmcook
Copy link
Contributor

ianmcook commented Sep 8, 2022

I think we will want to implement rounding and tie-breaking options similar to the ones in Acero's RoundMode.

@vibhatha
Copy link
Contributor Author

vibhatha commented Sep 8, 2022

Where did this come from? Based on #307 I thought we weren't going to do rounding to multiple. Rounding to some number of digits is exactly the same thing, just restricted to powers of 10.

That rounding argument is also really weird, there's no such thing as an optional value argument and using an integer for this is really weird. If we are going to do a round-to-multiple I feel like we should just reuse the existing rounding mode option to determine how to round.

Based on #307 I would expect simply round(f32) -> f32 and trunc(f32) -> f32 (and likewise for f64).

Referring to: https://github.com/substrait-io/substrait/pull/230/files#diff-b6317d19776a0381379a5980e1be0ffcde22082a034b891cb7a072fdbab32327

@vibhatha
Copy link
Contributor Author

vibhatha commented Sep 8, 2022

@jvanstraten I was referring to the SQL definition here.

I see your point on function argument. In Duckdb there is no such parameter. May be I will remove that option and also add the rounding options here too.

WDYT? cc @ianmcook

@ianmcook
Copy link
Contributor

ianmcook commented Sep 8, 2022

IIUC, I think we can just implement a rounding mode option, and make TRUNCATE one of the available modes, alongside the others like DOWN, UP, TOWARDS_ZERO, etc. Does that sound right to you @jvanstraten ?

@jvanstraten
Copy link
Contributor

I figured we could just reuse the floating point rounding mode. I also generalized it to integers for quantile earlier. Wouldn't TRUNCATE and TOWARDS_ZERO be the same thing, anyway?

Note that dedicated ceil and floor functions already exist. I guess we should get rid of those if we go the option route.

@ianmcook
Copy link
Contributor

ianmcook commented Sep 8, 2022

Ah, yes, I think that all makes sense.

Since a lot of thought went into ensuring that Acero's RoundMode can represent the behavior of any other mainstream system's round function, I'll list out the options it provides and map them onto the options used in the Substrait functions that take a rounding argument. For more details about these Acero RoundModes, see the the round modes table here. To see an example showing the behavior of these different Acero RoundModes, look at this test.

  • DOWN - corresponds to Substrait's FLOOR
  • UP - corresponds to Substrait's CEILING
  • TOWARDS_ZERO - corresponds to Substrait's TRUNCATE
  • TOWARDS_INFINITY - no corresponding option in Substrait
  • HALF_DOWN - no corresponding option in Substrait
  • HALF_UP - no corresponding option in Substrait
  • HALF_TOWARDS_ZERO - no corresponding option in Substrait
  • HALF_TOWARDS_INFINITY - corresponds to Substrait's TIE_AWAY_FROM_ZERO
  • HALF_TO_EVEN - corresponds to Substrait's TIE_TO_EVEN
  • HALF_TO_ODD - no corresponding option in Substrait

So this raises some questions:

  • Should we add a Substrait option AWAY_FROM_ZERO? This corresponds to Acero's TOWARDS_INFINITY .
  • Should we add a Substrait option TIE_TOWARDS_NEGATIVE_INFINITY (or maybe TIE_DOWN to keep it shorter)? This corresponds to Acero's HALF_DOWN.
  • Should we add a Substrait option TIE_TOWARDS_POSITIVE_INFINITY (or maybe TIE_UP to keep it shorter)? This corresponds to Acero's HALF_UP.
  • Should we add a Substrait option TIE_TOWARDS_ZERO? This corresponds to Acero's HALF_TOWARDS_ZERO.
  • Should we add a Substrait option TIE_TO_ODD? This corresponds to Acero's HALF_TO_ODD.

@jvanstraten
Copy link
Contributor

Hm. I guess that makes sense for round, and adding more variants is cheap. I don't think it makes sense to do it for all floating point functions though, since IEEE is pretty explicit about what the possible rounding modes are.

@vibhatha
Copy link
Contributor Author

cc @jvanstraten I added the round options. Does it reflect what you expected?

@vibhatha vibhatha closed this Sep 19, 2022
@vibhatha vibhatha reopened this Sep 19, 2022
@jvanstraten
Copy link
Contributor

No, because you didn't add the rounding modes @ianmcook was talking about, and the supposedly-optional function argument still exists.

@vibhatha
Copy link
Contributor Author

@jvanstraten I added other options. Should we create documentation for this too? In this PR itself or a separate PR?

@jvanstraten
Copy link
Contributor

It should just be in the description of the function in this case; this option argument is basically specific to round. The rounding mode should probably be a required argument, too, since it makes little sense for a producer to allow a consumer to round however it wants for a function that is specifically about rounding (unlike the usual floating point rounding mode).

@vibhatha
Copy link
Contributor Author

@jvanstraten I had the doubt though. I will update it.

@CLAassistant
Copy link

CLAassistant commented Oct 6, 2022

CLA assistant check
All committers have signed the CLA.

@CLAassistant
Copy link

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.
You have signed the CLA already but the status is still pending? Let us recheck it.

Copy link
Member

@westonpace westonpace left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@vibhatha when you get a moment can you update this to the new format for optional arguments?

@vibhatha
Copy link
Contributor Author

vibhatha commented Dec 2, 2022

@vibhatha when you get a moment can you update this to the new format for optional arguments?

sure @westonpace, thanks for the ping. I will update it.

@vibhatha vibhatha force-pushed the round_func branch 2 times, most recently from aa17207 to e2c9f4d Compare December 13, 2022 03:59
@vibhatha vibhatha requested a review from westonpace December 13, 2022 03:59
@westonpace
Copy link
Member

@vibhatha can you update this when you get a chance?

@vibhatha
Copy link
Contributor Author

@westonpace will update it tomorrow. Sorry about the delay.

@vibhatha
Copy link
Contributor Author

vibhatha commented Feb 7, 2023

@westonpace sorry about the long wait. I updated the PR, although I see branch is out-of-date with base branch. I have rebased and did --force push.

Copy link
Member

@westonpace westonpace left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the update. It still says out of date but a rebase should have fixed that. Did you make sure you had updated your local main branch before rebasing?

Comment on lines 56 to 58
to round it. For floating point numbers, it specifies the IEEE
754 rounding mode (as it does for all other floating point
operations). For integer types:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You removed this test below (in the fp64/i32 kernel). Can you also remove it here (in the fp32/i32 kernel)

extensions/functions_rounding.yaml Show resolved Hide resolved
@vibhatha
Copy link
Contributor Author

vibhatha commented Feb 7, 2023

Thanks for the update. It still says out of date but a rebase should have fixed that. Did you make sure you had updated your local main branch before rebasing?

main and round_func branches are rebased with the upstream/main. Not sure what is going on here?

@vibhatha
Copy link
Contributor Author

vibhatha commented Feb 8, 2023

@westonpace I think now the branch is in sync with the base branch.

Copy link
Member

@westonpace westonpace left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for adding integer kernels. I think the description will need to be slightly different for these kernels.

extensions/functions_rounding.yaml Outdated Show resolved Hide resolved
extensions/functions_rounding.yaml Outdated Show resolved Hide resolved
Comment on lines +58 to +70
- TIE_TO_EVEN: round to nearest value; if exactly halfway, tie
to the even option.
- TIE_AWAY_FROM_ZERO: round to nearest value; if exactly
halfway, tie away from zero.
- TRUNCATE: always round toward zero.
- CEILING: always round toward positive infinity.
- FLOOR: always round toward negative infinity.
- AWAY_FROM_ZERO: round negative values with FLOOR rule, round positive values with CEILING rule
- TIE_DOWN: round ties with FLOOR rule
- TIE_UP: round ties with CEILING rule
- TIE_TOWARDS_ZERO: round ties with TRUNCATE rule
- TIE_TO_ODD: round to nearest value; if exactly halfway, tie
to the odd option.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does TIE_TO_EVEN and TIE_TO_ODD mean for negative values of s?

CC @ianmcook

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is not quite clear from this definition either: https://dev.mysql.com/doc/refman/5.7/en/mathematical-functions.html#function_round
🤔

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok. I did some investigation. As an example, consider 15 when s=-1. The two choices are 10 and 20 which are both even. However, when we drop the trailing 0's so the choices are 1 and 2 and 2 is the even one so we get 20.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we need to clarify. I don't think we can easily do so without writing a novel.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I played with Mysql a bit, but with the round function, can we specify such options?

Copy link
Member

@westonpace westonpace left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alright, I think this looks good. @EpsilonPrime did you want to take another look?

@vibhatha
Copy link
Contributor Author

vibhatha commented Feb 9, 2023

@westonpace a CI is failing and this happened a few times. Seems like unrelated, though not very sure about it.

Copy link
Member

@EpsilonPrime EpsilonPrime left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@westonpace
Copy link
Member

@westonpace a CI is failing and this happened a few times. Seems like unrelated, though not very sure about it.

Unrelated to this PR. I created #439 which should address it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants