-
-
Notifications
You must be signed in to change notification settings - Fork 2.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: New quantile interpolation method & QUANTILE_DISC function in SQL #19139
feat: New quantile interpolation method & QUANTILE_DISC function in SQL #19139
Conversation
cc @alexander-beedie, let's continue here |
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #19139 +/- ##
==========================================
+ Coverage 79.99% 80.01% +0.01%
==========================================
Files 1527 1527
Lines 209203 209138 -65
Branches 2415 2415
==========================================
- Hits 167352 167334 -18
+ Misses 41303 41256 -47
Partials 548 548 ☔ View full report in Codecov by Sentry. |
Alright, correct me if I'm wrong, but (modulo implementation bugs), I believe our current interpolation methods all share the same properties:
In this context our interpolation methods can be visualized as such, as an example for This PR proposes to add another interpolation method which breaks the above convention. It does not distribute the elements among the number line [0, 1] starting at 0 and ending at 1, instead it chops up the number line [0, 1] into equal parts, assigning one value to each part in order: Not shown is what to do about the boundaries, my preference would be left-closed if we had to choose. Is my understanding correct? If so I have two issues with it:
|
Yes, you're exactly right. I agree that
Renaming to "Method" seems good enough (current discrete options like |
As long as it's just on the Rust side for now we don't have to worry about the renaming of |
@orlp I've renamed the |
Is anything else needed here from me? |
@pomo-mondreganto Looks good to me, @alexander-beedie can you review the SQL bits? Then we can merge this. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me too; the SQL integration is sound, and the results tie out with other implementations. Very nice addition👌
This is a followup of #18047. As suggested in the comment there, the current quantile interpolation methods are lacking, so this PR adds a new interpolation method that works as described here and uses the new method in
QUANTILE_DISC
SQL function. Conformance tests with DuckDB are included.