Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Proposal] Ultra fast modified Greenwald-khanna 💨 #26

Open
sitegui opened this issue Nov 19, 2019 · 1 comment
Open

[Proposal] Ultra fast modified Greenwald-khanna 💨 #26

sitegui opened this issue Nov 19, 2019 · 1 comment

Comments

@sitegui
Copy link
Contributor

sitegui commented Nov 19, 2019

Hello,

I've been cooking up an algorithm inspired by GK, but built for speed. It currently achieves 11.4 million elements per second in single-thread, a 45x speedup from the number cited on this repo's README. With 8 threads it goes up to 77.0M/s (see more results here)

Initially I though about releasing it as a separate crate, but my question for you maintainers/users of this lib is whether you believe this one more quantile implementation could be added to this repo? The upside is to reduce the fragmentation of quantile crates at the cost of bloating this one lib.

What do you think?

Note: in some close future, I'd want to write up the main rationale of my approach somewhere. But it boils down mainly to:

  1. micro-compressions: do not insert at every sample, discard them earlier instead of issuing a full compression with a regular interval as GK originally does
  2. B-tree structure instead of Vec: to speed up insertion on arbitrary positions

The memory bound is 5 / epsilon samples.

Best regards,

@blt
Copy link
Collaborator

blt commented Nov 20, 2019 via email

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants