Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Generalize algorithm links per language in README #14

Open
scotts opened this issue Jul 6, 2020 · 9 comments
Open

Generalize algorithm links per language in README #14

scotts opened this issue Jul 6, 2020 · 9 comments
Labels
documentation Improvements or additions to documentation

Comments

@scotts
Copy link
Collaborator

scotts commented Jul 6, 2020

In the current README, when we mention a particular algorithm, we link to its C++ implementation. This does not scale as we add implementations in new languages. We should come up with a format that makes it easier to easily point readers to implementations in a language they are interested in.

@scotts scotts added the documentation Improvements or additions to documentation label Jul 6, 2020
@segeljakt
Copy link
Contributor

segeljakt commented Jul 7, 2020

Maybe a capability matrix like https://beam.apache.org/documentation/runners/capability-matrix/ could be useful. I'm not sure how easy it would be to write it in markdown though.

I think it would also be very interesting to have a benchmark between the implementations.

@scotts
Copy link
Collaborator Author

scotts commented Jul 10, 2020

I think a capability matrix is a useful way to categorize the algorithms, as we have several dimensions we care about (operations supported; in-order required; algorithmic analysis; space required; implementations). I'm worried, thought, that is a lot to parse up-front. It might be easier to maintain and read it in a more prose-like style rather than in one large table. Maybe something like (these are all bogus URLs):

DABA

full name: De-Amortized Banker's Aggregator
ordering: FIFO
operator requirements: associativity
time complexity: worst-case O(1)
space requirements: 2n
first appeared: Low-Latency Sliding-Window Aggregation in Worst-Case Constant Time
implementions: C++, Rust

FiBA

full name: Finger B-Tree Aggregator
ordering: out-of-order allowed, requires timestamps
operator requirements: associativity
time complexity: worst-case O(log n d) where d is distance from being in-order; reduces to worst-case O(log n) when d=0; and average-case O(log d) and reduces to average-case O(1) when d=0
space requirements: O(n)
first appeared: Optimal and General Out-of-Order Sliding-Window Aggregation
implementions: C++, Rust, Java

@scotts
Copy link
Collaborator Author

scotts commented Jul 10, 2020

Also, I think what to do about benchmarks warrants a new issue.

scotts pushed a commit to scotts/sliding-window-aggregators that referenced this issue Aug 10, 2020
scotts pushed a commit to scotts/sliding-window-aggregators that referenced this issue Aug 10, 2020
scotts pushed a commit to scotts/sliding-window-aggregators that referenced this issue Aug 10, 2020
scotts pushed a commit to scotts/sliding-window-aggregators that referenced this issue Aug 10, 2020
scotts pushed a commit to scotts/sliding-window-aggregators that referenced this issue Aug 10, 2020
scotts added a commit that referenced this issue Aug 10, 2020
Expand and create systematic listing of algorithms #14
@hirzel
Copy link
Member

hirzel commented Aug 11, 2020

This looks great so far! A few tweaks:

  • Instead of citing Jon Skeet on Stack Overflow we should cite adamax on Stack Overflow. The former only presents a LIFO stack, whereas the latter presents a FIFO queue.

  • Instead of saying FiBA requires space n, I would say it requires space O(n). The exact constant factor depends on the arity.

  • For SOE, we can say the time is worst-case O(1). This is of course predicated on the combine function being O(1), but we also assume that elsewhere. Furthermore, it is predicated on using a chunked-array deque as the underlying backing store.

@hirzel
Copy link
Member

hirzel commented Aug 11, 2020

One more thing I just noticed. We are saying "out-of-order allowed" for Recalc and for SOE. However, I think we currently only provide FIFO implementations for those two algorithms.

@scotts
Copy link
Collaborator Author

scotts commented Aug 11, 2020

However, I think we currently only provide FIFO implementations for those two algorithms.

I thought about that, too. I think maybe we should make the qualification when we link to the implementation and keep the SWAG itself general.

@ktangwon
Copy link
Collaborator

This is looking good. Right now, the README file uses the terms average-case and worst-case to describe algorithms' running time. However, the term "amortized" (e.g., amortized O(log n) and amortized O(1)) would be a more precise characterization of the notion of averaging we use.

scotts pushed a commit to scotts/sliding-window-aggregators that referenced this issue Aug 13, 2020
scotts pushed a commit to scotts/sliding-window-aggregators that referenced this issue Aug 13, 2020
scotts pushed a commit to scotts/sliding-window-aggregators that referenced this issue Aug 13, 2020
scotts pushed a commit to scotts/sliding-window-aggregators that referenced this issue Aug 13, 2020
scotts pushed a commit to scotts/sliding-window-aggregators that referenced this issue Aug 13, 2020
@scotts
Copy link
Collaborator Author

scotts commented Aug 13, 2020

@ktangwon, what did we say the space requirements are for IOA?

@ktangwon
Copy link
Collaborator

@ktangwon, what did we say the space requirements are for IOA?

IOA keeps 2n partial aggregates (same as DABA) and additionally keeps a few extra pointers (between nodes and for suspended computations). It is hard to pin down the exact constant, but we can safely say IOA requires O(n) space.

scotts pushed a commit to scotts/sliding-window-aggregators that referenced this issue Aug 14, 2020
scotts added a commit that referenced this issue Aug 14, 2020
Add IOA's space requirements #14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
documentation Improvements or additions to documentation
Projects
None yet
Development

No branches or pull requests

4 participants