Andrewnguyen/caliper integrate Caliper, n #248

Merged
merged 10 commits into from
Jan 6, 2014

Conversation

andrewnguyen
Contributor

This is a fairly big PR:

  • Integrates Caliper for benchmarking.
  • Adds a second, experimental version of HyperLogLogMonoid that does the jRhoW calculation more efficiently (though perhaps less idiomatically).
  • Adds tests confirming that the original HyperLogLogMonoid and the new version produce the same HLL structure via the apply method (perhaps we should be more thorough).
  • Adds benchmarks comparing the run time of batchCreate, parameterized on the size of the input and on the number of bits used.
  • To run the benchmarks, use 'sbt algebird-caliper/run'.
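For anyone unfamiliar with the jRhoW step mentioned above: in HyperLogLog, j is the register index taken from the first `bits` bits of the hash, and rhoW is the 1-based position of the first set bit in the remainder. Below is a minimal sketch of computing both directly from the hash bytes. It is not the code in this PR: the names `JRhoWSketch` and `bitAt` are hypothetical, and the bit ordering (bit 0 is the most significant bit of the first byte, and bit i of the prefix contributes 2^i to j) is an assumption made for illustration.

```scala
object JRhoWSketch {
  /** True if bit `pos` is set, counting from the most significant bit of bytes(0). (Assumed ordering.) */
  def bitAt(bytes: Array[Byte], pos: Int): Boolean =
    ((bytes(pos / 8) >> (7 - (pos % 8))) & 1) == 1

  /** Register index: the first `bits` bits of the hash, with bit i weighted 2^i. */
  def j(bytes: Array[Byte], bits: Int): Int = {
    var acc = 0
    var i = 0
    while (i < bits) {
      if (bitAt(bytes, i)) acc |= (1 << i)
      i += 1
    }
    acc
  }

  /** 1-based position of the first set bit at or after index `bits`.
    * If no bit is set, this returns (remaining bit count + 1). */
  def rhoW(bytes: Array[Byte], bits: Int): Byte = {
    val total = bytes.length * 8
    var pos = bits
    while (pos < total && !bitAt(bytes, pos)) pos += 1
    (pos - bits + 1).toByte
  }
}
```

Working on the raw bytes with plain loops avoids allocating an intermediate set of bit positions, which is one plausible reason a version like this could be faster.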

Example run:

[info]         benchmark bits max        us linear runtime
[info]        WithBitSet    5  10    208.54 =
[info]        WithBitSet    5  20    436.65 =
[info]        WithBitSet    5  30    651.97 =
[info]        WithBitSet   10  10    266.32 =
[info]        WithBitSet   10  20    553.00 =
[info]        WithBitSet   10  30    779.33 =
[info]        WithBitSet   17  10   1751.38 =
[info]        WithBitSet   17  20   3531.91 =
[info]        WithBitSet   17  30   4993.15 =
[info]        WithBitSet   25  10 202083.50 ======
[info]        WithBitSet   25  20 595929.50 ====================
[info]        WithBitSet   25  30 893279.00 ==============================
[info] WithBitSetWrapper    5  10      6.86 =
[info] WithBitSetWrapper    5  20     12.87 =
[info] WithBitSetWrapper    5  30     17.77 =
[info] WithBitSetWrapper   10  10      5.06 =
[info] WithBitSetWrapper   10  20      8.04 =
[info] WithBitSetWrapper   10  30     12.55 =
[info] WithBitSetWrapper   17  10      3.96 =
[info] WithBitSetWrapper   17  20      8.16 =
[info] WithBitSetWrapper   17  30     12.93 =
[info] WithBitSetWrapper   25  10      4.23 =
[info] WithBitSetWrapper   25  20      8.40 =
[info] WithBitSetWrapper   25  30     13.56 =

NOTE: I took a few bits of code from https://github.com/sirthias/scala-benchmarking-template. We should hold off on merging until the licensing situation is clear. Also, it would be nice if someone could double-check that I have all the necessary attributions.

@@ -385,6 +386,72 @@ class HyperLogLogMonoid(val bits : Int) extends Monoid[HLL] {
}
}

class HyperLogLogMonoid2(bits: Int) extends HyperLogLogMonoid(bits) {

Scaladoc public classes and methods? Especially "j"...

Contributor Author

"j" shouldn't really be public since it's pretty much internal to the workings of HyperLogLogs but sure.

@@ -56,6 +73,49 @@ object HyperLogLog {

def twopow(i : Int) : Double = scala.math.pow(2.0, i)

/** the value 'j' is equal to <w_0, w_1 ... w_(bits-1)> */
Collaborator

Add a TODO: we could read a byte at a time while bits >= 8, which would be faster.
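A rough sketch of what that TODO might look like, under stated assumptions: bit 0 is the most significant bit of the first byte, and bit i of the prefix contributes 2^i to j, so consuming a whole byte means bit-reversing it before shifting it into place. The names `ByteWiseJSketch`, `jBitwise`, and `jBytewise` are hypothetical and are not part of this PR; `jBitwise` is included only as a reference to compare against.

```scala
object ByteWiseJSketch {
  /** True if bit `pos` is set, counting from the most significant bit of bytes(0). (Assumed ordering.) */
  def bitAt(bytes: Array[Byte], pos: Int): Boolean =
    ((bytes(pos / 8) >> (7 - (pos % 8))) & 1) == 1

  /** Reference version: one bit per iteration, bit i weighted 2^i. */
  def jBitwise(bytes: Array[Byte], bits: Int): Int = {
    var acc = 0
    var i = 0
    while (i < bits) {
      if (bitAt(bytes, i)) acc |= (1 << i)
      i += 1
    }
    acc
  }

  /** Consumes whole bytes while at least 8 bits remain, then finishes bit by bit. */
  def jBytewise(bytes: Array[Byte], bits: Int): Int = {
    var acc = 0
    var pos = 0
    while (bits - pos >= 8) {
      // Reverse the 8 bits so the byte's most significant bit lands at weight 2^pos.
      val reversed = Integer.reverse(bytes(pos / 8) & 0xFF) >>> 24
      acc |= reversed << pos
      pos += 8
    }
    while (pos < bits) {
      if (bitAt(bytes, pos)) acc |= (1 << pos)
      pos += 1
    }
    acc
  }
}
```

Whether this is actually faster would need to be checked with the benchmarks in this PR; under a different bit-weighting convention the reversal step would not be needed.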

@johnynek
Collaborator

johnynek commented Jan 4, 2014

SUPER EXCITED! Nice work! Way to find a bottleneck and hammer it down!

johnynek added a commit that referenced this pull request Jan 6, 2014
@johnynek johnynek merged commit dab7720 into twitter:develop Jan 6, 2014

3 participants