Speed up HLL presentation by 100x #491

avibryant · 2015-09-09T23:18:38Z

This changes the use of math.pow(2.0,-mj) when computing the estimates in SparseHLL and DenseHLL to 1.0 / (1 << mj).

To benchmark the difference this makes, I first had to change the HLLPresentBenchmark to clone the HLL instances before asking for their approximateSize, otherwise the estimate gets cached as a lazy val and the benchmarks are meaningless.

Having done that, you can see a dramatic difference. Because I'm impatient I have so far only run this with a single set of parameters, where it led to a 40x speedup.

Before:

[info] Benchmark                            (bits)  (max)  (numHLL)   Mode  Cnt    Score   Error  Units
[info] HLLPresentBenchmark.timeBatchCreate      12  10000        10  thrpt  200  141.569 ± 5.010  ops/s

After:

[info] Benchmark                            (bits)  (max)  (numHLL)   Mode  Cnt     Score    Error  Units
[info] HLLPresentBenchmark.timeBatchCreate      12  10000        10  thrpt  200  5984.351 ± 60.770  ops/s

avibryant · 2015-09-10T00:06:47Z

This is failing on the HLL downsize test (which passes on my machine). Is this a known-flaky test, or is this maybe pointing to some actual issue?

ianoc · 2015-09-10T00:22:48Z

The two builds actually failed on two different tests, one of them was that, the min hasher one too. Both are alas flaky tests. Restarted both

ianoc · 2015-09-10T00:23:27Z

LGTM though, merge when green

jvns · 2015-09-10T01:10:51Z

I just tested with @kamalmarhubi using a lookup table as well as (1/ (1 << mj)), and the lookup table is twice as fast, which seems worthwhile to me. (val lut: Array[Double] = (0 until 256).map(x => 1.0 / (1 << x)).toArray). Shouldn't block this though!

jnievelt · 2015-09-10T01:30:48Z

This breaks when mj > 30. If we're not concerned about that, can we at least add an assertion so we fail instead of giving the wrong result?

scala> 1.0 / (1 << 31)
res4: Double = -4.6566128730773926E-10

scala> 1.0 / (1 << 32)
res5: Double = 1.0

avibryant · 2015-09-10T02:27:59Z

@jnievelt in principle it can be up to 124, though that's exceedingly unlikely. Still, @jvns's suggestion of a lookup table would take care of this and is faster anyway, so I'll revise to take that approach.

avibryant · 2015-09-10T02:44:43Z

Ok, yeah, with a lookup table (pushed) it's over 100x as fast:

[info] Benchmark                            (bits)  (max)  (numHLL)   Mode  Cnt      Score     Error  Units
[info] HLLPresentBenchmark.timeBatchCreate      12  10000        10  thrpt  200  14693.743 ± 206.868  ops/s

jnievelt · 2015-09-10T02:58:55Z

algebird-core/src/main/scala/com/twitter/algebird/HyperLogLog.scala

@@ -43,6 +43,8 @@ object HyperLogLog {
  /* Size of the hash in bits */
  val hashSize = 128

+  val powersOfNegativeTwo: Array[Double] = 0.until(hashSize).map{ i => math.pow(2.0, -i) }.toArray


to instead of until?

I think it's the rhoW that's taken here? On inspection the code seems a little not right (I could be missing something), but it seems like this is the count of the number of leading 0s, so it should include hashSize for the case that the hash is all 0s.

You're always setting aside at least 4 bits for the bucket, so I think the max is actually 124 - this should be slightly more than needed.

Sorry looks like we raced on actions there. Can you have hashSize itself rather than hashSize -1? the tests are green, so it seems unlikely this is wrong per say right? or do we have a test missing?

Cool. For the record the parts that look not right are in rhoW:

terminal condition is that contains returns false, but contains doesn't do any bounds checking; is it possible to overrun the array, however unlikely?

loop starts at 1 instead of 0. with even a small aggregation I imagine this doesn't matter, but isn't it technically incorrect?

Again I could be missing something, but if you think they might be worth addressing maybe lets file issues.

this should be negativePowersOfTwo?

Hm I'm confused as to what's being asked here.

@ianoc I'm quite certain we don't need more than hashSize-1
@jnievelt IIRC the definition of rhoW is "index of first non-zero", not "number of zeros" so that might explain starting at 1?
@kamalmarhubi argh, good point

Speed up HLL presentation by 100x

johnynek · 2015-09-10T18:44:35Z

If/when we clean this up to fix the name, can will also make the table private[this]?

kevosauce · 2015-09-10T21:59:00Z

algebird-core/src/main/scala/com/twitter/algebird/HyperLogLog.scala

@@ -447,7 +449,7 @@ case class DenseHLL(bits: Int, v: Bytes) extends HLL {
        count += 1
        res += 1.0
      } else {
-        res += java.lang.Math.pow(2.0, -mj)
+        res += HyperLogLog.powersOfNegativeTwo(mj)


should this be negativePowersOfTwo, not powersOfNegativeTwo? the current name implies we are doing (-2)^(mj) which is not the same as (2)^(-mj)

zg · 2015-09-11T12:46:03Z

Given the performance boost could you not replace all instances of Math.pow(2, X) with X << 1 across the entire code base?

avibryant · 2015-09-11T15:24:34Z

@johnynek it's used in the HLL subclasses but defined in the HyperLogLog object, so private[this] won't work. We could do private[algebird]?

ianoc · 2015-09-11T16:00:30Z

yeah, private[algebird] is good, once its classed as an implementation detail outside the project so we can change it later

jnievelt · 2015-09-11T16:43:59Z

@zg not, for two reasons:

math.pow allows non-integral exponents (which we sometimes use), which we can't use with 1 << X
per my comment above it would only work if we're sure that X <= 30. And per the results here, we'd probably be better off using a lookup in that case anyway

eigenvariable · 2016-02-04T00:50:43Z

Do we have an ETA on when a version of algebird with this optimization will be released?

edit: I'm told this was just released yesterday.

ianoc · 2016-02-04T00:51:47Z

@eigenvariable Latest algebird was released only 2 days go. Check out the releases section for more info.

avi-stripe added 2 commits September 9, 2015 16:13

change the HLLPresent benchmark to clone to avoid caching

4062ced

use shifts instead of pow to compute HLL estimates

9ea5f8e

use a lookup table to compute powers of negative two

8d406ea

avibryant changed the title ~~Speed up HLL presentation by 40x~~ Speed up HLL presentation by 100x Sep 10, 2015

jnievelt reviewed Sep 10, 2015
View reviewed changes

ianoc added a commit that referenced this pull request Sep 10, 2015

Merge pull request #491 from avibryant/avi-faster-hll

1387e6c

Speed up HLL presentation by 100x

ianoc merged commit 1387e6c into twitter:develop Sep 10, 2015

kevosauce reviewed Sep 10, 2015
View reviewed changes

avibryant mentioned this pull request Sep 11, 2015

fix name and visibility of negativePowersOfTwo #492

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Speed up HLL presentation by 100x #491

Speed up HLL presentation by 100x #491

avibryant commented Sep 9, 2015

avibryant commented Sep 10, 2015

ianoc commented Sep 10, 2015

ianoc commented Sep 10, 2015

jvns commented Sep 10, 2015

jnievelt commented Sep 10, 2015

avibryant commented Sep 10, 2015

avibryant commented Sep 10, 2015

jnievelt Sep 10, 2015

avibryant Sep 10, 2015

ianoc Sep 10, 2015

jnievelt Sep 10, 2015

kamalmarhubi Sep 10, 2015

avibryant Sep 10, 2015

johnynek commented Sep 10, 2015

kevosauce Sep 10, 2015

zg commented Sep 11, 2015

avibryant commented Sep 11, 2015

ianoc commented Sep 11, 2015

jnievelt commented Sep 11, 2015

eigenvariable commented Feb 4, 2016

ianoc commented Feb 4, 2016

Speed up HLL presentation by 100x #491

Speed up HLL presentation by 100x #491

Conversation

avibryant commented Sep 9, 2015

avibryant commented Sep 10, 2015

ianoc commented Sep 10, 2015

ianoc commented Sep 10, 2015

jvns commented Sep 10, 2015

jnievelt commented Sep 10, 2015

avibryant commented Sep 10, 2015

avibryant commented Sep 10, 2015

jnievelt Sep 10, 2015

Choose a reason for hiding this comment

avibryant Sep 10, 2015

Choose a reason for hiding this comment

ianoc Sep 10, 2015

Choose a reason for hiding this comment

jnievelt Sep 10, 2015

Choose a reason for hiding this comment

kamalmarhubi Sep 10, 2015

Choose a reason for hiding this comment

avibryant Sep 10, 2015

Choose a reason for hiding this comment

johnynek commented Sep 10, 2015

kevosauce Sep 10, 2015

Choose a reason for hiding this comment

zg commented Sep 11, 2015

avibryant commented Sep 11, 2015

ianoc commented Sep 11, 2015

jnievelt commented Sep 11, 2015

eigenvariable commented Feb 4, 2016

ianoc commented Feb 4, 2016