
optimize Generated{Abstract,Product}Algebra with benchmarking #591

Merged
merged 3 commits into from
Dec 2, 2016

Conversation

sritchie
Collaborator

@sritchie sritchie commented Dec 1, 2016

This PR

  • Adds a benchmark for Tuple4 (seemed like a nice one to pick :)
  • Corrects some bad performance in the generated sumOption implementations that this benchmark uncovered.

Run the benchmark with:

sbt algebird-benchmark/jmh:run -t1 -f 1 -wi 5 -i 15 .*Tuple4Benchmark.*

Benchmark Results

Before the changes:

[info] Benchmark                             (numElements)   Mode  Cnt     Score     Error  Units
[info] Tuple4Benchmark.timeProductPlus               10000  thrpt   15  1870.581 ± 149.159  ops/s
[info] Tuple4Benchmark.timeProductSumOption          10000  thrpt   15   492.128 ±  29.016  ops/s
[info] Tuple4Benchmark.timeTuplePlus                 10000  thrpt   15  3911.323 ± 106.611  ops/s
[info] Tuple4Benchmark.timeTupleSumOption            10000  thrpt   15  2682.713 ±  39.312  ops/s

After the changes:

[info] Benchmark                             (numElements)   Mode  Cnt     Score     Error  Units
[info] Tuple4Benchmark.timeProductPlus               10000  thrpt   15  1934.965 ±  50.165  ops/s
[info] Tuple4Benchmark.timeProductSumOption          10000  thrpt   15  2896.808 ±  71.128  ops/s
[info] Tuple4Benchmark.timeTuplePlus                 10000  thrpt   15  3893.420 ± 139.511  ops/s
[info] Tuple4Benchmark.timeTupleSumOption            10000  thrpt   15  4073.248 ±  60.992  ops/s
[success] Total time: 88 s, completed Dec 1, 2016 12:48:43 PM

So, ~6x improvement for the ProductN semigroups and ~1.5x improvement for the TupleN semigroups. And, more importantly, they're all more efficient than just using plus.

Note that the benchmark only adds Longs, so it measures the improvement of sumOption on tuples whose elements don't have an efficient sumOption implementation of their own. When they do, these sumOption implementations will be that much faster.
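To make the optimization concrete, here is a minimal, self-contained sketch of the idea (not Algebird's actual generated code — `Semigroup`, `LongSemigroup`, and `Tuple2Semigroup` below are simplified stand-ins, and the real implementation flushes through a fixed-size buffer rather than collecting everything): instead of calling plus pairwise per element, split each tuple component into its own buffer and call the component semigroup's sumOption once per slot.

```scala
import scala.collection.mutable.ArrayBuffer

// Simplified stand-in for Algebird's Semigroup.
trait Semigroup[T] {
  def plus(l: T, r: T): T
  // Default: fold with plus. Component semigroups may override this
  // with something faster, which the tuple version then inherits.
  def sumOption(ts: Iterable[T]): Option[T] = ts.reduceOption(plus)
}

object LongSemigroup extends Semigroup[Long] {
  def plus(l: Long, r: Long): Long = l + r
}

class Tuple2Semigroup[A, B](sa: Semigroup[A], sb: Semigroup[B])
    extends Semigroup[(A, B)] {
  def plus(l: (A, B), r: (A, B)): (A, B) =
    (sa.plus(l._1, r._1), sb.plus(l._2, r._2))

  // One pass to split the components, then one sumOption call per slot,
  // instead of allocating a fresh tuple for every pairwise plus.
  override def sumOption(ts: Iterable[(A, B)]): Option[(A, B)] = {
    val as = ArrayBuffer.empty[A]
    val bs = ArrayBuffer.empty[B]
    ts.foreach { case (a, b) => as += a; bs += b }
    for (a <- sa.sumOption(as); b <- sb.sumOption(bs)) yield (a, b)
  }
}
```

For example, `new Tuple2Semigroup(LongSemigroup, LongSemigroup).sumOption(List((1L, 2L), (3L, 4L)))` returns `Some((4L, 6L))`.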

@@ -166,6 +166,7 @@ lazy val algebird = Project(
base = file("."),
settings = sharedSettings)
.settings(noPublishSettings)
.settings(coverageExcludedPackages := "<empty>;.*\\.benchmark\\..*")
Collaborator Author

gotta increase the coverage!

@sritchie sritchie force-pushed the sritchie/tuple_performance branch from 1122d43 to 00a457d on December 1, 2016, 20:02
@codecov-io
Copy link

codecov-io commented Dec 1, 2016

Current coverage is 81.03% (diff: 100%)

Merging #591 into develop will increase coverage by 6.85%

@@            develop       #591   diff @@
==========================================
  Files           122        113     -9   
  Lines          4694       4903   +209   
  Methods        4268       4517   +249   
  Messages          0          0          
  Branches        387        385     -2   
==========================================
+ Hits           3482       3973   +491   
+ Misses         1212        930   -282   
  Partials          0          0          

Powered by Codecov. Last update 87cbf25...83cb518

@@ -216,6 +216,7 @@ lazy val algebirdTest = module("test").settings(
).dependsOn(algebirdCore)

lazy val algebirdBenchmark = module("benchmark").settings(JmhPlugin.projectSettings:_*).settings(
coverageExcludedPackages := "com\\.twitter\\.algebird\\.benchmark.*",
Contributor
this is hurting our coverage!

* the `sumOption` implementation of the supplied Semigroup[T]
*/
def fromSumOption[T](size: Int)(implicit sg: Semigroup[T]) =
new ArrayBufferedOperation[T, T](1000) with BufferedReduce[T] {

aren't you ignoring size?

* Returns an ArrayBufferedOperation instance that internally uses
* the `sumOption` implementation of the supplied Semigroup[T]
*/
def fromSumOption[T](size: Int)(implicit sg: Semigroup[T]) =

can we put a narrow return type on this? Maybe BufferedReduce[T]?
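For context, here is a sketch of what the helper looks like with both review points addressed: `size` passed through instead of the hardcoded 1000, and an explicit (narrower) return type. `ArrayBufferedOperation` and `BufferedReduce` below are simplified stand-ins for the Algebird classes under review, not their real definitions.

```scala
import scala.collection.mutable.ArrayBuffer

trait Semigroup[T] {
  def plus(l: T, r: T): T
  def sumOption(ts: Iterable[T]): Option[T] = ts.reduceOption(plus)
}

// Simplified stand-ins for Algebird's buffered-operation machinery.
abstract class ArrayBufferedOperation[I, O](size: Int) {
  protected val buffer = ArrayBuffer.empty[I]
  def operate(items: Seq[I]): O
  // Buffer until `size` items accumulate, then flush in one shot.
  def put(i: I): Option[O] = {
    buffer += i
    if (buffer.size >= size) Some(flush()) else None
  }
  def flush(): O = {
    val out = operate(buffer.toList)
    buffer.clear()
    out
  }
}
trait BufferedReduce[V] // a marker trait in this sketch

object BufferedOps {
  // Both fixes applied: `size` is respected (no hardcoded 1000), and
  // the return type is stated explicitly rather than inferred from
  // the anonymous class.
  def fromSumOption[T](size: Int)(implicit sg: Semigroup[T]): ArrayBufferedOperation[T, T] with BufferedReduce[T] =
    new ArrayBufferedOperation[T, T](size) with BufferedReduce[T] {
      def operate(items: Seq[T]): T = sg.sumOption(items).get
    }
}
```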

@sritchie
Collaborator Author

sritchie commented Dec 1, 2016

@johnynek ah, nice catches. Was letting the coffee get to me. All fixed up.

@isnotinvain
Contributor

LGTM

Any idea why ignoring size didn't trigger a test failure? Is that just a size hint for performance?

Does this now always use sumOption, even when plus would normally be called? I wonder if any plus(x, y) implementations are more efficient than sumOption(Seq(x, y)) when only adding 2 items together (e.g., the map monoid's sumOption building a whole mutable hash map).

@sritchie-stripe
Contributor

@isnotinvain, my mistake was that I left a value of 1000 hardcoded, instead of respecting the size parameter. The Tuple and Record sumOption implementations also pass a value of 1000, so there was no difference.

All this PR does is add a more efficient implementation for anything that explicitly calls sumOption - the plus logic still works exactly like it did before. In fact, the benchmark calls out the difference between using plus explicitly vs sumOption for larger collections.

If you call sumOption on a collection of two things, it's true that it might be more efficient to just add them together with plus - sumOption operates on TraversableOnce, though, so it can't know that information. It's up to the caller to know the difference.
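A tiny illustration of that trade-off (`Semigroup` is the same simplified stand-in as above, and `sumTwo`/`sumAll` are hypothetical helpers, not Algebird APIs): a caller that statically knows it has exactly two values can call plus directly and skip any buffer setup, while a caller holding a collection of unknown length should prefer sumOption.

```scala
trait Semigroup[T] {
  def plus(l: T, r: T): T
  def sumOption(ts: Iterable[T]): Option[T] = ts.reduceOption(plus)
}

object LongSemigroup extends Semigroup[Long] {
  def plus(l: Long, r: Long): Long = l + r
}

object Sums {
  // Hypothetical helper: with exactly two values, plus avoids any
  // intermediate collection or buffer that sumOption might allocate.
  def sumTwo[T](x: T, y: T)(implicit sg: Semigroup[T]): T = sg.plus(x, y)

  // With an arbitrary-length collection, sumOption is the right call.
  def sumAll[T](ts: Iterable[T])(implicit sg: Semigroup[T]): Option[T] =
    sg.sumOption(ts)
}
```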

Thanks for the LGTM! Tests timed out, so waiting for the build to complete again and then I'll merge and continue with the averaged value implementation.

Contributor

@isnotinvain isnotinvain left a comment

thanks for explaining!

@isnotinvain isnotinvain merged commit 0ff8bfe into develop Dec 2, 2016
@johnynek
Collaborator

johnynek commented Dec 2, 2016

so, I'm worried about this.

On a deeply nested tuple, are we exponentially growing buffers? Previously we were not (I don't think); you had a linear number of buffers. Can we stop and document why this is not exponential here?

@isnotinvain
Contributor

Sorry, looks like I pulled the trigger too soon. Did this go into the release?
Do we need to back it out :( ?

@sritchie-stripe
Contributor

hey, sorry @isnotinvain for not updating the ticket! @johnynek and I talked offline when releasing and decided that this is NOT an issue. Buffers are going to grow linearly with the depth if you have nested tuples, which is fine.

If I have a Tuple2 of Tuple2s, the outer Tuple2 will create 2 buffers, one for each inner tuple; each inner tuple will then create 2 buffers as well. I think this is totally fine, and how it has to work. Does that sound right?

Glad we thought this through!
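One way to see the linear bound (a hypothetical illustration, not code from this PR): each non-tuple leaf field gets exactly one buffer, so the total buffer count equals the flattened field count regardless of nesting depth.

```scala
object Buffers {
  // Count the leaf buffers a nested-tuple semigroup would allocate:
  // one per non-tuple field. For ((a, b), (c, d)) that's 4, equal to
  // the flattened field count — linear in the size, not exponential.
  def leafBufferCount(shape: Any): Int = shape match {
    case (a, b)    => leafBufferCount(a) + leafBufferCount(b)
    case (a, b, c) => leafBufferCount(a) + leafBufferCount(b) + leafBufferCount(c)
    case _         => 1
  }
}
```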

@isnotinvain
Contributor

Yeah SGTM!
