Improve treeHasher performance #42
Conversation
cc @odeke-em
Codecov Report
@@ Coverage Diff @@
## master #42 +/- ##
=======================================
Coverage 86.05% 86.05%
=======================================
Files 6 6
Lines 466 466
=======================================
Hits 401 401
Misses 37 37
Partials 28 28
Continue to review full report at Codecov.
LGTM, thank you @cuonglm! Let's also see if we can shave off more by reusing the hashers.
Yeah, I think it can, but I want to keep the PR as small as possible.
Absolutely, I meant to write/add "later on".
Besides a minor nit, looks good to me 👍 thanks for this optimisation!
Force-pushed from da97295 to 6098408
By doing two things:
- Pre-allocate enough size for the slice, instead of initializing a small one and then continuously appending to it.
- Initialize the placeholder slice once, instead of creating a new one every time treeHasher.placeholder is called.

That helps improve the speed and reduces allocations for Update/Delete operations:

name                       old time/op    new time/op    delta
SparseMerkleTree_Update-8  11.9µs ± 3%    11.2µs ± 3%     -5.89%  (p=0.008 n=5+5)
SparseMerkleTree_Delete-8  9.66µs ± 2%    5.40µs ± 2%    -44.12%  (p=0.008 n=5+5)

name                       old alloc/op   new alloc/op   delta
SparseMerkleTree_Update-8  17.7kB ± 0%    16.1kB ± 1%     -9.42%  (p=0.016 n=4+5)
SparseMerkleTree_Delete-8  16.3kB ± 0%    13.9kB ± 0%    -14.82%  (p=0.008 n=5+5)

name                       old allocs/op  new allocs/op  delta
SparseMerkleTree_Update-8  117 ± 0%       63 ± 0%        -46.15%  (p=0.029 n=4+4)
SparseMerkleTree_Delete-8  94.4 ± 1%      28.6 ± 2%      -69.70%  (p=0.008 n=5+5)
Force-pushed from 6098408 to 2066960
nice!!
copy(value, leafPrefix)

value := make([]byte, 0, len(leafPrefix)+len(path)+len(leafData))
value = append(value, leafPrefix...)
copy should be faster than append.
@robert-zaremba It should be the same if we pre-allocate the slice, which we did here.
As far as I remember, copy is faster. Maybe something changed in recent Go releases.
As far as I remember, it will literally call the same code under the hood if the slice is pre-allocated. Append is just slightly more idiomatic, especially if it is chained / composed of multiple appends. I'm happy to verify this with a benchmark.
@robert-zaremba Here's a quick proof:
package main

import "testing"

var data = [512]int{}

func BenchmarkAppend(b *testing.B) {
	for i := 0; i < b.N; i++ {
		s := make([]int, 0, 512)
		s = append(s, data[:]...)
	}
}

func BenchmarkCopy(b *testing.B) {
	for i := 0; i < b.N; i++ {
		s := make([]int, 512)
		copy(s, data[:])
	}
}
The result:
goos: darwin
goarch: arm64
BenchmarkAppend-8 8270780 136.6 ns/op 0 B/op 0 allocs/op
BenchmarkCopy-8 8787560 136.6 ns/op 0 B/op 0 allocs/op
PASS
ok command-line-arguments 2.793s
The results can vary between runs, but only by a few nanoseconds.
very good. Thanks @cuonglm