Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Database cachednode callback walk #20529

Merged
merged 4 commits into from
Jan 17, 2020
Merged

Conversation

holiman
Copy link
Contributor

@holiman holiman commented Jan 8, 2020

Replaces #20527 , this one does not contain all the other experimental fluff.

In the trie database, there are several operations that recursively walk children of nodes, and collect them into lists of hashes. This means that on the leaf level, all the upper levels hold N lists with hashes. So at root level, the list contains all leaf hashes, at level below, a list with 1/16th of the entries, and so on.

This PR changes it to walk the nodes with a callback instead. Needs to be benchmarked and tested IRL, of course.

before

goos: linux
goarch: amd64
pkg: github.com/ethereum/go-ethereum/trie
BenchmarkDerefRootFixedSize/10-6                   61821             19229 ns/op           14609 B/op         38 allocs/op
BenchmarkDerefRootFixedSize/100-6                  20331             58074 ns/op           67856 B/op        142 allocs/op
BenchmarkDerefRootFixedSize/1K-6                    1717            636877 ns/op          696097 B/op       1369 allocs/op
BenchmarkDerefRootFixedSize/10K-6                    145           8177885 ns/op         7116049 B/op      13908 allocs/op
BenchmarkDerefRootFixedSize/100K-6                    13          85116617 ns/op        70853392 B/op     138395 allocs/op

after

goos: linux
goarch: amd64
pkg: github.com/ethereum/go-ethereum/trie
BenchmarkDerefRootFixedSize/10-6                   76370             16321 ns/op             785 B/op         11 allocs/op
BenchmarkDerefRootFixedSize/100-6                  24958             47222 ns/op             784 B/op         11 allocs/op
BenchmarkDerefRootFixedSize/1K-6                    2838            419798 ns/op             784 B/op         11 allocs/op
BenchmarkDerefRootFixedSize/10K-6                    202           5425507 ns/op             784 B/op         11 allocs/op
BenchmarkDerefRootFixedSize/100K-6                    20          55343687 ns/op             784 B/op         11 allocs/op

@holiman holiman changed the title Reference fix Database cachednode callback walk Jan 8, 2020
@karalabe
Copy link
Member

karalabe commented Jan 9, 2020

It only gathers the direct children of a node, not recursively the entire subtrie.

@holiman
Copy link
Contributor Author

holiman commented Jan 9, 2020

It only gathers the direct children of a node, not recursively the entire subtrie.

You are correct. I suspect the problem is that we did a children := make([]common.Hash, 0, 16) at every junction, even though many don't even have children, let alone 16. So it causes a lot of tiny mallocs that were not even used.

https://github.com/ethereum/go-ethereum/pull/20529/files#diff-e49891341ce9d49a244d9ead8df3d072L186

@holiman
Copy link
Contributor Author

holiman commented Jan 9, 2020

Apparently this code doesn't work, though. It changes the semantics, as it exceutes the callback immediately, instead of collecting the children first, so it breaks things.

A bug was in there, fixed now hopefully

@holiman
Copy link
Contributor Author

holiman commented Jan 9, 2020

Giving this a shot on the benchmarkers, replacing the current run (https://geth-bench.ethdevops.io/d/Jpk-Be5Wk/dual-geth?orgId=1&var-exp=mon08&var-master=mon09&var-percentile=50&from=1578482122263&to=now ) without restarting them. The experimental (mon08) was at 5.021M when it was replaced.

@holiman
Copy link
Contributor Author

holiman commented Jan 9, 2020

Ah, immediate crash on the gpo thingy.

@holiman
Copy link
Contributor Author

holiman commented Jan 10, 2020

This actually does seem to make a difference. After running for a couple of hours, I checked the same section of blocks for both of these.
First is the mon09 (master) between 5.1M - 5.2M:

Screenshot_2020-01-10 Dual Geth - Grafana

Although absolute time-comparisons between machines is a bit flakey, we can see that commit averages 13.16ms, above account commit at 11.62ms.

For this pr, mon08, this is the corresponding chart:
Screenshot_2020-01-10 Dual Geth - Grafana(1)

In this chart, account commit is at 11.37ms (which is very close, even in absolute numbers, to 11.62ms), whereas commit has dropped to third place at 11.13ms.

TLDR; this PR should (if anything) improve commit, and fwict it does cut down the average commit time from ~13.16ms to ~11.13ms in the section of blocks I checked.

@karalabe karalabe added this to the 1.9.10 milestone Jan 17, 2020
Copy link
Member

@karalabe karalabe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@karalabe karalabe merged commit 9b09c0f into ethereum:master Jan 17, 2020
enriquefynn pushed a commit to enriquefynn/go-ethereum that referenced this pull request Mar 10, 2021
…ereum#20529)

* trie/tests: add benchmarks and update trie tests

* trie: update benchmark tests

* trie: utilize callbacks instead of amassing lists of hashes in database ref/unref

* trie: replace remaining non-callback based accesses
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants