
Christmas trie #20481

Merged · 16 commits · Feb 3, 2020

Conversation

@holiman (Contributor) commented Dec 23, 2019

Dedicated hasher/committer

The previous hasher.go was a generic implementation used by both Commit and Hash. It did a
couple of things:

  • Walk the un-hashed / dirty trie towards the leaves,
  • Hash the tree on the return path,
  • Create a new 'shadow' trie containing the new hashes, cached, which will replace the old root,
  • Create a new collapsed trie containing the hashing preimages, which is incrementally used to create higher-order hash values,
  • Call store, if a commit is being performed, to send nodes to the database and to the onleaf handler (for Commit).

This means that for every Hash or Commit, every processed node was duplicated twice, using copy. When I split this into two dedicated implementations, it was possible
to remove a lot of redundant allocs.

Commit

The Commit operation is basically read-only, with the exception of setting the dirty-flag to false.

  • This means we can skip the whole 'create a new shadow trie for the new root' step, and cut allocs in half.
  • We very seldom need to do any hashing in Commit (but sometimes still do), so we can skip RLP encoding most of the time.
  • The database insertion path needs the size of a node for tracking purposes, but that can be estimated instead
    of calculated strictly.
  • Sending items to onleaf and the db has been split off onto a separate goroutine; this PR sends those tasks via a channel (a generic sketch of the pattern follows the benchmark numbers below).
name                             old time/op    new time/op    delta
CommitAfterHash/no-onleaf-6        4.95µs ± 9%    3.54µs ±19%  -28.39%  (p=0.000 n=10+9)
CommitAfterHash/with-onleaf-6      6.07µs ±18%    4.61µs ±21%  -24.07%  (p=0.000 n=10+10)
CommitAfterHashFixedSize/10-6      65.7µs ± 3%    34.0µs ± 5%  -48.23%  (p=0.000 n=9+10)
CommitAfterHashFixedSize/100-6      277µs ± 2%     157µs ± 7%  -43.21%  (p=0.000 n=9+10)
CommitAfterHashFixedSize/1K-6      3.29ms ±14%    1.68ms ±12%  -49.05%  (p=0.000 n=10+9)
CommitAfterHashFixedSize/10K-6     45.1ms ± 8%    26.8ms ±13%  -40.55%  (p=0.000 n=10+10)
CommitAfterHashFixedSize/100K-6     505ms ±15%     343ms ± 9%  -32.21%  (p=0.000 n=10+10)

name                             old alloc/op   new alloc/op   delta
CommitAfterHash/no-onleaf-6        1.79kB ± 1%    1.47kB ± 4%  -17.77%  (p=0.000 n=9+10)
CommitAfterHash/with-onleaf-6      1.98kB ± 2%    1.75kB ± 2%  -11.53%  (p=0.000 n=9+10)
CommitAfterHashFixedSize/10-6      37.5kB ± 0%    30.8kB ± 0%  -17.76%  (p=0.000 n=9+10)
CommitAfterHashFixedSize/100-6      179kB ± 0%     148kB ± 0%  -17.35%  (p=0.000 n=10+9)
CommitAfterHashFixedSize/1K-6      1.83MB ± 0%    1.49MB ± 0%  -18.41%  (p=0.000 n=10+10)
CommitAfterHashFixedSize/10K-6     19.9MB ± 0%    16.4MB ± 0%  -17.71%  (p=0.000 n=10+10)
CommitAfterHashFixedSize/100K-6     192MB ± 0%     157MB ± 0%  -18.19%  (p=0.000 n=10+10)

name                             old allocs/op  new allocs/op  delta
CommitAfterHash/no-onleaf-6          11.0 ± 0%       9.0 ± 0%  -18.18%  (p=0.000 n=10+10)
CommitAfterHash/with-onleaf-6        15.0 ± 0%      14.0 ± 0%   -6.67%  (p=0.002 n=8+10)
CommitAfterHashFixedSize/10-6         240 ± 0%       188 ± 0%  -21.67%  (p=0.000 n=10+10)
CommitAfterHashFixedSize/100-6      1.16k ± 0%     0.90k ± 0%  -22.45%  (p=0.000 n=10+10)
CommitAfterHashFixedSize/1K-6       11.9k ± 0%      9.2k ± 0%  -22.80%  (p=0.001 n=8+9)
CommitAfterHashFixedSize/10K-6       122k ± 0%       94k ± 0%  -22.82%  (p=0.000 n=10+10)
CommitAfterHashFixedSize/100K-6     1.21M ± 0%     0.94M ± 0%  -22.82%  (p=0.000 n=10+10)
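
For illustration, here is a minimal, generic sketch of the "send committed nodes to a writer over a channel" pattern described above. The names commitTask and writeLoop, and the payload fields, are made up for this sketch; they are not the PR's actual types.

```go
package main

import (
	"fmt"
	"sync"
)

// commitTask is a hypothetical payload: a hashed node ready to be written,
// plus an estimated size for cache accounting.
type commitTask struct {
	hash     [32]byte
	blob     []byte
	estimate int
}

// writeLoop drains tasks on a dedicated goroutine so the committer's trie
// walk never blocks on database insertion.
func writeLoop(tasks <-chan commitTask, wg *sync.WaitGroup) {
	defer wg.Done()
	for t := range tasks {
		// In the real code this would insert into the node database and
		// invoke the onleaf callback; here we just report the task.
		fmt.Printf("write %x (%d bytes est.)\n", t.hash[:4], t.estimate)
	}
}

func main() {
	tasks := make(chan commitTask, 128) // buffered so the walker rarely stalls
	var wg sync.WaitGroup
	wg.Add(1)
	go writeLoop(tasks, &wg)

	// The committer would send each collapsed node as it finishes it.
	tasks <- commitTask{hash: [32]byte{0xde, 0xad}, blob: []byte{0x01}, estimate: 33}
	close(tasks) // commit is done: let the writer drain and exit
	wg.Wait()
}
```

In the PR the writer goroutine is spun up per commit; the "Further improvements" section below discusses making it a dedicated, long-lived goroutine instead.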

Hash

The Hash operation has been optimized mainly to decrease the number of allocations.

name                  old time/op    new time/op    delta
Hash-6                  7.08µs ±12%    7.34µs ±14%     ~     (p=0.481 n=10+10)
HashFixedSize/10-6      70.0µs ± 2%    68.0µs ± 4%   -2.85%  (p=0.011 n=10+10)
HashFixedSize/100-6      328µs ± 3%     315µs ± 3%   -3.88%  (p=0.000 n=10+10)
HashFixedSize/1K-6      3.67ms ± 3%    3.54ms ±12%   -3.43%  (p=0.043 n=9+10)
HashFixedSize/10K-6     43.3ms ± 2%    40.9ms ± 3%   -5.46%  (p=0.000 n=9+8)
HashFixedSize/100K-6     525ms ±12%     465ms ±16%  -11.29%  (p=0.002 n=10+10)

name                  old alloc/op   new alloc/op   delta
Hash-6                    989B ± 1%      916B ± 2%   -7.36%  (p=0.000 n=10+10)
HashFixedSize/10-6      13.8kB ± 0%    12.5kB ± 0%   -9.26%  (p=0.000 n=10+10)
HashFixedSize/100-6     65.0kB ± 0%    58.7kB ± 0%   -9.83%  (p=0.000 n=10+10)
HashFixedSize/1K-6       695kB ± 0%     631kB ± 0%   -9.21%  (p=0.000 n=10+9)
HashFixedSize/10K-6     7.21MB ± 0%    6.57MB ± 0%   -8.88%  (p=0.000 n=10+10)
HashFixedSize/100K-6    71.3MB ± 0%    64.9MB ± 0%   -8.97%  (p=0.000 n=10+10)

name                  old allocs/op  new allocs/op  delta
Hash-6                    12.0 ± 0%      11.0 ± 0%   -8.33%  (p=0.000 n=10+10)
HashFixedSize/10-6         182 ± 0%       162 ± 0%  -10.99%  (p=0.000 n=10+10)
HashFixedSize/100-6        886 ± 0%       786 ± 0%  -11.29%  (p=0.000 n=10+10)
HashFixedSize/1K-6       9.16k ± 0%     8.15k ± 0%  -10.97%  (p=0.001 n=8+9)
HashFixedSize/10K-6      93.5k ± 0%     83.4k ± 0%  -10.81%  (p=0.000 n=10+10)
HashFixedSize/100K-6      932k ± 0%      830k ± 0%  -10.87%  (p=0.000 n=10+10)

Benchmark tests

The trie benchmark tests were not accurate. The trie benchmarks used b.N to set the number of leaves in the trie. However,
the correlation between 'work performed' and 'leaves' is not linear, so the benchmark results were not
comparable. On a faster codebase, the test framework will increase N, which increases the work
by N * f(N), and depending on whichever final N the go test framework decides to stop at, it might report a
slower average (time per op) than the slower codebase.

I added new Fixed benchmarks, which instead measure the time to Commit/Hash fixed-size tries.
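
For reference, a minimal sketch of the fixed-size benchmark pattern. The package/file names and the hashEntries helper are made up stand-ins for this sketch; the real benchmarks build and Commit/Hash actual tries.

```go
// fixed_bench_test.go — toy illustration, runnable with `go test -bench=.`
package triebench

import (
	"crypto/sha256"
	"fmt"
	"testing"
)

// hashEntries stands in for "hash or commit a trie with n leaves": the only
// property we need is that its cost grows with n, like the real operation.
func hashEntries(n int) [32]byte {
	var acc [32]byte
	for i := 0; i < n; i++ {
		acc = sha256.Sum256(append(acc[:], byte(i), byte(i>>8), byte(i>>16)))
	}
	return acc
}

// Anti-pattern (what the old trie benchmarks did): b.N sets the data size,
// so what one 'op' means changes with whatever N the framework settles on,
// and results are not comparable across codebases.
func BenchmarkGrowingWithN(b *testing.B) {
	hashEntries(b.N)
}

// Fixed-size pattern: every iteration does the same amount of work, so
// results stay comparable regardless of the final b.N.
func BenchmarkFixedSize(b *testing.B) {
	for _, size := range []int{10, 100, 1000} {
		b.Run(fmt.Sprint(size), func(b *testing.B) {
			for i := 0; i < b.N; i++ {
				hashEntries(size)
			}
		})
	}
}
```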

Further improvements

There are a couple of further improvements that can be made:

  1. Parallelized hashing: by splitting up the trie walking at the first fullnode, we can have up to 16 goroutines, each walking its own subpath.

This should be an improvement, but only for tries of suitable size (perhaps 1,000 new leaves or so). Most likely (imo), the old attempt had too
much overhead on the storage trie hashes, since the storage tries are too small to gain any real benefit from the change.

It might be a good idea to add an update counter to the trie, and only use parallel hashing when the (estimated) number of updates is above a certain threshold. A minimal sketch of the fan-out idea follows.
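
This is a self-contained illustration only: toyNode, hashSeq and hashParallel are made-up stand-ins (using SHA-256 instead of keccak), not the trie package's node types or hashing.

```go
package main

import (
	"crypto/sha256"
	"fmt"
	"sync"
)

// toyNode: a leaf value and/or a 16-way branch, standing in for a trie fullnode.
type toyNode struct {
	value    []byte
	children [16]*toyNode
}

// hashSeq hashes a subtrie sequentially (children first, parent last).
func hashSeq(n *toyNode) []byte {
	if n == nil {
		return nil
	}
	h := sha256.New()
	h.Write(n.value)
	for _, c := range n.children {
		h.Write(hashSeq(c))
	}
	return h.Sum(nil)
}

// hashParallel hashes the 16 child subtries of the root concurrently and
// combines the results; everything below the first branch stays sequential.
func hashParallel(root *toyNode) []byte {
	var (
		wg     sync.WaitGroup
		hashes [16][]byte
	)
	for i, c := range root.children {
		if c == nil {
			continue
		}
		wg.Add(1)
		go func(i int, c *toyNode) {
			defer wg.Done()
			hashes[i] = hashSeq(c) // each goroutine writes its own slot
		}(i, c)
	}
	wg.Wait()
	h := sha256.New()
	h.Write(root.value)
	for _, child := range hashes {
		h.Write(child)
	}
	return h.Sum(nil)
}

func main() {
	root := &toyNode{}
	for i := 0; i < 16; i++ {
		root.children[i] = &toyNode{value: []byte{byte(i)}}
	}
	fmt.Printf("root hash: %x\n", hashParallel(root))
}
```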

  2. Dedicated database inserter goroutine. This PR spins up a goroutine for db insertion at every trie commit. Although it's fairly fast, it's likely not a speed boost
    for the storage tries, for the same reason as above. So a better approach might be to have one dedicated goroutine (per trie, or per database) to which we can send data from the committer
    for the database + onleaf callbacks. A separate benefit of doing this is that during statedb.go commit, we don't have to wait for each individual trie to finish before
    starting to process the next one. Pipelining ftw!

I actually did an implementation of this, but it got kind of messy since my PoC exposed the node type and caused cyclic dependencies, so I've held off on it for now, until I figure out
how best to structure it.

Real life tests

I'm doing a full sync on mon08/mon09; it appears that this PR is significantly faster than master.

https://geth-bench.ethdevops.io/d/Jpk-Be5Wk/dual-geth?orgId=1&var-exp=mon08&var-master=mon09&var-percentile=50&from=1577128018418&to=now

@holiman (Contributor, Author) commented Dec 23, 2019

Here's the cut-off when switching from #20465 to this PR, showing the times for commit and hash ops.
[Screenshot: Dual Geth - Grafana, 2019-12-23]

Something is not right, though; this PR seems to have somehow broken the mem GC:
[Screenshot: Dual Geth - Grafana, 2019-12-23]

Which also shows in the leveldb stats:
[Screenshot: Dual Geth - Grafana, 2019-12-23]

@holiman (Contributor, Author) commented Dec 24, 2019

It's doing pretty well. Since last night, when I restarted it, the experimental node has taken a lead of about 45 minutes, I think.
[Screenshot: Dual Geth - Grafana, 2019-12-24]

@holiman (Contributor, Author) commented Dec 25, 2019

Daily update: experimental is ahead by about 2h 45m after roughly two days.
[Screenshot: Dual Geth - Grafana, 2019-12-25]

Here are the commit/hash charts:
[Screenshot: Dual Geth - Grafana, 2019-12-25]

@holiman (Contributor, Author) commented Dec 26, 2019

Daily update: after about 3 x 24h, this PR is about 4h 30m ahead of master.
I squashed/cleaned up the commits and added a description.

@holiman (Contributor, Author) commented Dec 28, 2019

After five days (around 118h), this PR is at block 7.731M, 7h ahead of master at 7.520M.

@holiman (Contributor, Author) commented Dec 30, 2019

master reached block 8.6M about 9h25m later than this PR.

@holiman (Contributor, Author) commented Dec 31, 2019

mon08 (this PR) finished the full sync at block 9188701 at 2019-12-31 03:05:02, after a total of 175h.
mon09 was 10 hours behind, at block 9.177M, when the run was stopped.

[Screenshot: Dual Geth - Grafana, 2019-12-31]

Here are some comparison tables. On master, account commit is the fourth 'heaviest' thing, whereas in this PR it dropped to sixth place, behind even account hash.

[Screenshot: Dual Geth - Grafana, 2019-12-31]

@holiman (Contributor, Author) commented Jan 4, 2020

Added a commit to prevent loading tries when not needed. At the moment, before this commit, we load the (non-existent) storage trie for every origin EOA, for example.

Later on, these tries are all committed -- and although the commit of a non-modified storage trie exits early, this PR also spins up a goroutine for each of them, which is now skipped as well.

@holiman force-pushed the pure_commit branch 2 times, most recently from 539071a to 4ce3930 on January 7, 2020.
core/state/state_object.go:
func (s *stateObject) updateTrie(db Database) Trie {
	// Make sure all dirty slots are finalized into the pending storage area
	s.finalise()

	if len(s.pendingStorage) == 0 {
		return s.trie
Reviewer (Member):

Curious question: what happens if, pre-Byzantium, one tx modifies the trie but a second one does not? Then pending will be empty I think, but s.trie is not nil any more, since a previous tx already loaded it. Shouldn't we explicitly return nil here?

@holiman (Contributor, Author) replied:

I guess it will enter the commit and quickly find that the trie root is not dirty. Since we don't provide a leaf callback for storage tries, there will not be a lot of overhead on that operation.

(Several more review threads on trie/committer.go and trie/hasher.go, since resolved.)

trie/committer.go:
		return hash, nil
	}
	if db == nil {
		return nil, errors.New("no db provided")
Reviewer (Member):

Wondering about this. Should we error or panic? If it's a committer, should it allow a nil db? Isn't that a programming error?

Reviewer (Member):

It also feels a bit weird to explicitly check this for every single trie node. Shouldn't this be checked way above, and here just assumed to be correct?

@holiman (Contributor, Author) replied:

If db is nil, it means that the trie.db is nil. I can't say if that's ever done on purpose, e.g. in tests.

trie/committer.go:
panic("commit did not hash down to a hashnode!")
}
return hn, nil
}
Reviewer (Member):

Myeah, I don't think there's much reason for this to exist. You can move the db check into commit and I would not care for the root hash node check (I mean, that's what commit is meant to do, no point in sanity checking that it works).

	if common.BytesToHash(newRoot) != rootHash {
		panic(fmt.Sprintf("Committed root %x != roothash %x", newRoot, rootHash))
	}
	t.root = newRoot
Reviewer (Member):

Same here, I don't think this level of paranoia is needed :D

@karalabe (Member) commented:

Re: panics, perhaps we can run this PR as is on a benchmarker and nuke and merge afterwards. Then you'll have one last sanity check, but it should really be fine.

@holiman (Contributor, Author) commented Jan 29, 2020

After a fast-sync

[Screenshot: Dual Geth - Grafana, 2020-01-29]

@holiman (Contributor, Author) commented Jan 30, 2020

For some reason, this PR appears to hold more memory than master.
[Screenshot: Dual Geth - Grafana, 2020-01-30]

However, master had a different CHT, which resulted in master being able to place a lot more data directly into ancients. I'll nuke and restart a fast sync (now that the PR is rebased).

@holiman (Contributor, Author) commented Jan 31, 2020

Regarding the memory held, here's the current chart:
[Screenshot: Dual Geth - Grafana, 2020-01-31]

This time, after sync, it's actually master that's holding more memory; however, it seems to correlate with an increase in egress:
[Screenshot: Dual Geth - Grafana, 2020-01-31]

So all in all, I don't think this PR changes memory behaviour.
