Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

simulation: Add benchmarking #2187

Merged
merged 19 commits into from
Sep 3, 2018
Merged

simulation: Add benchmarking #2187

merged 19 commits into from
Sep 3, 2018

Conversation

ValarDragon
Copy link
Contributor

@ValarDragon ValarDragon commented Aug 30, 2018

This PR adds simulation to benchmarking. It is slightly suboptimal of an implementation, as it involves alot of code duplication. However we do want to avoid invariant checking or logging (as logs are significant time increase), amongst other things. I think including this as a separate function is preferable to a single mega function. Once we implement something like an "OperationContext" I will combine these two functions more. (That will make it easier to do cleanly)

Doing this without code duplication was super easy, not sure what I was thinking when I wrote the above.

Sample benchmark output:

/usr/local/go/bin/go test -benchmem -run=^$ github.com/cosmos/cosmos-sdk/cmd/gaia/app -bench ^BenchmarkFullGaiaSimulation$ -cpuprofile cpu.out
Event statistics: 
                                  bank/sendAndVerifyMsgSend/ok => 35
                                     beginblock/signing/missed => 176
                                     beginblock/signing/signed => 905
                               endblock/validatorupdates/added => 77
                             endblock/validatorupdates/updated => 10
                                          gov/MsgDeposit/false => 32
                                           gov/MsgDeposit/true => 7
                                    gov/MsgSubmitProposal/true => 34
                                             gov/MsgVote/false => 29
                                              gov/MsgVote/true => 9
                                      slashing/MsgUnjail/false => 42
                                stake/MsgBeginRedelegate/false => 42
                                 stake/MsgBeginUnbonding/false => 35
                             stake/MsgCompleteRedelegate/false => 35
                              stake/MsgCompleteUnbonding/false => 41
                                stake/MsgCreateValidator/false => 9
                                 stake/MsgCreateValidator/true => 27
                                       stake/MsgDelegate/false => 24
                                        stake/MsgDelegate/true => 10
                                  stake/MsgEditValidator/false => 24
                                   stake/MsgEditValidator/true => 11
Benchmark simulation ran 453 operations
goos: linux
goarch: amd64
pkg: github.com/cosmos/cosmos-sdk/cmd/gaia/app
BenchmarkFullGaiaSimulation-8   	       1	6665959645 ns/op	732741160 B/op	19666343 allocs/op
PASS
ok  	github.com/cosmos/cosmos-sdk/cmd/gaia/app	6.945s

Running this through a profiler, we see that amino unmarshalling takes 75% of the time!!! We should consider using protobuf where applicable, as their implementation has been optimized to enable inlining. (They've fought the golang compiler)

In the process I discovered our current "test validator existswhen deleting it" is broken. I will open a seperate PR to demonstrate that, this PR should be good merge.

  • Targeted PR against correct branch (see CONTRIBUTING.md)

  • Wrote tests

  • Added entries in PENDING.md with issue #

  • rereviewed Files changed in the github PR explorer


For Admin Use:

  • Added appropriate labels to PR (ex. wip, ready-for-review, docs)
  • Reviewers Assigned
  • Squashed all commits, uses message "Merge pull request #XYZ: [title]" (coding standards)

@codecov
Copy link

codecov bot commented Aug 30, 2018

Codecov Report

❗ No coverage uploaded for pull request base (develop@d214952). Click here to learn what that means.
The diff coverage is 75%.

@@            Coverage Diff            @@
##             develop   #2187   +/-   ##
=========================================
  Coverage           ?   64.2%           
=========================================
  Files              ?     140           
  Lines              ?    8579           
  Branches           ?       0           
=========================================
  Hits               ?    5508           
  Misses             ?    2694           
  Partials           ?     377

@ValarDragon
Copy link
Contributor Author

ValarDragon commented Aug 30, 2018

Interestingly, tx processing is only 2% of the entire time here. (No IAVL writes, no signature verification), the rest of the time is just slashing logic in EndBlock, and most of the hold up is in amino unmarshalling. (This translates to about 5.25 out of 7 seconds spent on amino unmarshalling, with only 453 msgs, and 10 blocks. This is a lot of time spent on unmarshalling)

I highly recommend looking through a profilers view of this.

/usr/local/go/bin/go test -benchmem -run=^$ github.com/cosmos/cosmos-sdk/cmd/gaia/app -bench ^BenchmarkFullGaiaSimulation$ -cpuprofile cpu.out -memprofile mem.out
go tool pprof cpu.out
web
go tool pprof mem.out
web

ctx, write := ctx.CacheContext()
result := gov.NewHandler(k)(ctx, msg)
result := handler(ctx, msg)
Copy link
Contributor Author

@ValarDragon ValarDragon Aug 30, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I switched from creating a new handler on each message, just to help avoid skewing memory write stats.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

DisplayEvents(events)
}

func updateLog(testingmode bool, log string, update string, args ...interface{}) (updatedLog string) {
if testingmode == true {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

== true can be dropped

@ValarDragon
Copy link
Contributor Author

Use

/usr/local/go/bin/go test -benchmem -run=^$ github.com/cosmos/cosmos-sdk/cmd/gaia/app -bench ^BenchmarkFullGaiaSimulation$ -SimulationGoLevelDB=true -SimulationCommit=true -cpuprofile cpu.out

to profile with GoLevelDB and commits. Even with both of these, amino dominates the runtime. (65%!!) Note that most of these calls are in keeper.UpdateValidator, perhaps we can reduce the number of state reads there.

@ValarDragon
Copy link
Contributor Author

ValarDragon commented Aug 30, 2018

Also I just want to note, that the time signature verification is going to add is neglible. It will add .5ms per operation, which translates to 25ms, which is nothing compared to the time spent in amino.

@ValarDragon
Copy link
Contributor Author

ValarDragon commented Aug 31, 2018

I looked into this a bit more, alot of the problem is that in the simple 10 block example we have, there are 2258 calls to the governance slash function. We should optimize the usage of governance slashing, and the slash functions amino calls. Alternatively we could just cache the keeper.GetValidator calls.

We also should prioritize getting future ops implemented so we can have voting proceed properly.

However I am slightly suspicious about the governance req'd number of blocks. The proposals should be lasting longer than this.

For referrence though, there were 160876 calls to the keeper.GetValidator function (not even keeper.GetValidatorByPubkey)

Copy link
Contributor

@cwgoes cwgoes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could be better separated, but since it's not modifying the state machine it's a bit less critical. See comments.

blockSize int
enabled bool
verbose bool
usegoleveldb bool
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

camelCase please

@@ -36,6 +39,8 @@ func init() {
flag.IntVar(&blockSize, "SimulationBlockSize", 200, "Operations per block")
flag.BoolVar(&enabled, "SimulationEnabled", false, "Enable the simulation")
flag.BoolVar(&verbose, "SimulationVerbose", false, "Verbose log output")
flag.BoolVar(&usegoleveldb, "SimulationGoLevelDB", false, "Use GoLevelDB instead of memdb")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we just always do this with the benchmark? Looks like this flag doesn't apply to non-benchmarking simulation anyways

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure!

ctx, write := ctx.CacheContext()
result := gov.NewHandler(k)(ctx, msg)
result := handler(ctx, msg)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍


// Log the header time for future lookup
pastTimes = append(pastTimes, header.Time)
if !testingmode {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

camelCase

evidence := make([]abci.Evidence, 0)
for r.Float64() < evidenceFraction {
height := header.Height
time := header.Time
if r.Float64() < pastEvidenceFraction {
height = int64(r.Intn(int(header.Height)))
time = pastTimes[height]
time = lastHeaderTime
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you explain this change? Isn't this supposed to be the timestamp when the infraction was committed (which was the previous header)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

your right, my bad. I misread height as header.Height, not sure how I missed the random.

@ValarDragon
Copy link
Contributor Author

I actually think we should consider removing gocyclo. I think it does more harm then good. To fix gocyclo errors, I had to refactor some of the code, which made this PR hard to review unless you were going commit by commit.

@ValarDragon ValarDragon force-pushed the dev/benchmark_simulation branch from a43a7f8 to 5643c08 Compare September 1, 2018 19:38
@ValarDragon ValarDragon mentioned this pull request Sep 1, 2018
4 tasks
In simulation, this was shown to cause a 4x speedup. There are no safety
concerns here, as amino encoding is deterministic.
@ValarDragon
Copy link
Contributor Author

ValarDragon commented Sep 1, 2018

After properly initializing governance proposal time (Thanks Sunny!) the simulator can quickly get to much larger block sizes. (Currently set to 210 blocks, this has 10 blocks of governance slashing)

This now shows that GetValidator, and the way we do sorting in the iterators are bottlenecks. The PR for GetValidator still eliminates that part of the time, which is quite significant. Jae and I discussed a bit about what the cause of the iterator being slow is, and how to speed it up, I'll write that up into another issue. (However we could make that postlaunch, as its probably not going to be a simple change like the get validator change was)

Copy link
Contributor

@cwgoes cwgoes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

utACK

@cwgoes cwgoes merged commit 7f1b06a into develop Sep 3, 2018
@ValarDragon ValarDragon deleted the dev/benchmark_simulation branch September 4, 2018 04:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants