Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Potential bug in BlockRefactor V0 syncBlock function #6173

Closed
4 tasks
louisliu2048 opened this issue May 8, 2020 · 2 comments
Closed
4 tasks

Potential bug in BlockRefactor V0 syncBlock function #6173

louisliu2048 opened this issue May 8, 2020 · 2 comments

Comments

@louisliu2048
Copy link

louisliu2048 commented May 8, 2020

Summary of Bug

In tendermint/blockchain/v0/refactor.go: poolRoutine() function, if we get some dirty blocks from other peers when sync blocks, blockExec.ValidateBlock(state, block) will return error and the panic(fmt.Sprintf("Failed to process committed block (%d:%X): %v", first.Height, first.Hash(), err)) will make our node to be a zombine。

...
first, second := bcR.pool.PeekTwoBlocks()
//bcR.Logger.Info("TrySync peeked", "first", first, "second", second)
if first == nil || second == nil {
	// We need both to sync the first block.
	continue FOR_LOOP
} else {
	// Try again quickly next loop.
	didProcessCh <- struct{}{}
}
...

err := state.Validators.VerifyCommit(
	chainID, firstID, first.Height, second.LastCommit)
...

// TODO: batch saves so we dont persist to disk every block
bcR.store.SaveBlock(first, firstParts, second.LastCommit)

// TODO: same thing for app - but we would need a way to
// get the hash without persisting the state
var err error
state, _, err = bcR.blockExec.ApplyBlock(state, firstID, first)
if err != nil {
	// TODO This is bad, are we zombie?
	panic(fmt.Sprintf("Failed to process committed block (%d:%X): %v", first.Height, first.Hash(), err))
}
blocksSynced++
...

If we get a dirty block from other peer, we will save it to our db without check. When we run bcR.blockExec.ApplyBlock, an error will return and out node will panic and exit. If we restart it, it will run replayBlocks and still will panic when run ApplyBlock.

Maybe ,we should do the BlockCheck after
err := state.Validators.VerifyCommit(chainID, firstID, first.Height, second.LastCommit)
to avoid the dirty block from malicious node!

Version

cosmossdk: 0.37.9
tendermint: 0.32.10

Steps to Reproduce

Our cluster run on 7 machine with docker-compose, restart 3 nodes on one machine, It will run into
`
panic: Failed to process committed block (190752:57B2434ACB1BEDA4FD41537A1992E7C64C10DB320415AA289A50A5059FA317A3): Wrong Block.Header.AppHash. Expected 3A8EC21DC8205C41CBDBEA42C4F2C91ED482DAE131110C97FD0EA5426A75791B, got 8E7D8AD3F70B16A1F7E5D50EC0FA01DC7E3AA365C08DBA527A7466D5530187AC

goroutine 133 [running]:
github.com/tendermint/tendermint/blockchain/v0.(*BlockchainReactor).poolRoutine(0xc000dbc1a0)
/go/pkg/mod/github.com/okex/[email protected]/blockchain/v0/reactor.go:344 +0x1542
created by github.com/tendermint/tendermint/blockchain/v0.(*BlockchainReactor).OnStart
/go/pkg/mod/github.com/okex/[email protected]/blockchain/v0/reactor.go:118 +0x84
`
I don't know why they get a dity block after restart, but it do got one and store in their local db.


For Admin Use

  • Not duplicate issue
  • Appropriate labels applied
  • Appropriate contributors tagged
  • Contributor assigned/self-assigned
@louisliu2048 louisliu2048 changed the title Advice for BlockRefactor V0 syncBlock function Potential bug in BlockRefactor V0 syncBlock function May 8, 2020
@tac0turtle
Copy link
Member

hey thank you for reporting this. This issue will be better suited in tendermint. Could you open this issue there?

@louisliu2048
Copy link
Author

hey thank you for reporting this. This issue will be better suited in tendermint. Could you open this issue there?

ou, sorry, I will move it to Tendermint

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants