chain ancestry hash storage #19319

Closed · wants to merge 2 commits
8 changes: 8 additions & 0 deletions core/blockchain.go
@@ -1679,6 +1679,7 @@ func (bc *BlockChain) insertChain(chain types.Blocks, verifySeals bool) (int, []

switch status {
case CanonStatTy:
bc.hc.hashHistory.Set(block.Header())
log.Debug("Inserted new block", "number", block.Number(), "hash", block.Hash(),
"uncles", len(block.Uncles()), "txs", len(block.Transactions()), "gas", block.GasUsed(),
"elapsed", common.PrettyDuration(time.Since(start)),
@@ -1967,6 +1968,8 @@ func (bc *BlockChain) reorg(oldBlock, newBlock *types.Block) error {
// Write lookup entries for hash based transaction/receipt searches
rawdb.WriteTxLookupEntries(bc.db, newChain[i])
addedTxs = append(addedTxs, newChain[i].Transactions()...)
// Add to hash history
bc.hc.hashHistory.Set(newChain[i].Header())
}
// When transactions get deleted from the database, the receipts that were
// created in the fork must also be deleted
@@ -2224,3 +2227,8 @@ func (bc *BlockChain) SubscribeLogsEvent(ch chan<- []*types.Log) event.Subscript
func (bc *BlockChain) SubscribeBlockProcessingEvent(ch chan<- bool) event.Subscription {
return bc.scope.Track(bc.blockProcFeed.Subscribe(ch))
}

// GetAncestorHash returns the hash of the ancestor at the given number
func (bc *BlockChain) GetAncestorHash(ref *types.Header, target uint64) common.Hash {
Contributor:
Is there a reason not to put this logic in HeaderChain?

Contributor (author):
No, it should probably wind up there instead. Right now, however, it's only used for EVM execution of BLOCKHASH, and I was more confident in how the blockchain works than how the headerchain works.

If this becomes used from any other place, like p2p, then it should definitely go into headerchain instead.

Contributor:
Makes sense. Thanks for clarifying.

return bc.hc.GetAncestorHash(ref, target)
}
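The delegated lookup is O(1) arithmetic on the ring buffer rather than a header walk. A minimal standalone sketch of the index calculation (the names `indexOf` and `bufElems` are illustrative, not the PR's):

```go
package main

import "fmt"

const bufElems = 8 // tiny capacity for illustration; the PR uses 500000

// indexOf mirrors the arithmetic in hashBuffer.Get: given the slot of the
// head entry and the head/tail block numbers, find the slot holding `number`.
func indexOf(head, headNumber, tailNumber, number uint64) (uint64, bool) {
	if number > headNumber || number < tailNumber {
		return 0, false // outside the retained window
	}
	distance := headNumber - number
	return (head + bufElems - distance) % bufElems, true
}

func main() {
	// Head block #21 sits in slot 5 and the buffer retains blocks #14..#21.
	if idx, ok := indexOf(5, 21, 14, 18); ok {
		fmt.Println(idx) // #18 is 3 behind head: slot (5+8-3)%8 = 2
	}
	if _, ok := indexOf(5, 21, 14, 13); !ok {
		fmt.Println("miss") // #13 has already been evicted
	}
}
```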
92 changes: 92 additions & 0 deletions core/blockchain_test.go
@@ -2363,3 +2363,95 @@ func TestDeleteCreateRevert(t *testing.T) {
t.Fatalf("block %d: failed to insert into chain: %v", n, err)
}
}

// TestImportlongSidechainWithBlockhash200 imports a canonical chain of 500 blocks,
// and then finds out about a longer sidechain which forked off at block 200 (300 blocks ago).
// It then imports the sidechain and executes its transactions.
// At block 450, a transaction executes BLOCKHASH(205) and stores the result in slot(0).
// This test ensures that the hashbuffer-based GetAncestorHash correctly
// handles non-canonical lookups.
func TestImportlongSidechainWithBlockhash200(t *testing.T) {

var (
aa = common.HexToAddress("0x000000000000000000000000000000000000aaaa")
engine = ethash.NewFaker()
db = rawdb.NewMemoryDatabase()
// A sender who makes transactions, has some funds
key, _ = crypto.HexToECDSA("b71c71a67e1177ad4e901695e1b4b9ee17ae16c6668d313eac2f96dbcda3f291")
address = crypto.PubkeyToAddress(key.PublicKey)
funds = big.NewInt(1000000000)
gspec = &Genesis{
Config: params.TestChainConfig,
Alloc: GenesisAlloc{
address: {Balance: funds},
// The address 0xAAAAA stores BLOCKHASH in slot zero
aa: {
// Push
Code: []byte{
byte(vm.PUSH1), byte(205), // [205]
byte(vm.BLOCKHASH), // [ bh(205) ]
byte(vm.PUSH1), byte(0), // [bh(205), 0] ; location
byte(vm.SSTORE), // []
},
Nonce: 1,
Balance: big.NewInt(0),
},
},
}
genesis = gspec.MustCommit(db)
)

blocks, _ := GenerateChain(params.TestChainConfig, genesis, engine, db, 500, func(i int, b *BlockGen) {
b.SetCoinbase(common.Address{1})
})
// Import the canonical chain
diskdb := rawdb.NewMemoryDatabase()
gspec.MustCommit(diskdb)
chain, err := NewBlockChain(diskdb, nil, params.TestChainConfig, engine, vm.Config{}, nil)
if err != nil {
t.Fatalf("failed to create initial chain: %v", err)
}
if n, err := chain.InsertChain(blocks); err != nil {
t.Fatalf("block %d: failed to insert into chain: %v", n, err)
}
// Ok, the canon chain is done. Now generate the competing chain. In order to be able to
// do so, we need to do it in two steps, so we can call b.AddTxWithChain, and have the
// sidechain blocks be in the chain already
forkBlockNum := 200
// Add 248 sideblocks
sideblocks1, _ := GenerateChain(params.TestChainConfig, blocks[forkBlockNum], engine, db, 248, func(i int, b *BlockGen) {
b.SetCoinbase(common.Address{2})
})
if n, err := chain.InsertChain(sideblocks1); err != nil {
t.Fatalf("block %d: failed to insert into chain: %v", n, err)
}
// Add another 150 blocks
sideblocks2, _ := GenerateChain(params.TestChainConfig, sideblocks1[len(sideblocks1)-1], engine, db, 150, func(i int, b *BlockGen) {
b.SetCoinbase(common.Address{2})
if i == 0 {
tx, _ := types.SignTx(types.NewTransaction(0, aa,
big.NewInt(0), 50000, big.NewInt(1), nil), types.HomesteadSigner{}, key)
b.AddTxWithChain(chain, tx)
}
})
if n, err := chain.InsertChain(sideblocks2); err != nil {
t.Fatalf("block %d: failed to insert into chain: %v", n, err)
}
if got, exp := chain.CurrentBlock().Hash(), sideblocks2[len(sideblocks2)-1].Hash(); got != exp {
t.Fatalf("wrong head, exp %x got %x", exp, got)
}
statedb, err := chain.State()
if err != nil {
t.Fatalf("failed to get state: %v", err)
}
val := statedb.GetState(aa, common.Hash{})
// Check against the canonical chain (the sidechain is canon after the reorg)
if got, exp := val, chain.GetCanonicalHash(205); got != exp {
t.Fatalf("wrong hash, got %x exp %x", got, exp)
}
// Sanity check against the sideblocks
if got, exp := val, sideblocks1[3].Hash(); got != exp {
t.Fatalf("wrong hash, got %x exp %x", got, exp)
}
}
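The index bookkeeping in this test is easy to trip over: GenerateChain numbers blocks upward from the parent, so (assuming blocks[i] carries number i+1) blocks[200] is block #201 and sideblocks1[3] is sidechain block #205, the BLOCKHASH target. A sketch of that arithmetic, with a helper name of my own choosing:

```go
package main

import "fmt"

const forkBlockNum = 200 // index into the canonical blocks slice

// sideNumber returns the block number of sideblocks1[i], assuming blocks[i]
// carries number i+1 (blocks[0] is block #1).
func sideNumber(i int) int {
	parentNumber := forkBlockNum + 1 // blocks[200] is block #201
	return parentNumber + 1 + i      // sideblocks1[0] is block #202
}

func main() {
	fmt.Println(sideNumber(3)) // prints 205: the block whose hash BLOCKHASH(205) must return
}
```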
25 changes: 3 additions & 22 deletions core/evm.go
@@ -31,8 +31,8 @@ type ChainContext interface {
// Engine retrieves the chain's consensus engine.
Engine() consensus.Engine

// GetHeader returns the hash corresponding to their hash.
GetHeader(common.Hash, uint64) *types.Header
// GetAncestorHash returns the hash of the ancestor at the given number
GetAncestorHash(child *types.Header, number uint64) common.Hash
Member:
Do we still need the original GetHeader?

Contributor (author):
Not in the interface, but we still need the implementation(s). I can remove it from the interface.

}

// NewEVMContext creates a new context for use in the EVM.
@@ -60,27 +60,8 @@ func NewEVMContext(msg Message, header *types.Header, chain ChainContext, author

// GetHashFn returns a GetHashFunc which retrieves header hashes by number
func GetHashFn(ref *types.Header, chain ChainContext) func(n uint64) common.Hash {
var cache map[uint64]common.Hash

return func(n uint64) common.Hash {
// If there's no hash cache yet, make one
if cache == nil {
cache = map[uint64]common.Hash{
ref.Number.Uint64() - 1: ref.ParentHash,
}
}
// Try to fulfill the request from the cache
if hash, ok := cache[n]; ok {
return hash
}
// Not cached, iterate the blocks and cache the hashes
for header := chain.GetHeader(ref.ParentHash, ref.Number.Uint64()-1); header != nil; header = chain.GetHeader(header.ParentHash, header.Number.Uint64()-1) {
cache[header.Number.Uint64()-1] = header.ParentHash
if n == header.Number.Uint64()-1 {
return header.ParentHash
}
}
return common.Hash{}
return chain.GetAncestorHash(ref, n)
}
}

173 changes: 173 additions & 0 deletions core/hashbuffer.go
@@ -0,0 +1,173 @@
// Copyright 2019 The go-ethereum Authors
// This file is part of the go-ethereum library.
//
// The go-ethereum library is free software: you can redistribute it and/or modify
// it under the terms of the GNU Lesser General Public License as published by
// the Free Software Foundation, either version 3 of the License, or
// (at your option) any later version.
//
// The go-ethereum library is distributed in the hope that it will be useful,
// but WITHOUT ANY WARRANTY; without even the implied warranty of
// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
// GNU Lesser General Public License for more details.
//
// You should have received a copy of the GNU Lesser General Public License
// along with the go-ethereum library. If not, see <http://www.gnu.org/licenses/>.

package core

import (
"sync"

"github.com/ethereum/go-ethereum/common"
"github.com/ethereum/go-ethereum/core/types"
"github.com/ethereum/go-ethereum/log"
"github.com/ethereum/go-ethereum/metrics"
)

// For EVM execution, we need around 256 items. We add a few more to allow
// reorgs. For mainnet, a couple more would suffice, but a few extra are added
// for testnets/private nets.
// For LES, we use a larger buffer: 500K * 32 bytes = 16M.
const hashBufferElems = 500000

var (
hashHitCounter = metrics.NewRegisteredGauge("chain/headerhash/hit", nil)
hashMissCounter = metrics.NewRegisteredGauge("chain/headerhash/miss", nil)
hashHeadGauge = metrics.NewRegisteredGauge("chain/headerhash/head", nil)
hashTailGauge = metrics.NewRegisteredGauge("chain/headerhash/tail", nil)
)

// hashBuffer implements a storage for chains of hashes, intended to be used for quick lookup of block hashes.
// Internally, it uses an array of hashes in a circular buffer.
// It enforces that all hashes added have a contiguous parent-child relation, and supports rollbacks
// It is thread-safe.
type hashBuffer struct {
// The data holds the hashes. The hashes are sequential, but also a
// circular buffer.
// The `head` points to the position of the latest hash.
// The parent, if present, is located 32 bytes back.
// [.., .., ..., head-2 , head-1, head, oldest, ... ]
data [hashBufferElems]common.Hash

head uint64 // index of hash for the head block

headNumber uint64 // The block number for head (the most recent block)
tailNumber uint64 // The block number for tail (the oldest block)

mu sync.RWMutex
}

// newHashBuffer creates a new storage with a header in it.
// Since we take a header here, a hash storage can never be empty.
// This makes things easier later on (in Set)
func newHashBuffer(header *types.Header) *hashBuffer {
return &hashBuffer{
headNumber: header.Number.Uint64(),
tailNumber: header.Number.Uint64(),
data: [hashBufferElems]common.Hash{header.Hash()},
}
}

// Get locates the hash for the requested number
func (hs *hashBuffer) Get(number uint64) (common.Hash, bool) {
hs.mu.RLock()
defer hs.mu.RUnlock()
if !hs.has(number) {
hashMissCounter.Inc(1)
return common.Hash{}, false
}
hashHitCounter.Inc(1)
distance := hs.headNumber - number
index := (hs.head + hashBufferElems - distance) % hashBufferElems
return hs.data[index], true
}

// has reports whether the storage holds a hash for the given number
func (hs *hashBuffer) has(number uint64) bool {
return number <= hs.headNumber && number >= hs.tailNumber
}

// Contains reports whether the hash at the given number matches the expected one
func (hs *hashBuffer) Contains(number uint64, expected common.Hash) bool {
hs.mu.RLock()
defer hs.mu.RUnlock()
if hs.contains(number, expected) {
hashHitCounter.Inc(1)
return true
} else {
hashMissCounter.Inc(1)
return false
}
}

// contains is the non-concurrency safe internal version of Contains
func (hs *hashBuffer) contains(number uint64, expected common.Hash) bool {
if !hs.has(number) {
return false
}
distance := hs.headNumber - number
index := (hs.head + hashBufferElems - distance) % hashBufferElems
return hs.data[index] == expected
}

// Newest returns the most recent (number, hash) stored
func (hs *hashBuffer) Newest() (uint64, common.Hash) {
hs.mu.RLock()
defer hs.mu.RUnlock()
return hs.headNumber, hs.data[hs.head]
}

// Oldest returns the oldest (number, hash) found
func (hs *hashBuffer) Oldest() (uint64, common.Hash) {
hs.mu.RLock()
defer hs.mu.RUnlock()
distance := hs.headNumber - hs.tailNumber
index := (hs.head + hashBufferElems - distance) % hashBufferElems
return hs.tailNumber, hs.data[index]
}

// Set inserts a new header (hash) into the storage.
// If
// a) the header already exists, this is a no-op,
// b) the number is occupied by another header, the new header replaces it and
// also truncates any descendants.
//
// If the new header does not have any known ancestors, it replaces the entire storage.
func (hs *hashBuffer) Set(header *types.Header) {
var (
number = header.Number.Uint64()
index uint64
hash = header.Hash()
)
hs.mu.Lock()
defer hs.mu.Unlock()
if hs.contains(number-1, header.ParentHash) {
if hs.headNumber >= number {
distance := hs.headNumber - number
index = (hs.head + hashBufferElems - distance) % hashBufferElems
if hs.data[index] == hash {
return
Member: I'm unsure about this return. If sync fails, I do a rollback of 16 blocks, and reimport those again, then they will not wipe subsequent ones. Although it should not cause any issues, I'm wondering if it would be more predictable to wipe everything after the inserted header even if it's a no-op. Might be easier to reason about if a Set surely sets the header as the head and leaves nothing in there above it.

Contributor (author): I think it should remain as is. If we re-add an existing header, there's no point intentionally removing data that might be useful. The first forking header will wipe it.
}
// Continue by replacing this number and wipe descendants
} else {
// head is parent of this new header - regular append
index = (hs.head + 1) % hashBufferElems
}
} else {
// This should not normally happen, and indicates a programming error
Member: Doesn't a side-chain import also trigger this path? If so, it's not a programming error.

Contributor (author, Apr 3, 2019): Only if the sidechain import is old, older than 288 blocks. Thus 'not normally happen'. We only add hashes of canon blocks, from blockchain.go, either in insertChain or reorg.

What I meant by programming error is that if we integrate it erroneously into e.g. the header chain, and spuriously add blocks from sidechains, then we'd start seeing this.

Member: There is another case of this:

- We initialize the hashStorage with the genesis header
- We get a header from the network, but it's not consecutive with genesis.

Maybe we should consider initializing the hashStorage with latestHeader?

Contributor (author): > We get a header from the network but it's not consecutive with genesis.

Why would we ever import a header that is not consecutive with genesis? And if we were to do so, why is it not correct to wipe ancestors?
log.Error("Hash storage wiping ancestors", "oldhead", hs.headNumber, "newhead", number, "oldtail", hs.tailNumber)
// Wipe ancestors
hs.tailNumber = number
}
hs.head = index
hs.headNumber = number
hs.data[hs.head] = hash

if number-hs.tailNumber == hashBufferElems {
// It's full, need to move the tail
hs.tailNumber++
}
hashTailGauge.Update(int64(hs.tailNumber))
hashHeadGauge.Update(int64(hs.headNumber))
}
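Taken together, Set has three behaviors: no-op on re-insert, descendant truncation on replacement, and tail eviction when full. They can be exercised in miniature with a trimmed-down, standalone version where a string stands in for a hash and the capacity is shrunk so eviction is visible (a sketch, not the PR's code):

```go
package main

import "fmt"

const elems = 4 // tiny capacity so eviction is visible

// ring is a stripped-down hashBuffer: contiguous (number, hash) pairs in a
// circular array, newest at head, oldest at tailNumber.
type ring struct {
	data       [elems]string
	head       uint64
	headNumber uint64
	tailNumber uint64
}

func newRing(number uint64, hash string) *ring {
	r := &ring{headNumber: number, tailNumber: number}
	r.data[0] = hash
	return r
}

func (r *ring) get(number uint64) (string, bool) {
	if number > r.headNumber || number < r.tailNumber {
		return "", false
	}
	d := r.headNumber - number
	return r.data[(r.head+elems-d)%elems], true
}

// set appends a child of the head, truncates descendants when replacing an
// occupied number, and wipes everything when the parent is not stored.
func (r *ring) set(number uint64, hash, parent string) {
	var index uint64
	if p, ok := r.get(number - 1); ok && p == parent {
		if r.headNumber >= number {
			d := r.headNumber - number
			index = (r.head + elems - d) % elems
			if r.data[index] == hash {
				return // already stored: no-op
			}
		} else {
			index = (r.head + 1) % elems
		}
	} else {
		r.tailNumber = number // no known ancestor: wipe
	}
	r.head, r.headNumber, r.data[index] = index, number, hash
	if number-r.tailNumber == elems {
		r.tailNumber++ // full: evict the oldest entry
	}
}

func main() {
	r := newRing(1, "a1")
	r.set(2, "a2", "a1")
	r.set(3, "a3", "a2")
	r.set(4, "a4", "a3")
	r.set(5, "a5", "a4") // capacity 4: block 1 falls off the tail
	if _, ok := r.get(1); !ok {
		fmt.Println("evicted")
	}
	r.set(3, "b3", "a2") // reorg at 3: replaces a3, truncates a4/a5
	h, _ := r.get(3)
	fmt.Println(h, r.headNumber)
}
```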