-
Notifications
You must be signed in to change notification settings - Fork 2.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
blockchain, cmd, netsync, main: Add utxocache #1955
Conversation
dee5dab
to
6c278f4
Compare
Pull Request Test Coverage Report for Build 5011650578Warning: This coverage report may be inaccurate.This pull request's base commit is no longer the HEAD commit of its target branch. This means it includes changes from outside the original pull request, including, potentially, unrelated coverage changes.
Details
💛 - Coveralls |
Giving this a shot for a mainnet sync, will report back if it gets to a higher height than referenced above (I think the past I ran into an issue at that height too). Running with the following config:
|
Had to restart my sync as had some memory pressure limits incorrectly configured (though surprised it was killed given the box has 32 GB of RAM and was running with a 10 GB cache size). I also think the current cache size estimate might be significantly undercounting the true over head. Will do some profiles to dig into things. |
Heap profile after around restart with 3 GB of cache used (note how Aside from cache, can see that memory building up for: cloning the cache entries, and also just decoding transactions themselves. Re tx decoding, we have some PRs up that attempt to reduce garbage during that process, and also cache the serialization to make hashing faster (but this is still in check pointed land), but ofc one task at a time. Logs of cache size+speed:
|
One other thing I notice due to the cache design (purge everything once full, instead of just evicting on the fly) is that post purge, there's a significant bump in I/O to refill the cache we guarantee that the very next block triggers thousands of cache misses. The machine I'm running on has a pretty fast SSD (Samsung 980 Pro M.2 Gen 4) but validation still sees a tangible drop off due to the increased I/O. An alternative strategy would be something like:
|
Yeah I've noticed the slowdowns during flushes as well. Memory allocation during flushes were a big slowdown for me as well. Here's a snippet of my log during ibd (6GB of cache, GOGC=100, memlimit of 30GB):
Was this taken with the flag --profile? I'll check up on the size calculations with the utxo cache.
So for the Utreexo project, we did some testing on efficient methods of caching utxo entries and @adiabat + @cbao1198 had concluded that evicting old entries after the threshold you set is reached. There's a relevant github discussion here mit-dci/utreexo#325 but iirc there were graphs generated to visualize these. I'll look into the evicting method first since there's some evidence of it being good. |
Can Go release memory immediately after a variable is released? It looks like there are two caches here. 1 cache TX lookup. 2 buffer for writing to disk. The lookup cache can be a FIFO queue, where items are moved if they are being hit. The oldest part of FIFO is written to the disk periodically. There could be good static memory structure for such data that doesn't require GC. Is that what eviction means? |
Verified that the memory calculation is wrong. Working on a fix and will push a fix soon. |
With arenas I believe you could. But it wouldn't be for a single variable.
Yeah basically. |
Yeah, then using this command:
(cpu)
So I was able sync all the way up to current day! I don't have a good sync time, as I had to restart a few times, before settling into a 5 GB cache (w/ the default I'll do another run now, with the latest version (after the fix) and this tuned cache size. |
A hit is basically a deletion: if you're looking up a UTXO, it's because it's been spent in the first place. The cache as is, lets us avoid unnecessary random writes: UTXOs that die young are never written to disk. However, "young" here is relative to the cache size. Once the cache overflows, we write it all to disk, then start from scratch, which guarantees a cold cache (I/O spikes after the purge, as all UTXOs need to be loaded in). Re GC optimization:
One approach (basically a middle ground to the Arena, but also complementary to it potentially), would be to pre-allocate al the UTXO entries in a single swoop, in a single slice. This would allocate all the entries to contiguous (virtual) memory. We'd keep an index of the last unallocated (not set to anything) entry in this slice. We'd then use this to create new UTXO entries, as adding a new entry will just be adding a pointer to the map, instead of actually allocating. Assuming we keep the current purge behavior, then then entire slice (in theory) can be de-allocated all at once, as all the refs in the map have been deleted and also the main slice pointer to longer held onto. Going further, we could then also actually never de-allocate the slice. We'd just reset that pointer and know that anything "after" that is actually blank. This is more or less the "memory ballast" approach outlined here: https://blog.twitch.tv/en/2019/04/10/go-memory-ballast-how-i-learnt-to-stop-worrying-and-love-the-heap/. IIUC, this has basically been subsumed by the arena allocator, as it gives you a nicer API (that uses generics). Re this:
I think the reason the cache is designed the way it is rn (copied from bitcoind) was to enable recovery from unclean shutdowns (to avoid a cache flush each block). I need to revisit that design to properly analyze it, but maybe it turns out this isn't so critical after all. One other side effect of this is:
Also this flag ( |
b34a14b
to
e16b0c1
Compare
blockchain/sizehelper.go
Outdated
// I suspect it's because of alignment and how the compiler handles it but | ||
// when compared with how much the compiler allocates, it's a couple hundred | ||
// bytes off. | ||
func CalculateRoughMapSize(hint int, bucketSize uintptr) int { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if there're functions in the unsafe
or runtime
package to make this calculation a bit more future proof. Still need to review the improvements, but possible that this behavior isn't consistent across Go versions, or a new version overhauls the map internals which adds estimate drift to our calculations here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I came upon this: https://stackoverflow.com/a/31855111
But maybe you've seen it already, using unsafe.Sizeof
on the map entry itself may also allow us to get a tighter estimate.
In any case, will do a run with this to see how closely the reported size matches up the the visible RSS
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if there're functions in the
unsafe
orruntime
package to make this calculation a bit more future proof. Still need to review the improvements, but possible that this behavior isn't consistent across Go versions, or a new version overhauls the map internals which adds estimate drift to our calculations here.
That could definitely happen. You could grab the raw value of B in struct hmap
like how dgraph-io people did. https://github.com/dgraph-io/dgraph/blob/main/posting/size.go but if struct hmap
ends up changing how their fields are ordered, it could break that code as well.
I came upon this: https://stackoverflow.com/a/31855111
But maybe you've seen it already, using unsafe.Sizeof on the map entry itself may also allow us to get a tighter estimate.
I didn't see that stackoverflow answer but the selected answer is wrong. The answer states that len(theMap)
should be taken to calculate how much the memory the map is taking up but since the map allocates in powers of 2 (+few overflow buckets), that answer will be a lot different than what the actual size of the map is.
unsafe.Sizeof
isn't actually unsafe and it only returns a constant value of the fixed part of each data structure, like the pointer and length of a string, but not indirect parts like the contents of the string (pg. 354 of The Go Programming Language by Alan Donovan and Brian Kernighan. So unsafe.Sizeof()
will always return 48 bytes for hmap
in a 64bit system as the entire struct takes up 6 words. It wouldn't help us better calculate the size of how much memory the buckets are taking up.
What I still don't have an exact answer for is the alignment of wire.Outpoint
in the map. wire.Outpoint
is 36bytes but since memory is allocated per word, we'll allocate 40 bytes for wire.Outpoint
. However, when used in the map, I think the compiler does some optimizations and we actually end up taking up less than 40 bytes when pre-allocating for a lot of key/value pairs. I suspect this is at least one reason why the map size isn't accurate down to the byte.
blockchain/utxocache.go
Outdated
nbEntries := uint64(len(s.cachedEntries)) | ||
// Total memory is the map size + the size that the utxo entries are | ||
// taking up. | ||
size := uint64(CalculateRoughMapSize(s.maxElements, bucketSize)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
blockchain/utxocache.go
Outdated
//TODO(stevenroose) check if it's ok to drop the pkScript | ||
// Since we don't need it anymore, drop the pkScript value of the entry. | ||
s.totalEntryMemory -= entry.memoryUsage() | ||
entry.pkScript = nil |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Re the commit body, yeah this is something we've seen on the lnd
side as well: just holding onto a single pointer of transaction still points to the entire allocated script buffer for a block.
Isn't this copied over before hand though? So setting to nil
would actually let it be freed up?
There's also another path of using a fixed sized byte array, given that for most scripts they're always under 40 bytes or so, aside from bare multi-sig early in the lifetime of the chain.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll save the pprofs and post it to github but what I've noticed was msgTx.Decode()
took up a significant size when the utxomaxcachesize
was large enough. Profiling the master branch didn't have msgTx.Decode()
taking up significant amount of heap.
Looking at the source code for msgTx.Decode()
we can see that the deserialized pkscript is actually copied over to the large contiguous byte slice and the pkscript that was allocated separately is returned to the scriptPool.
Lines 661 to 676 in a18c2cf
for i := 0; i < len(msg.TxOut); i++ { | |
// Copy the public key script into the contiguous buffer at the | |
// appropriate offset. | |
pkScript := msg.TxOut[i].PkScript | |
copy(scripts[offset:], pkScript) | |
// Reset the public key script of the transaction output to the | |
// slice of the contiguous buffer where the script lives. | |
scriptSize := uint64(len(pkScript)) | |
end := offset + scriptSize | |
msg.TxOut[i].PkScript = scripts[offset:end:end] | |
offset += scriptSize | |
// Return the temporary script buffer to the pool. | |
scriptPool.Return(pkScript) | |
} |
These two bits of information gives me a high confidence that the script for a whole block is sticking around because we have a single entry that's still pointing to it. I know for a fact that go doesn't garbage collect parts of a slice as well.
So setting to nil would actually let it be freed up?
Yeah I think that's right in that we should still be replacing with nil since once every last one of the entries stops pointing to it, that whole block's script can be gc-ed. Just really shouldn't be discounting the pkscript after we nil it out since it can still be taking up space.
blockchain/utxocache.go
Outdated
@@ -232,7 +232,7 @@ type utxoBatcher interface { | |||
// getEntries attempts to fetch a serise of entries from the UTXO view. | |||
// If a single entry is not found within the view, then an error is | |||
// returned. | |||
getEntries(outpoints map[wire.OutPoint]struct{}) (map[wire.OutPoint]*UtxoEntry, error) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah I guess "conventional " wisdom is that struct{}
or interface{}
doesn't add over head, but that doesn't factor in the map bucket overhead 🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah I've noticed the map bucket overhead messing with the size calculation.
No need for a map here anyways so I think a slice is a strict improvement.
blockchain/utxocache.go
Outdated
@@ -640,7 +632,7 @@ func (s *utxoCache) Commit(view *UtxoViewpoint, node *blockNode) error { | |||
if node != nil && isBIP0030Node(node) { | |||
// Use the entry from the view since we want to overwrite the | |||
// cached entry. | |||
cachedEntry = entry.Clone() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The cloning also showed up a lot in my memory profiles, what makes it no longer required? Fact that these never get mutated anyway?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah yeah that was my finding. If the entry is used internally within the blockchain package, we can make sure it's not modified so no need to Clone(). Also, I wasn't sure how much the Clone() does to help with immutability since the pkscript is still mutable as the Clone() here is a shallow clone.
I saw that the mining code does fetch for entries from the tx and one of the tests fails if I don't call Clone() for the entries that are fetched by public functions. But not calling Clone() for functions that are private didn't have any effect and all the code that calls private fetch functions do not mutate the entries.
blockchain/utxocache.go
Outdated
func (ms *mapSlice) deleteMaps() { | ||
// The rest of the map is now available to be garbage collected. | ||
m := ms.maps[0] | ||
ms.maps = []map[wire.OutPoint]*UtxoEntry{m} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't this once again pre-allocate? Otherwise, it needs to go through the successive doubling as new items are added.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The oldest map/largest is kept and the smaller ones are left out so they can be gc-ed.
The current algorithm will take a look at the leftover memory and allocate a map that will fit inside of that. My hunch is that since the entry sizes can vary, it's better to let the smaller map be gc-ed so the second map size can be dynamic based on the needs at that moment.
Found a bug in Working towards a fix. |
cc @halseth |
The last 3 commits alleviate the |
6736516
to
0112042
Compare
blockchain/utxocache.go
Outdated
// outpointSize is the size of an outpoint in a 64-bit system. | ||
// | ||
// This value is calculated by running the following on a 64-bit system: | ||
// unsafe.Sizeof(wire.OutPoint{}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Index is explicitly int32, so 64-bit system or not doesn't matter IIUC.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Had to look up some docs to make sure.
I think you're right. [32]byte should be 32 on 32bit systems and uint32 should be 4. I'll remove the comments like these in other parts of the code as well.
blockchain/utxocache.go
Outdated
for _, m := range ms.maps { | ||
entry, found = m[op] | ||
if found { | ||
break |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
style nit: return early with the found element?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Like return entry, found
instead of break
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yep
Out of draft now! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, just going to test this out
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM 🦖
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM 🐊
I think this is ready to land after a rebase (fold in the fixup commits).
There's a lot of code that would violate the lnd linter/style, but we haven't yet ported over the latest version of our linter to btcd
. I'll make a new PR to port that over, and fix the violations once this lands.
// Return early if the view is nil. | ||
if view == nil { | ||
return nil | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Style nit: missing new line before the lcosing brace.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Addressed
blockchain/utxocache.go
Outdated
// The returned entries are NOT safe for concurrent access. | ||
func (s *utxoCache) fetchEntries(outpoints []wire.OutPoint) ([]*UtxoEntry, error) { | ||
entries := make([]*UtxoEntry, len(outpoints)) | ||
var missingOps []wire.OutPoint |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Style nit: can be compresed under a single var statement:
var (
missingOps []wire.OutPoint
missingOpsIdx []int
)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Addressed
blockchain/utxocache.go
Outdated
// unspendable. When the cache already has an entry for the output, it will be | ||
// overwritten with the given output. All fields will be updated for existing | ||
// entries since it's possible it has changed during a reorg. | ||
func (s *utxoCache) addTxOut( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Elsewhere we usually just fill out the first level of arguments till it exceeds 80 chars, then continue on the next line, so:
func (s *utxoCache) addTxOut(outpoint wire.OutPoint, txOut *wire.TxOut, isCoinBase bool,
blockHeight int32) error {
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Addressed
|
||
entry := new(UtxoEntry) | ||
entry.amount = txOut.Value | ||
// Deep copy the script when the script in the entry differs from the one in |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Style nit: missing new line above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Addressed
} | ||
|
||
s.cachedEntries.put(outpoint, entry, s.totalEntryMemory) | ||
s.totalEntryMemory += entry.memoryUsage() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can't cachedEntries
maintain totalEntryMemory
as an internal attribute? Then it can just update it internally rather than us passing the attribute in to modify right below. You mentioned above it doesn't know the entry size, but IIUC, that's what we have entry.memoryUsage
for.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have a patch for this but I'm not confident that it's a small change. Is it ok if I do a follow up PR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SGTM.
blockchain/utxocache.go
Outdated
entry := entries[0] | ||
if stxos != nil { | ||
// Populate the stxo details using the utxo entry. | ||
var stxo = SpentTxOut{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why var
vs the normal :=
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Addressed
// be updated to append an entry for each spent txout. An error will be returned | ||
// if the cache and the database does not contain the required utxos. | ||
func (s *utxoCache) connectTransaction( | ||
tx *btcutil.Tx, blockHeight int32, stxos *[]SpentTxOut) error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why have a pointer to the slice passed in which is mutated each time, vs just returning the set of UTXOs spent due to the transaction? Is the goal to reduce memory churn my pre-allocating stxos
ahead of time? Current usage made the control flow a bit harder to follow IMO.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a good point. There wasn't much thought that went into this besides just re-using the function prototype for connectTransaction
that was in the current codebase. I'm assuming that was the intention when previous authors wrote this.
blockchain/utxocache.go
Outdated
utxoBucket := dbTx.Metadata().Bucket(utxoSetBucketName) | ||
for i := range s.cachedEntries.maps { | ||
for outpoint, entry := range s.cachedEntries.maps[i] { | ||
// If the entry is nil or spent, remove the entry from the database |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Style nit: delete(s.cachedEntries.maps[i], outpoint)
appears 3 times in the loop, we can reduce it to one:
switch {
case entry == nil || entry.IsSpent():
err := dbDeleteUtxoEntry(utxoBucket, outpoint)
if err != nil {
return err
}
case !entry.isModified():
default:
// Entry is fresh and needs to be put into the database.
err := dbPutUtxoEntry(utxoBucket, outpoint, entry)
if err != nil {
return err
}
}
delete(s.cachedEntries.maps[i], outpoint)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Addressed
// get changed during the execution of this method. | ||
func (b *BlockChain) InitConsistentState(tip *blockNode, interrupt <-chan struct{}) error { | ||
s := b.utxoCache | ||
// Load the consistency status from the database. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Style nit: missing space above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Addressed
This change is part of the effort to add utxocache support to btcd. sizehelper introduces code for 2 main things: 1: Calculating how many entries to allocate for a map given a size in bytes. 2: Calculating how much a map takes up in memory given the entries were allocated for the map. These functionality are useful for allocating maps so that they'll be allocating below a certain number of bytes. Since go maps will always allocate in powers of B (where B is the bucket size for the given map), it may allocate too much memory. For example, for a map that can store 8GB of entries, the map will grow to be 16GB once the map is full and the caller puts an extra entry onto the map. If we want to give a memory guarantee to the user, we can either: 1: Limit the cache size to fixed sizes (4GB, 8GB, ...). 2: Allocate a slice of maps. The sizehelper code helps with (2).
4fce9c0
to
557c4b3
Compare
Rebased and addressed the comments from roasbeef |
This change is part of the effort to add utxocache support to btcd. mapslice allows the caller to allocate a fixed amount of memory for the utxo cache maps without the mapslice going over that fixed amount of memory. This is useful as we can have variable sizes (1GB, 1.1GB, 2.3GB, etc) while guaranteeing a memory limit.
This change is part of the effort to add utxocache support to btcd. The fresh flag indicates that the entry is fresh and that the parent view (the database) hasn't yet seen the entry. This is very useful as a performance optimization for the utxo cache as if a fresh entry is spent, we can simply remove it from the cache and don't bother trying to delete it from the database.
This change is part of the effort to add utxocache support to btcd. Getting the memory usage of an entry is very useful for the utxo cache as we need to know how much memory all the cached entries are using to guarantee a cache usage limit for the end user.
This change is part of the effort to add utxocache support to btcd. The utxoStateConsistency indicates what the last block that the utxo cache got flush at. This is useful for recovery purposes as if the node is unexpectdly shut down, we know which block to start rebuilding the utxo state from.
This change is part of the effort to add utxocache support to btcd. dbPutUtxoView handled putting and deleting new/spent utxos from the database. These two functinalities are refactored to their own functions: dbDeleteUtxoEntry and dbPutUtxoEntry. Refactoring these out allows the cache to call these two functions directly instead of having to create a view and saving that view to disk.
This change is part of the effort to add utxocache support to btcd. connectBlock may have an empty utxoviewpoint as the block verification process may be using the utxo cache directly. In that case, a nil utxo viewpoint will be passed in. Just return early on a nil utxoviewpoint.
This change is part of the effort to add utxocache support to btcd. Require the caller to pass in the utxoBucket as the caller may be fetching many utxos in one loop. Having the caller pass it in removes the need for dbFetchUtxoEntry to grab the bucket on every single fetch.
This change is part of the effort to add utxocache support to btcd. fetchInputUtxos had mainly 2 functions: 1: Figure out which outpoints to fetch 2: Call fetchUtxosMain to fetch those outpoints Functionality for (1) is refactored out to fetchInputsToFetch. This is done to allow fetchInputUtxos to use the cache to fetch the outpoints as well in a later commit.
The implemented utxocache implements connectTransactions just like utxoviewpoint and can be used as a drop in replacement for connectTransactions. One thing to note is that unlike the utxoViewpoint, the utxocache immediately deletes the spent entry from the cache. This means that the utxocache is unfit for functions like checkConnectBlock where you expect the entry to still exist but be marked as spent. disconnectTransactions is purposely not implemented as using the cache during reorganizations may leave the utxo state inconsistent if there is an unexpected shutdown. The utxoViewpoint will still have to be used for reorganizations.
This change is part of the effort to add utxocache support to btcd. utxo cache is now used by the BlockChain struct. By default it's used and the minimum cache is set to 250MiB. The change made helps speed up block/tx validation as the cache allows for much faster lookup of utxos. The initial block download in particular is improved as the db i/o bottleneck is remedied by the cache.
PruneBlocks used to delete files immediately before the database transaction finished. By making the prune atomic, we can guarantee that the database flush will happen before the utxo cache is flushed, ensuring that the utxo cache is never in an irrecoverable state.
flushNeededAfterPrune returns true if the utxocache needs to be flushed after the pruning of the given slice of block hashes. For the utxo cache to be recoverable while pruning is enabled, we need to make sure that there exists blocks since the last utxo cache flush. If there are blocks that are deleted after the last utxo cache flush, the utxo set is irrecoverable. The added method provides a way to tell if a flush is needed.
If the prune will delete block past the last flush hash of the utxocache, the cache will need to be flushed first to avoid a case where the utxocache is irrecoverable. The newly added code adds this flush logic to connectBlock.
The testing function in export_test.go is changed to just export.go so that callers outside the ffldb package will be able to call the function. The main use for this is so that the prune code can be triggered from the blockchain package. This allows testing code to have less than 1.5GB worth of blocks to trigger the prune.
557c4b3
to
87a81f1
Compare
All fixup commits squashed now in the latest push. |
Updated the PR and is now very different from #1373.
Main differences are:
No utxoviewpoint middleman during headers-first sync
In #1373, the view is still used as a middleman. The flow for doing block validation would be:
The problems with these approach were:
(2) is extremely expensive as there a lot of memory allocated by creating a whole new set of key/value pairs of the existing UTXOs.
(4) introduces a new code path from [view -> cache] and in turn introduces a whole new set of bugs.
The new code is different where the flow during headers-first (blocks up until the checkpoint) is:
This flow is much simpler and faster than the previous code.
For when the node is not in headers-first mode:
This flow still has the overhead of (2) but no longer has the new bugs introduced by applying the transaction connections from [view -> cache]. Instead, the view is only used as a temporary mutable map just for checkConnectBlock and the txs are connected in cache just like in headers-first mode.
No use of the cache during reorganizations
During reorganizations, a whole new possible bug was introduced where if the node is unexpectedly shutdown, the disconnected txs may only have been updated in the cache and not in the database. The new code forces a flush before the reorganization happens and keeps the same code as before, eliminating the new case of bugs introduced in #1373.
This allowed the removal of
TestUtxoCache_Reorg
and all the associated (and duplicated) code.No more utxoStateConsistency struct
It was enough to just save the block hash and not the single byte indicating if the cache is consistent or not. All the related code was removed.
Map slice instead of a singular map
The singular map doubles in size when the map gets full. For bigger maps, this may result in us going over the promised memory limit. The mapslice allows us to allocate multiple maps with differing sizes, allowing acute control of the memory allocated memory.
Another solution to this would be to just force the user to only allow fixed amounts of memory for the utxo cache (1GB, 2GB, 4GB, 8GB, etc). This would get rid of the code in
sizehelper.go
and mapslice related code inutxocache.go
but wouldn't allow the user flexibility. I'm ok with either route. This would get rid of 491 lines of code from being added.Known issue
During a flush, there's a large allocation of memory due to the fact that
database/ffldb
will hold onto every single bucket write to guarantee atomicity. It'll usually result in 2x the memory from what the cache is using. If the cache is using up 2GB, ffldb will allocate 2GB as well, resulting in 4GB of memory being taken up by btcd. I don't think there's any easy way around this other than making a different code path for utxo cache flushes.