-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
move to native badger blockstore; leverage zero-copy View() to deserialize in-place #4681
Conversation
node/repo/blockstore_opts.go
Outdated
// Reduce this from 64MiB to 16MiB. That means badger will hold on to | ||
// 20MiB by default instead of 80MiB. This does not appear to have a | ||
// significant performance hit. | ||
opts.MaxTableSize = 16 << 20 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The issue with this is the number of TableFiles that will be created. Lotus for a while now run with default from badger (64M) instead of 20M because of that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Our badger2 datastore adapter set this value here:
But it turns out we take those defaults and then entirely override them here:
Line 23 in 6cdbf52
opts.Options = dgbadger.DefaultOptions("").WithTruncate(true). |
This also means that we lose other defaults that made sense, e.g. ValueLogLoadingMode = FileIO instead of mmap, because we reverted to badger defaults:
https://pkg.go.dev/github.com/dgraph-io/badger#Options.WithValueLogLoadingMode
This could explain some things...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Restored the MaxTableSize value here: 0b8a21e.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't find the ValueLogLoadingMode = FileIO
having any benefit (and the results that suggested it are super old) so unless you have some results confirminig it, I would stay with dgraph/badger defaults.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FileIO
sabotages the benefits of View()
, so we're back to mmap.
02afe46
to
842c8ca
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SGWM but lack of support of View from CachingBS and IDWrapperBS will stop us from using the View.
@Kubuxu is right, and this is evident in profiles. This PR is incomplete until I add the |
0182d1b
to
38c404e
Compare
bs.chargeGas(bs.pricelist.OnIpldGet()) | ||
return v.View(c, func(b []byte) error { | ||
// we have successfully retrieved the value; charge for it, even if the user-provided function fails. | ||
bs.chargeGas(newGasCharge("OnIpldViewEnd", 0, 0).WithExtra(len(b))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These two are only signaling charges, they have no consensus importantce so it is fine if they are inside the View.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, just 1 comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Approved magiks suggestion
This PR introduces a native badger blockstore that bypasses the datastore abstraction, thus removing layers of indirection, overhead, and allocs from one of the hottest paths in the system.
The native badger blockstore implements the
View
method, which allows us to directly access an mmap value to deserialize in-place, thus avoiding unnecessary byte slice allocs and copies. See ipfs/go-ipld-cbor#75 for related change.All usages of the chain datastore have been updated to use the chain blockstore. Low-level commands like state-prune have been adjusted to operate on badger directly.
TODO/decide
Future: do we want to separate the state and the blockstores?