Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

LRU Cache for Boxes/KVs #4275

Merged
merged 23 commits into from
Aug 18, 2022
Merged

Conversation

algoidurovic
Copy link
Contributor

@algoidurovic algoidurovic commented Jul 20, 2022

Summary

This KV cache is intended to nearly exactly replicate the resources cache, in lieu of generics. Included are also the box database benchmark tests that were the original motivation for implementing a cache. One of the size limits of the cache (baseKVPendingAccountsBufferSize = 50,000) is informed by the maximum box size (~32KB) and the baseResourcesPendingAccountsBufferSize = 100,000 where the maximum resource size is ~16KB. The other limitation on cache size is len(au.kvStore): the number of KV modifications in the state deltas. While the theoretical maximum memory usage is quite high, this is an existing problem with the current cache implementations and should be addressed in another PR.

Test Plan

The cache and underlying data structures are tested in unit tests.

@algoidurovic algoidurovic requested a review from jannotti July 20, 2022 16:50
@@ -2567,12 +2576,7 @@ func (qs *accountsDbQueries) listCreatables(maxIdx basics.CreatableIndex, maxRes
return
}

type persistedValue struct {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just renamed and moved this but I don't mind undoing it if there's an objection.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@algoidurovic I think the rename touches on a central question: What is box-specific about the cache?

I realize the question prompts a potentially large renaming exercise, but I'm wondering if the cache ought to be kv themed rather than box themed.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, I think it should. Anything below the ledger tries to talk about kv rather than about boxes. But I don't think it changes much code.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@algoidurovic From my perspective, based on renames to KV, the thread is now resolved. Since I didn't open it, I'm leaving it open for you to resolve.

@@ -387,7 +397,13 @@ func (au *accountUpdates) lookupKv(rnd basics.Round, key string, synchronized bo

persistedData, err := au.accountsq.lookupKeyValue(key)
if persistedData.round == currentDbRound {
return persistedData.value, nil
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Error handling has been slightly changed here, but seems innocent enough to me.

Copy link
Contributor

@ahangsu ahangsu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just some random questions, the implementation looks following previous implementation patterns, should be legit

@codecov
Copy link

codecov bot commented Aug 1, 2022

Codecov Report

Merging #4275 (24ff56e) into feature/avm-box (66f8e1c) will increase coverage by 0.01%.
The diff coverage is 87.25%.

@@                 Coverage Diff                 @@
##           feature/avm-box    #4275      +/-   ##
===================================================
+ Coverage            55.33%   55.35%   +0.01%     
===================================================
  Files                  403      405       +2     
  Lines                51406    51505      +99     
===================================================
+ Hits                 28448    28510      +62     
- Misses               20529    20558      +29     
- Partials              2429     2437       +8     
Impacted Files Coverage Δ
ledger/persistedresources_list.go 92.00% <ø> (ø)
ledger/tracker.go 73.93% <ø> (ø)
ledger/acctupdates.go 65.16% <50.00%> (-0.32%) ⬇️
ledger/persistedkvs.go 92.00% <92.00%> (ø)
ledger/lrukv.go 93.75% <93.75%> (ø)
ledger/accountdb.go 71.72% <100.00%> (+0.04%) ⬆️
network/wsPeer.go 65.47% <0.00%> (-2.20%) ⬇️
crypto/merkletrie/trie.go 66.42% <0.00%> (-2.19%) ⬇️
crypto/merkletrie/node.go 91.62% <0.00%> (-1.87%) ⬇️
catchup/service.go 68.14% <0.00%> (-1.24%) ⬇️
... and 2 more

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

Copy link
Contributor

@jannotti jannotti left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm mostly through, need to review the lruboxes.go file by diffing it with the other caches.

But I want to repeat my agreement with Michael. All these things should use "key value" language, not box language. At this level, they are key/value pairs, and it so happens they only have box data in them. But the lookup is for a key/value pair, not a box.

@@ -68,6 +68,15 @@ const baseResourcesPendingAccountsBufferSize = 100000
// is being flushed into the main base resources cache.
const baseResourcesPendingAccountsWarnThreshold = 85000

// baseBoxesPendingBufferSize defines the size of the base boxes pending buffer size.
// At the beginning of a new round, the entries from this buffer are being flushed into the base boxes map.
const baseBoxesPendingBufferSize = 5000
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@algoidurovic Can the inline comment be expanded (or a new one added) to summarize why we chose the configured value?

I realize the value is liable to change prior to merge. Consider the note a visual reminder for when we're ready to finalize the value.

@algoidurovic algoidurovic changed the title LRU Cache for Boxes LRU Cache for Boxes/KVs Aug 10, 2022
Copy link
Contributor

@michaeldiamant michaeldiamant left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@algoidurovic Thanks for your effort to implement + test a KV-specific cache implementation!

Approving with the understanding that we'll finalize configuration of baseKVPendingBufferSize out-of-band to the PR.

Copy link
Contributor

@jannotti jannotti left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some feedback, I'm going to go pull this down locally and diff the lru now, since github review isn't good for that.

@@ -4809,6 +4816,12 @@ func (prd *persistedResourcesData) before(other *persistedResourcesData) bool {
return prd.round < other.round
}

// before compares the round numbers of two persistedKVData and determines if the current persistedKVData
// happened before the other.
func (prd *persistedKVData) before(other *persistedKVData) bool {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since they are small objects, I might not use pointers for the receiver or the arg.

@@ -206,6 +215,9 @@ type accountUpdates struct {
// baseResources stores the most recently used resources, at exactly dbRound
baseResources lruResources

// baseKV stores the most recently used KV, at exactly dbRound
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd call it either baseKVs or baseKVStore, as it is a bunch of Key/Values. (right?)

@@ -387,7 +406,15 @@ func (au *accountUpdates) lookupKv(rnd basics.Round, key string, synchronized bo

persistedData, err := au.accountsq.lookupKeyValue(key)
if persistedData.round == currentDbRound {
return persistedData.value, nil
if err == nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pavel recently made a PR to change the error handling on the similar functions that would be good to copy here.

#4425

@@ -1300,7 +1335,7 @@ func (au *accountUpdates) lookupResource(rnd basics.Round, addr basics.Address,
// against the database.
persistedData, err = au.accountsq.lookupResources(addr, aidx, ctype)
if persistedData.round == currentDbRound {
if persistedData.addrid != 0 {
if err == nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this PR changing lookupResource?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch, I don't know how that got in there.

// where persistedData.value == nil to avoid unnecessary db lookups
// for deleted KVs.
au.baseKVs.writePending(persistedData, key)
return persistedData.value, err
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now that err is checked earlier, let's make this explicitly nil so it is obvious it's the good path.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

got it, fixed.

@algoidurovic algoidurovic merged commit 31037ed into algorand:feature/avm-box Aug 18, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants