This repository has been archived by the owner on Jun 20, 2024. It is now read-only.

feat: blockstore cache #15

Merged
merged 5 commits into from
Feb 10, 2023
Conversation

aschmahmann
Contributor

Wanted to demonstrate a global in-memory LRU blockstore here. This is one of many possible options.

It also shows how we could use an Exchange interface, rather than a blockstore one, for Caboose, which would mean we wouldn't have the problems with Cache-Control: only-if-cached that we have now.

This is a global LRU (well, 2Q, which tracks both frequently and recently used items). If we want a per-request cache we can do that too, by making the exchange one that supports sessions and starting a new session for every request. If we did that, we'd probably pass an empty blockstore to the blockservice so that Cache-Control: only-if-cached always fails but everything else still works.

@hacdias @lidel @willscott

@willscott
Collaborator

(I'll move the per-request version over here as well once i can make branches. waiting on ipfs/github-mgmt#113)

@hacdias hacdias force-pushed the initial-iteration branch 4 times, most recently from a701342 to c912625 Compare February 7, 2023 15:08
Base automatically changed from initial-iteration to main February 8, 2023 04:02
@hacdias hacdias force-pushed the feat/lru-blockstore branch from c797c29 to 86b0ded Compare February 8, 2023 10:31
@ipfs-inactive ipfs-inactive deleted a comment from welcome bot Feb 8, 2023
@hacdias hacdias force-pushed the feat/lru-blockstore branch from 86b0ded to 35cc531 Compare February 8, 2023 10:34
@lidel
Collaborator

lidel commented Feb 9, 2023

@hacdias mind taking a look at pushing this over the finish line to help with #21?

Multiple HAMT blocks need to be fetched every time before we can render bafybeiggvykl7skb2ndlmacg2k5modvudocffxjesexlod2pfvg5yhwrqm or find the CID of /ipfs/bafybeiaysi4s6lnjev27ln5icwm6tueaw2vdykrtjkwiphwekaywqhcjze/wiki/index.html, so any cache would help at this point :) we can add a per-request one later.

While at it, let's add metrics that track how efficient the 2Q cache is (perhaps starting with the HIT vs MISS ratio based on TwoQueueCache.Contains?). Thoughts welcome on what we could measure here to inform next steps.

@aschmahmann
Contributor Author

> Thoughts welcome on what we could measure here to inform next steps.

Measuring the number of hits we get on the cache (per unit of time) is definitely a good start. Additionally it might be useful to have a histogram of the sizes of the elements that are hits vs misses.

This whole cache is in memory and has a fixed size; we can jack up the size to whatever seems appropriate. Unfortunately, however, the cache is bounded by number of elements rather than by total size, which means we might have to be more conservative than we'd want here (max block size is 2MiB if we want to be very careful, although we don't have to be).

@hacdias hacdias force-pushed the feat/lru-blockstore branch 2 times, most recently from f9d722a to 22bdcad Compare February 9, 2023 14:43
@hacdias
Collaborator

hacdias commented Feb 9, 2023

I added two metrics: one for the cache hits and one for the cache requests. This should allow us to calculate the ratio over time using Grafana or something else. I'm unsure if NotifyNewBlocks needs to be implemented. I also kept the name generic, without mentioning the LRU, since the cache type may change in the future. I think we could perhaps merge this and iterate on improving it.

@hacdias hacdias requested a review from lidel February 9, 2023 14:45
@hacdias hacdias marked this pull request as ready for review February 9, 2023 14:45
@hacdias hacdias changed the title feat: lru blockstore feat: blockstore cache Feb 9, 2023
@hacdias hacdias force-pushed the feat/lru-blockstore branch from 22bdcad to 27ae51f Compare February 9, 2023 15:12
Collaborator

@lidel lidel left a comment


Clicked around /ipfs/bafybeiaysi4s6lnjev27ln5icwm6tueaw2vdykrtjkwiphwekaywqhcjze/wiki/ a bit and cache hit rate looks promising:

ipfs_http_blockstore_cache_hit 2711
ipfs_http_blockstore_cache_requests 4082

@aschmahmann I tried to understand the purpose of NotifyNewBlocks but docs/examples are scarce. Is it used only in the context of making blocks added to the local blockstore via ipfs add available to ongoing bitswap sessions(?)

blockstore_cache.go: outdated review thread (resolved)
@hacdias hacdias force-pushed the feat/lru-blockstore branch from 52df4bc to f1f69d6 Compare February 10, 2023 10:14
@hacdias hacdias force-pushed the feat/lru-blockstore branch from f1f69d6 to 8505475 Compare February 10, 2023 11:52
@hacdias hacdias force-pushed the feat/lru-blockstore branch from 8505475 to 97b8ea5 Compare February 10, 2023 11:53
@hacdias
Collaborator

hacdias commented Feb 10, 2023

Since this PR includes other things, please do Rebase and Merge instead of Squash to preserve history.

blockstore.go Outdated
Comment on lines 52 to 55

```go
func (e *exchangeBsWrapper) NotifyNewBlocks(ctx context.Context, blks ...blocks.Block) error {
	// TODO: implement me
	return nil
}
```
Contributor Author


> tried to understand the purpose of NotifyNewBlocks but docs/examples are scarce. Is it used only in the context of making blocks added to local blockstore via ipfs add available to ongoing bitswap sessions(?)

It should be fine to just return nil here.

IIUC NotifyNewBlocks serves two purposes:

  1. Let the server know it has blocks that are available to be served (i.e. if you ask a Bitswap server for a block and it later receives that block, it can send it to you without you asking again). These blocks could come from an exchange fetch (go-bitswap does not do Puts into the blockstore itself; that happens in go-blockservice) or from an external source (e.g. ipfs add).
  2. Let the client know that a block it was otherwise waiting for has arrived.
    • This doesn't really need to exist, but if the exchange was waiting for a block that came in from somewhere else (e.g. a CAR file import), then the call would either need to be cancelled and handled by a higher layer, like the blockservice, or the block would need to get threaded through the call stack. This enables the latter choice.
    • I suspect this case is covered more as an accident of needing to handle the first one, combined with go-bitswap historically mixing client and server elements.

Contributor Author


Note: the reason we can return nil here isn't so much that it's a good idea, but that this whole wrapper shouldn't exist: caboose should just be an exchange. In that context, caboose would need to decide whether it wants to be smart enough to deal with being asked for the same block twice in close succession.

My suspicion is that they won't care, because further down the road we're going to be dealing with graphs rather than blocks here anyway.

Collaborator

@lidel lidel left a comment


Thanks!

I've added an optional --block-cache-size flag in case anyone wants to play with bigger values without recompiling. If not passed, we use 1024 as the default for now.

@lidel lidel merged commit 42dd3fb into main Feb 10, 2023
@lidel lidel deleted the feat/lru-blockstore branch February 10, 2023 21:44