logical segregation of blockstores + freecache-cached chain and state blockstores #4771

Closed
wants to merge 14 commits into from

@raulk (Member) commented Nov 9, 2020

Fixes #4752.

This PR segregates the monolithic blockstore into two logical blockstores: the chain and the state blockstore.

Currently they're backed by the same physical store, but the abstractions introduced here allow us to front the chain and the state blockstores with different caches, each tuned to the access pattern of the data it holds. They also pave the way for the upcoming physical segregation.
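A minimal sketch of the layering, using only the standard library. Every name here (`store`, `mapStore`, `cachedStore`, `newCachedStore`, `hitRatio`) is hypothetical and is not a type from this PR; a bounded map without eviction stands in for the real cache, and the sizes are arbitrary.

```go
package main

import "fmt"

// store is a hypothetical, simplified blockstore: string keys, byte values.
type store interface {
	Get(key string) ([]byte, bool)
	Put(key string, val []byte)
}

// mapStore stands in for the single physical store shared by both views.
type mapStore struct {
	data  map[string][]byte
	reads int // reads that reached the physical store
}

func newMapStore() *mapStore { return &mapStore{data: make(map[string][]byte)} }

func (m *mapStore) Get(key string) ([]byte, bool) {
	m.reads++
	v, ok := m.data[key]
	return v, ok
}

func (m *mapStore) Put(key string, val []byte) { m.data[key] = val }

// cachedStore is a logical view over a backing store with its own cache.
// A bounded map stands in for the real cache; there is no eviction here.
type cachedStore struct {
	backing      store
	cache        map[string][]byte
	maxEntries   int
	hits, misses int
}

func newCachedStore(backing store, maxEntries int) *cachedStore {
	return &cachedStore{backing: backing, cache: make(map[string][]byte), maxEntries: maxEntries}
}

func (c *cachedStore) Get(key string) ([]byte, bool) {
	if v, ok := c.cache[key]; ok {
		c.hits++
		return v, true
	}
	c.misses++
	v, ok := c.backing.Get(key)
	if ok && len(c.cache) < c.maxEntries {
		c.cache[key] = v
	}
	return v, ok
}

// Put writes through to the backing store; the cache fills on reads only.
// Keys are assumed to be content hashes, so a cached value never goes stale.
func (c *cachedStore) Put(key string, val []byte) { c.backing.Put(key, val) }

// hitRatio returns hits / (hits + misses), or 0 before any lookup.
func (c *cachedStore) hitRatio() float64 {
	total := c.hits + c.misses
	if total == 0 {
		return 0
	}
	return float64(c.hits) / float64(total)
}

func main() {
	physical := newMapStore()
	chain := newCachedStore(physical, 1024) // sizes are arbitrary
	state := newCachedStore(physical, 4096)

	chain.Put("header", []byte("h"))
	state.Put("actor", []byte("s"))
	for i := 0; i < 4; i++ {
		chain.Get("header")
		state.Get("actor")
	}
	fmt.Println(chain.hitRatio(), state.hitRatio(), physical.reads)
}
```

Both views share `physical`, so moving to physically separate stores later would only mean passing a different backing store to each constructor.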

The caching is based on Freecache, a near-LRU cache. Given our access patterns, pure LRU is estimated to behave better than ARC/2Q. We use a fork of Freecache (maintained by me) that enables zero-copy access to cached values. The total footprint of the caching layer is 416MiB.
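For readers unfamiliar with the eviction policy and the zero-copy idea, here is a strict LRU built on `container/list`. This is not Freecache and not the fork's API; Freecache is implemented differently and is only approximately LRU. The names `lru`, `newLRU`, `Set` and `View` are hypothetical, and capacity is counted in entries rather than bytes.

```go
package main

import (
	"container/list"
	"fmt"
)

type entry struct {
	key string
	val []byte
}

// lru is a strict LRU cache bounded by entry count (capacity must be >= 1).
type lru struct {
	capacity int
	order    *list.List // front = most recently used
	items    map[string]*list.Element
}

func newLRU(capacity int) *lru {
	return &lru{capacity: capacity, order: list.New(), items: make(map[string]*list.Element)}
}

// View hands the cached slice to fn without copying it.
// fn must not retain or modify the slice after it returns.
func (c *lru) View(key string, fn func([]byte)) bool {
	el, ok := c.items[key]
	if !ok {
		return false
	}
	c.order.MoveToFront(el)
	fn(el.Value.(*entry).val)
	return true
}

// Set inserts or updates a value and evicts the least recently used
// entry when the cache grows past its capacity.
func (c *lru) Set(key string, val []byte) {
	if el, ok := c.items[key]; ok {
		el.Value.(*entry).val = val
		c.order.MoveToFront(el)
		return
	}
	c.items[key] = c.order.PushFront(&entry{key: key, val: val})
	if c.order.Len() > c.capacity {
		oldest := c.order.Back()
		c.order.Remove(oldest)
		delete(c.items, oldest.Value.(*entry).key)
	}
}

func main() {
	c := newLRU(2)
	c.Set("a", []byte("1"))
	c.Set("b", []byte("2"))
	c.View("a", func([]byte) {}) // touch "a"; "b" is now least recently used
	c.Set("c", []byte("3"))      // evicts "b"

	for _, k := range []string{"a", "b", "c"} {
		fmt.Println(k, c.View(k, func([]byte) {}))
	}
}
```

The callback style of `View` is what makes zero-copy reads possible: the caller works on the cache's own buffer for the duration of the call instead of receiving a copy.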

The ARC cache is no longer used, and the deprecated CachingBlockstore has been removed.

This PR also introduces a LotusBlockstore interface. It is the union of the Blockstore and Viewer interfaces, and is required by the chain and state blockstores. This way, we remove the optionality of Viewer and simplify the code.
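A sketch of what requiring the union buys us. The interfaces below are simplified stand-ins with string keys, not the real signatures; `memStore`, `blockLen` and `errNotFound` are hypothetical names used only for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

var errNotFound = errors.New("block not found")

// Blockstore is a simplified stand-in for the standard blockstore interface.
type Blockstore interface {
	Get(key string) ([]byte, error)
	Put(key string, data []byte) error
}

// Viewer gives callback-based, zero-copy access to a stored value.
type Viewer interface {
	View(key string, fn func(data []byte) error) error
}

// LotusBlockstore is the union of both interfaces.
type LotusBlockstore interface {
	Blockstore
	Viewer
}

// memStore is a toy in-memory implementation.
type memStore struct{ blocks map[string][]byte }

// Compile-time check that memStore satisfies the union.
var _ LotusBlockstore = (*memStore)(nil)

func (m *memStore) Get(key string) ([]byte, error) {
	b, ok := m.blocks[key]
	if !ok {
		return nil, errNotFound
	}
	// Get returns a copy so callers cannot mutate the stored bytes.
	return append([]byte(nil), b...), nil
}

func (m *memStore) Put(key string, data []byte) error {
	m.blocks[key] = data
	return nil
}

func (m *memStore) View(key string, fn func([]byte) error) error {
	b, ok := m.blocks[key]
	if !ok {
		return errNotFound
	}
	return fn(b) // no copy: fn sees the stored slice directly
}

// blockLen takes a LotusBlockstore, so it can call View directly instead
// of type-asserting for an optional Viewer and falling back to Get.
func blockLen(bs LotusBlockstore, key string) (int, error) {
	n := 0
	err := bs.View(key, func(b []byte) error {
		n = len(b)
		return nil
	})
	return n, err
}

func main() {
	bs := &memStore{blocks: make(map[string][]byte)}
	_ = bs.Put("k", []byte("hello"))
	n, err := blockLen(bs, "k")
	fmt.Println(n, err)
}
```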

This PR also removes block size caching (Lotus never calls blockstore.GetSize).

Benchmark

  1. Imported minimal_finality_stateroots_latest.car
  2. Ran for 1000 block validations.
  3. Took a metrics dump.

Note that these comparisons put the cached version at a disadvantage, because the store is still light. As the store grows, the benefits of the cache are much higher.

Hit rate

lotus_blockstore_cache_hit_ratio{cache_name="chain_block_cache"} 0.8236359865598997
lotus_blockstore_cache_hit_ratio{cache_name="chain_exists_cache"} 0.789787058767283
lotus_blockstore_cache_hit_ratio{cache_name="state_block_cache"} 0.6361207867057612
lotus_blockstore_cache_hit_ratio{cache_name="state_exists_cache"} 0.765091942682962

Block validation times

Before

Average: 1470.86 ms (91% validated under 2000 ms)

...
lotus_block_validation_ms_bucket{le="200"} 0
lotus_block_validation_ms_bucket{le="250"} 0
lotus_block_validation_ms_bucket{le="300"} 0
lotus_block_validation_ms_bucket{le="400"} 0
lotus_block_validation_ms_bucket{le="500"} 0
lotus_block_validation_ms_bucket{le="650"} 0
lotus_block_validation_ms_bucket{le="800"} 47
lotus_block_validation_ms_bucket{le="1000"} 129
lotus_block_validation_ms_bucket{le="2000"} 923
lotus_block_validation_ms_bucket{le="5000"} 1001
lotus_block_validation_ms_bucket{le="10000"} 1001
lotus_block_validation_ms_bucket{le="20000"} 1004
lotus_block_validation_ms_bucket{le="50000"} 1004
lotus_block_validation_ms_bucket{le="100000"} 1004
lotus_block_validation_ms_bucket{le="+Inf"} 1004
lotus_block_validation_ms_sum 1.4767458445909997e+06
lotus_block_validation_ms_count 1004 

After

As you can see, the distribution has changed. Average: 1266.59 ms (-14%); 95% validated under 2000 ms.

lotus_block_validation_ms_bucket{le="300"} 0
lotus_block_validation_ms_bucket{le="400"} 0
lotus_block_validation_ms_bucket{le="500"} 8
lotus_block_validation_ms_bucket{le="650"} 61
lotus_block_validation_ms_bucket{le="800"} 144
lotus_block_validation_ms_bucket{le="1000"} 323
lotus_block_validation_ms_bucket{le="2000"} 956
lotus_block_validation_ms_bucket{le="3000"} 1003
lotus_block_validation_ms_bucket{le="4000"} 1003
lotus_block_validation_ms_bucket{le="5000"} 1003
lotus_block_validation_ms_bucket{le="7500"} 1003
lotus_block_validation_ms_bucket{le="10000"} 1003
lotus_block_validation_ms_bucket{le="20000"} 1008
lotus_block_validation_ms_bucket{le="50000"} 1008
lotus_block_validation_ms_bucket{le="100000"} 1008
lotus_block_validation_ms_bucket{le="+Inf"} 1008
lotus_block_validation_ms_sum 1.2767256901239988e+06
lotus_block_validation_ms_count 1008
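The averages and percentages quoted above can be rederived from the dumps: the mean is `_sum / _count`, and the share validated under 2000 ms is the cumulative `le="2000"` bucket divided by `_count`. The sketch below does that arithmetic; the `summary` type and its method names are hypothetical, and the constants are copied from the two dumps.

```go
package main

import "fmt"

// summary holds the three histogram values needed for the comparison.
type summary struct {
	sum       float64 // lotus_block_validation_ms_sum
	count     float64 // lotus_block_validation_ms_count
	under2000 float64 // cumulative bucket le="2000"
}

// mean returns the average validation time in milliseconds.
func (s summary) mean() float64 { return s.sum / s.count }

// shareUnder2000 returns the fraction of validations that took <= 2000 ms.
func (s summary) shareUnder2000() float64 { return s.under2000 / s.count }

func main() {
	before := summary{sum: 1.4767458445909997e+06, count: 1004, under2000: 923}
	after := summary{sum: 1.2767256901239988e+06, count: 1008, under2000: 956}

	// Relative change of the mean, negative when validation got faster.
	change := (after.mean() - before.mean()) / before.mean()

	fmt.Printf("before: mean %.2f ms, %.1f%% under 2000 ms\n",
		before.mean(), 100*before.shareUnder2000())
	fmt.Printf("after:  mean %.2f ms, %.1f%% under 2000 ms\n",
		after.mean(), 100*after.shareUnder2000())
	fmt.Printf("change in mean: %.1f%%\n", 100*change)
}
```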

GC/allocs stats

Before

go_gc_duration_seconds_sum 0.044846669
go_gc_duration_seconds_count 159
lotus_process_total_memory_alloc 8.691073448e+10
process_resident_memory_bytes 9.098326016e+09

After

go_gc_duration_seconds_sum 0.013976411
go_gc_duration_seconds_count 86
lotus_process_total_memory_alloc 7.444904292e+10
process_resident_memory_bytes 9.624408064e+09

TODO

  • Unit tests for the cache.
  • A final round of live testing.

@raulk raulk force-pushed the caching-blockstore branch from 4d38e94 to 3ef260f Compare November 9, 2020 11:56
@raulk raulk changed the title from "ristretto-cached blockstore" to "freecache-cached blockstore" Nov 10, 2020
@raulk raulk force-pushed the caching-blockstore branch 2 times, most recently from c383524 to fe01fb4 Compare November 13, 2020 18:37
This commit introduces FreecacheCachingBlockstore, a caching-façade
for blockstores. It can be mounted on any blockstore using
blockstore.WrapFreecacheCache.

We use a fork of freecache that supports zero-copy value
access.

This commit introduces ChainBlockstore and StateBlockstore,
two logical abstractions on top of the current monolith blockstore.

Each of them is backed by a different cache configuration, picked
through experimentation, which leads to ~70-80% hit ratios during
catch-up sync. I expect these hit ratios to carry over to live sync.

The ARC caches have been removed. The footprint of the new caches is
432MiB.

The underlying store is exposed as BareMonolithBlockstore. This store
is now used in external-facing components via ExposedBlockstore:
Bitswap, Graphsync, and JSON-RPC. This improves security, because
external actors cannot influence our caching decisions.

This commit also removes the optionality of the Viewer interface on
state and chain blockstores. The new type blockstore.LotusBlockstore
is the union of Blockstore + Viewer, and is now required in most places.
@raulk raulk force-pushed the caching-blockstore branch from c21da23 to d0b5d66 Compare November 15, 2020 22:26
@raulk raulk changed the title from "freecache-cached blockstore" to "logical segregation of blockstores + freecache-cached chain and state blockstores" Nov 15, 2020
@raulk raulk force-pushed the caching-blockstore branch from 7b42f33 to 587851a Compare November 15, 2020 22:46
Comment on lines 49 to 55
// LotusBlockstore is a standard blockstore enhanced with a view operation
// (zero-copy access to values), and potentially with cache management
// operations, or others, in the future.
type LotusBlockstore interface {
	Blockstore
	Viewer
}
Contributor

Could we drop the alias below, and call this Blockstore? Would let us avoid all the renaming everywhere.

Member Author

That's what I did initially, but I reverted because it causes ambiguity in those places that still take a standard Blockstore (e.g. Graphsync and others). We would end up with two blockstore.Blockstore interfaces that are different because one embeds the other...

Contributor

We can still alias that blockstore here, just as something like BasicBlockstore (to relay the fact that the non-viewer blockstore is not the one that should be used by default).

Member Author

@raulk raulk Nov 16, 2020

Fixed in 66336b9.

@raulk raulk force-pushed the caching-blockstore branch from eae3b9a to 4dadb11 Compare November 16, 2020 16:51
@raulk raulk force-pushed the caching-blockstore branch from 4dadb11 to 672669e Compare November 16, 2020 17:25
@raulk raulk marked this pull request as ready for review November 16, 2020 19:44
@raulk raulk requested a review from magik6k November 16, 2020 22:41
Contributor

@vyzo vyzo left a comment

first pass looks ok, but there are bound to be some bugz in this monster...

lib/blockstore/cache_freecache.go (resolved)
@vyzo
Contributor

vyzo commented Nov 30, 2020

@raulk this has developed conflicts...

@raulk
Member Author

raulk commented Dec 3, 2020

@vyzo they're tiny and I believe unrelated to the work you're doing. Let's resolve them when we have something to merge.

@vyzo
Contributor

vyzo commented Jan 28, 2021

absorbed into #4992

Development

Successfully merging this pull request may close these issues.

chain/state store improvements: introduce a block cache and optimise existing 'has' cache
3 participants