Conversation
What does "chunks" mean here? You want to reconstruct the current state from only recent blocks? Is that possible? Are you talking not about a sync from nothing, but about a partial sync from a previously running node, which only grabs the parts of the blocks with state updates?
@burdges This implements a sync from scratch for the full state. Instead of executing all blocks and building the state locally, we just download a single full state, split into chunks. These are introduced at the network protocol level to prevent downloading too much unverified data. The network protocol works like this: The syncing node requests a sequence of trie keys for a recent block.
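For illustration only, a rough sketch of what such a chunked request/response could look like; the type and field names here are assumptions, not the actual network messages from this PR:

```rust
// Hypothetical shape of the chunked state-sync messages described above;
// names and fields are illustrative, not the actual protocol types.
type Hash = [u8; 32];

struct StateRequest {
    /// Recent block whose full state is being downloaded.
    block: Hash,
    /// Key to resume iteration from; empty for the first chunk.
    start_key: Vec<u8>,
}

struct StateResponse {
    /// Key/value pairs for this chunk, in lexicographic key order.
    entries: Vec<(Vec<u8>, Vec<u8>)>,
    /// Merkle proof tying the returned range to the block's state root,
    /// so the chunk can be verified before more data is requested.
    proof: Vec<Vec<u8>>,
    /// Set once the whole key space has been iterated.
    complete: bool,
}
```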
We do not care about what changed recently, right? We're just iterating over the whole state.

I confused this with another approach that uses an optimization for partially synced nodes: we first download the PoV data for a recent block B and update our previous state, checking all Merkle proofs. We also iterate over all Merkle proof copath elements c in the PoV block; if we already knew c in our previous state then we're done there, but if c is unfamiliar then we request a proof for c. At this point the server has collected the proof requests c, so it loops over ancestors B' = B-1, B-2, .., 0 of B, sends the proofs for any c requested in B', and aborts when no requested c remain. We then perform the same state update and request a new batch of c.

We cannot, afaik, do this per se, because nobody knows the old blocks or which state elements they changed. It adapts to whatever partial history we possess by sending all leaves under c whenever no more precise change history exists. I kinda doubt we require such optimizations for the partially synced case though, so likely what you're doing here makes the most sense, but they exist fyi.
At some point, we should explore an erasure coded version of this fast sync: once per day/week a parathread pauses state updates and produces blocks that scan through its whole state, much like what you're doing here but on-chain. All these blocks get erasure coded into the relay chain's availability store, so even if all of a parathread's nodes go offline, another parathread node could start up and pick up from the previous erasure coded state. I think your version here sounds best when the chain trusts its own nodes, like our relay chain. Yet, an erasure coded on-chain version like this helps run long lived system parathreads, upon whose continuation polkadot itself depends.
The use case here is to get a freshly started relay chain node up to the latest full state as quickly as possible. This is also required for storage chains, where nodes don't even keep the old blocks.
Right, although not at once. Iteration is split over multiple requests and peers, so no single peer has to iterate the whole state.
This may be prohibitively slow for a large chain. E.g. on Ethereum, iterating over the whole state trie takes hours. For partially synced states, another idea is to use some kind of set reconciliation protocol for the trie nodes.
Fine. We mostly want this for parathreads that either operate in a massively sharded pool or else move specific relay chain functionality off the relay chain. We'd need to balance performance constraints when selecting pool characteristics or similar. Another day.. :)
We'd scan down the tree from the root here, I guess, doing set reconciliation at each level. I described a scheme above that basically does set reconciliation using block numbers, up until we've pruned the blocks. We can explore this stuff if you think we need faster partial sync than simply scanning the whole state. I dunno..
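If it ever turns out to be needed, a very rough sketch of that top-down reconciliation idea (simplified node type and lookups, nothing from this PR):

```rust
// Top-down trie reconciliation sketch: compare child hashes with the peer
// level by level and only descend into subtrees whose hashes differ.
type NodeHash = [u8; 32];

struct BranchNode {
    /// Child subtree hashes, e.g. up to 16 entries for a hex trie.
    children: Vec<Option<NodeHash>>,
}

fn reconcile(
    local_node: &dyn Fn(&NodeHash) -> Option<BranchNode>,       // local lookup
    peer_children: &dyn Fn(&NodeHash) -> Vec<Option<NodeHash>>, // ask the peer
    root: NodeHash,
    to_download: &mut Vec<NodeHash>,
) {
    let Some(node) = local_node(&root) else {
        // We don't have this subtree at all: download it wholesale.
        to_download.push(root);
        return;
    };
    for (ours, theirs) in node.children.iter().zip(peer_children(&root)) {
        match (ours, theirs) {
            // Identical hashes: the whole subtree already matches.
            (Some(a), Some(b)) if *a == b => {}
            // Differing or missing locally: recurse one level deeper.
            (_, Some(b)) => reconcile(local_node, peer_children, b, to_download),
            // Peer has nothing here: nothing to fetch.
            (_, None) => {}
        }
    }
}
```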
I did a first read of the PR. Nice to see it.
The most useful point from my review is that I think we should not attach the keys alongside the proofs, and instead just extract them by iterating the proof (we save size and avoid a crafted bad response).
At this point I was also wondering about flagging this as experimental; not sure it is worth it? (I did not look for issues related to switching from block syncing to fast syncing or the other way around, or possible side effects with pruning or other unexpected races, so I am not overly confident, but that's on my side.)
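To make the "don't attach keys" idea above concrete, a minimal sketch of the client side; VerifiedRange and its methods are hypothetical stand-ins, not the actual sp-trie API:

```rust
// Sketch only: the keys are recovered by iterating the verified proof,
// so the response never carries a separate (unverified) key list.
trait VerifiedRange: Sized {
    /// Rebuild the trie fragment from the proof nodes and check it against
    /// the expected state root; a crafted or truncated proof fails here.
    fn from_proof(root: [u8; 32], proof: Vec<Vec<u8>>) -> Result<Self, String>;
    /// Key/value pairs contained in the verified fragment, in key order.
    fn entries(&self) -> Vec<(Vec<u8>, Vec<u8>)>;
}

fn import_chunk<R: VerifiedRange>(
    root: [u8; 32],
    proof: Vec<Vec<u8>>,
) -> Result<Vec<(Vec<u8>, Vec<u8>)>, String> {
    let range = R::from_proof(root, proof)?;
    // Everything we import came out of the proof itself: smaller messages,
    // and no way for a peer to smuggle in keys the proof does not cover.
    Ok(range.entries())
}
```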
primitives/state-machine/src/lib.rs
Outdated
let mut current_key = start.as_ref().to_vec();
let mut entries_size = 0;
let mut keys = Vec::new();
let proving_backend = proving_backend::ProvingBackend::<_, H>::new(trie_backend);
Could also use a trie iterator as in https://github.com/cheme/substrate/blob/cb792cf93e804e091f7a69b5bb1ac67fbfee3408/primitives/state-machine/src/lib.rs#L808 (actually next_storage_key does use a trie iterator internally).
But it seems correct to use it this way too (the proof would be identical).
primitives/state-machine/src/lib.rs
Outdated
entries_size += next_key.len();
keys.push(next_key.clone());
let proof_size = proving_backend.estimate_encoded_size();
if entries_size + proof_size > size_limit {
Note: if the keys are removed from the protocol, only proof_size is needed here.
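If the keys were dropped from the response, the chunking loop would only need to bound the recorded proof, roughly like this (a sketch reusing the names from the snippet above behind a stand-in trait):

```rust
// Stand-in for the proving backend used in the snippet above.
trait RangeProver {
    fn next_storage_key(&self, key: &[u8]) -> Option<Vec<u8>>;
    fn estimate_encoded_size(&self) -> usize;
}

/// Walk keys until the recorded proof alone reaches the size limit,
/// returning the key to resume from in the next request.
fn collect_chunk<B: RangeProver>(backend: &B, start: &[u8], size_limit: usize) -> Vec<u8> {
    let mut current_key = start.to_vec();
    while let Some(next_key) = backend.next_storage_key(&current_key) {
        current_key = next_key;
        // Each key read through the proving backend records the trie nodes
        // needed to prove it, so the proof grows as iteration advances.
        if backend.estimate_encoded_size() > size_limit {
            break;
        }
    }
    current_key
}
```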
👍 in general for doing this.
However, assuming that this PR is merged, and that nodes start using this to sync, then after a while the number of nodes that have block number 1 stored in their database will get lower and lower.
This isn't necessarily a problem in absolute terms, but if you were to try to sync from scratch you might have trouble finding a node that has block number 1 (and above). Right now we just assume that all full nodes know about all blocks, but in the future either we just don't care about people trying to sync the chain from scratch, or we need something a bit smarter.
Ideally, after syncing the state, there should be a background process that downloads old block bodies. For now, I'll change this PR to request and store block bodies (but not execute them) along with block headers during the first phase of the sync. It will still be fast, although it will require more bandwidth. But if this is combined with GRANDPA warp sync, where we don't even have all the headers, we'll either need a post-sync background download, or we'll have to rely on bootnodes or archive nodes to preserve full history. I should also note that in the case of storage chains, this will be the only way to sync. Block 1 is supposed to be forgotten by design.
The changes I made are in
223 MiB Compact vs 246 MiB Full. Could you maybe make a follow-up PR after this is merged? This one already has too many changes.
Sorry for the huge delay :(
pub enum SyncMode {
	Full,
	Fast,
	FastUnsafe,
While it doesn't support doc comments, you can just add some code comments explaining what this does.
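For example, something along these lines; the descriptions are only illustrative wording based on this discussion, not text from the PR:

```rust
/// Blockchain syncing mode (illustrative descriptions only).
pub enum SyncMode {
    /// Download and execute every block, building the state locally.
    Full,
    /// Download the latest state directly, verified against range proofs,
    /// then fetch the remaining block data without re-executing it.
    Fast,
    /// Like `Fast`, but without verifying the downloaded state against
    /// proofs, trusting the serving peers instead.
    FastUnsafe,
}
```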
@@ -125,6 +126,10 @@ pub struct NetworkParams {
	/// Join the IPFS network and serve transactions over bitswap protocol.
	#[structopt(long)]
	pub ipfs_server: bool,

	/// Blockchain syncing mode.
Or add some docs here about the variants, so that the user can understand them.
client/consensus/babe/src/lib.rs
Outdated
@@ -1275,7 +1277,12 @@ impl<Block, Client, Inner> BlockImport<Block> for BabeBlockImport<Block, Client,
	// early exit if block already in chain, otherwise the check for
	// epoch changes will error when trying to re-import an epoch change
	match self.client.status(BlockId::Hash(hash)) {
		Ok(sp_blockchain::BlockStatus::InChain) => return Ok(ImportResult::AlreadyInChain),
		Ok(sp_blockchain::BlockStatus::InChain) => {
			// When re-importing existing block strip away finality information.
Finality information?
client/db/src/lib.rs
Outdated
	storage: Storage,
) -> ClientResult<Block::Hash> {
	if storage.top.keys().any(|k| well_known_keys::is_child_storage_key(&k)) {
		return Err(sp_blockchain::Error::GenesisInvalid.into());
Wrong error?
Or should we rename this function? Below you also speak about genesis.
Ahh now I see, copy and paste :D So, yeah, we should remove the word "genesis" from this function.
backend.blockchain.update_meta(MetaUpdate {
	hash: info.finalized_hash,
	number: info.finalized_number,
	is_best: info.finalized_hash == info.best_hash,
Why is the requirement for this that finalized == best_hash?
finalized is the block we are writing metadata for. So if it is also the best block, is_best is set to true.
bot merge
Trying merge.
* State sync
* Importing state fixes
* Bugfixes
* Sync with proof
* Status reporting
* Unsafe sync mode
* Sync test
* Cleanup
* Apply suggestions from code review

Co-authored-by: cheme <[email protected]>
Co-authored-by: Pierre Krieger <[email protected]>

* set_genesis_storage
* Extract keys from range proof
* Detect iter completion
* Download and import bodies with fast sync
* Replaced meta updates tuple with a struct
* Fixed reverting finalized state
* Reverted timeout
* Typo
* Doc
* Doc
* Fixed light client test
* Fixed error handling
* Tweaks
* More UpdateMeta changes
* Rename convert_transaction
* Apply suggestions from code review

Co-authored-by: Bastian Köcher <[email protected]>

* Apply suggestions from code review

Co-authored-by: Bastian Köcher <[email protected]>

* Code review suggestions
* Fixed count handling

Co-authored-by: cheme <[email protected]>
Co-authored-by: Pierre Krieger <[email protected]>
Co-authored-by: Bastian Köcher <[email protected]>
Introduces fast sync mode which works as follows:
Current limitations:
"Could not fetch epoch .."
error. This can can probably be fixed by fetching epoch info from the state, or skipping epoch verification for existing headers altogether.polkadot companion: paritytech/polkadot#3078