This repository has been archived by the owner on Feb 23, 2022. It is now read-only.

RFC: ReverseSync - fetching historical data #224

Merged 16 commits into master on Apr 19, 2021

Conversation

@cmwaters (Contributor) commented Nov 27, 2020

This RFC addresses the need for Tendermint nodes to be able to fetch, verify, and persist historical blocks and state data.


@cmwaters cmwaters marked this pull request as ready for review November 30, 2020 13:36
@cmwaters cmwaters self-assigned this Dec 3, 2020
@erikgrinaker (Contributor) left a comment:

👍

@ValarDragon (Contributor):

Is there an alternative name for this feature that can be used? "Filling block" suggests to me filling it with txs, in which case backfilling doesn't make sense.

Perhaps fetch historical blockdata?

@cmwaters (author) commented Dec 9, 2020

@ValarDragon I think the term was originally coined to mean filling the gap between genesis and the node's current base with blocks, although I can also see how one might think it means going back and filling prior blocks with more txs. I'm not attached to the naming, so I don't mind renaming this to "fetch historical blockdata" if everyone else agrees.

If we want something a little shorter and in alignment with the x-sync nomenclature, we could also call it "reversesync".

@ValarDragon (Contributor):

I think reversesync is a great name as well

@github-actions bot commented Jan 9, 2021

This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@github-actions github-actions bot added the Stale label Jan 9, 2021
@tessr tessr removed the Stale label Jan 9, 2021
@github-actions bot commented Feb 9, 2021 with the same stale notice.

@github-actions github-actions bot added the Stale label Feb 9, 2021
@cmwaters cmwaters removed the Stale label Feb 9, 2021
@cmwaters cmwaters marked this pull request as draft February 17, 2021 11:22
parameters as:

```go
max_historical_height = max(state.InitialHeight, state.LastBlockHeight - state.ConsensusParams.EvidenceAgeHeight)
```
Contributor:

Shouldn't this be min? I'm not sure what state.InitialHeight is, but what if that height is past the minimum bound for processing evidence?

@cmwaters (author):

The max just means we won't try to verify below the initial height (no heights below the initial height exist). It's merely a technicality.
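As an illustrative sketch of the clamping discussed here (the function and parameter names are hypothetical stand-ins for the state fields in the excerpt above, not the actual Tendermint implementation):

```go
package main

import "fmt"

// maxHistoricalHeight clamps the evidence window to the chain's initial
// height: a node never needs headers from before the chain started.
// The parameters stand in for state.InitialHeight, state.LastBlockHeight,
// and state.ConsensusParams.EvidenceAgeHeight.
func maxHistoricalHeight(initialHeight, lastBlockHeight, evidenceAgeHeight int64) int64 {
	h := lastBlockHeight - evidenceAgeHeight
	if initialHeight > h {
		return initialHeight
	}
	return h
}

func main() {
	// Chain started at height 1, tip at 100, evidence valid for 50 blocks:
	// headers must be retained back to height 50.
	fmt.Println(maxHistoricalHeight(1, 100, 50)) // 50
	// Chain started at height 60: clamp to 60, as earlier heights don't exist.
	fmt.Println(maxHistoricalHeight(60, 100, 50)) // 60
}
```

With min instead of max, a chain whose initial height falls inside the evidence window would request heights that were never produced; the max is what makes this a technicality rather than a safety concern.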

Contributor:

I also find the naming confusing here. I guess you are trying to identify the maximum height such that both the block height and block time satisfy the evidence age parameters, as you have written below. It might be easier to understand if you move the property from line 69 up front.

@cmwaters (author):

Yeah, so I initially wasn't sure from which perspective to apply this. It might be more intuitive to think of the minimum set of headers that a node needs, but as this measure is in terms of height, I tended towards the highest height that nodes can prune to. I can change the nomenclature if we can think of a more intuitive way to describe it.


In addition, Tendermint currently sends messages for entire blocks only. This
would be inconvenient if we only wanted to acquire the header (although there
may be benefits in saving the entire block). Hence we may want to consider
Contributor:

Yup, I think this is a valid point, but it would reside in the blockchain reactor, no? If so, how do we tell the blockchain reactor to get only headers during reverse syncing? Seems like it'll add to the complexity.

Perhaps we just keep the logic as-is and get the entire block, even though we only need the header.

@cmwaters (author) commented Mar 9, 2021:

Yeah, we can decide this at a later date. Personally I feel splitting out the components of the block as individual messages makes sense for a few reasons:

  • Headers and Commits represent a fraction of the size of an entire block - we don't want to be sending large amounts of data across the network that aren't actually used.
  • Division of the block aligns with the structure of the blockstore, which stores commits, headers, and the rest of the block in different compartments.
  • It aligns with the abstraction between verification and execution (at least in a delayed execution model), where headers and commits are used predominantly for verification and the txs in a block are used for execution.
  • It aligns with the use case of state sync and light clients. If we want to integrate these components into the p2p layer, where they only require headers and commits and not the entire block, then it makes sense to divide these messages.
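As a rough sketch of this splitting idea, with hypothetical message names (illustrative stand-ins only, not the RFC's final schema):

```go
package main

import "fmt"

// Toy stand-ins: real Tendermint types are protobuf-defined and far richer.
type Header struct{ Height int64 }
type Commit struct{ Height int64 }
type Block struct {
	Header Header
	Commit Commit
	Txs    [][]byte // the bulk of a block's size
}

// Hypothetical per-component request messages: a reverse-syncing peer asks
// only for what verification needs, rather than the whole block.
type HeaderRequest struct{ Height int64 }
type CommitRequest struct{ Height int64 }

// serveHeader answers a HeaderRequest from a toy blockstore without
// shipping the block's txs across the wire.
func serveHeader(store map[int64]*Block, req HeaderRequest) (Header, bool) {
	b, ok := store[req.Height]
	if !ok {
		return Header{}, false
	}
	return b.Header, true
}

func main() {
	store := map[int64]*Block{
		42: {Header: Header{Height: 42}, Commit: Commit{Height: 42},
			Txs: [][]byte{make([]byte, 1<<20)}}, // 1 MiB of tx data stays home
	}
	h, ok := serveHeader(store, HeaderRequest{Height: 42})
	fmt.Println(ok, h.Height) // true 42
}
```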

The aforementioned data is already available via the following RPC endpoints:
`/blockchain` or `/commit` for `Header`s and `/validators` for
`ValidatorSet`s. It may be the best option to fetch and verify these resources
over RPC. Statesync already requires a list of RPC endpoints, so these could be
@melekes (Contributor) commented Mar 9, 2021:

I think we should really consider Erik's light client reactor (P2P option below) - we already have it.

@cmwaters (author):

Yeah if we went with extending the p2p layer then I think Erik's approach would be the strongest candidate when it comes to the actual implementation.

Contributor:

Is there documentation on that?

@cmwaters (author):

There's this closed PR tendermint/tendermint#4508

@cmwaters (author):

That's about all the documentation we have on that

@josef-widder (Contributor):

For the protocols I recently worked on, for

  • evidence handling
  • handling light node requests for headers

having a complete record (up to the trusting period in the past) of the headers is sufficient. Are there cases where we need the complete blocks?

In case we only need the headers, I agree that conceptually the problem is similar to backwards verification in the light client.

More generally, I can imagine that several design decisions in the past are based on the assumption that full nodes have a complete history. Did someone do a systematic review of the protocols where such an implicit assumption may be used?
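The backwards verification referred to above can be sketched as follows; the `Header` type and string hashing are toy stand-ins (real Tendermint headers are Merkle-hashed over many fields), so only the hash-link idea carries over:

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// Toy header: only the fields needed to demonstrate the hash link.
type Header struct {
	Height      int64
	LastBlockID string // hash of the header at Height-1
}

// Hash is a simplified commitment over the header's fields.
func (h Header) Hash() string {
	sum := sha256.Sum256([]byte(fmt.Sprintf("%d|%s", h.Height, h.LastBlockID)))
	return hex.EncodeToString(sum[:])
}

// verifyBackwards checks an older candidate header against a trusted newer
// one by following the hash link, as in light-client backwards verification.
func verifyBackwards(trusted, candidate Header) bool {
	return candidate.Height == trusted.Height-1 &&
		trusted.LastBlockID == candidate.Hash()
}

func main() {
	h9 := Header{Height: 9, LastBlockID: "hash-of-height-8"}
	h10 := Header{Height: 10, LastBlockID: h9.Hash()}
	fmt.Println(verifyBackwards(h10, h9)) // true: the hash link matches
}
```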

introduced some challenges of its own, which were covered and subsequently
tackled with [RFC-001](https://github.com/tendermint/spec/blob/master/rfc/001-block-retention.md).
The RFC allowed applications to set a block retention height: an upper bound on
which blocks may be pruned. However, nodes that state sync past this upper bound
Contributor:

I don't really understand what you want to say here. Can you be a bit more precise about the motivating use case?

tackled with [RFC-001](https://github.com/tendermint/spec/blob/master/rfc/001-block-retention.md).
The RFC allowed applications to set a block retention height: an upper bound on
which blocks may be pruned. However, nodes that state sync past this upper bound
(which is necessary, as snapshots must be saved within the trusting period for
Contributor:

Is it correct saying that retention height and unbonding period are not correlated at all currently, i.e., we don't ensure that honest full nodes keep blocks during the unbonding period?

@cmwaters (author):

Yes, this is the crux of the problem


For now, we focus purely on the case of a state syncing node, who, after
syncing to a height, will need to verify historical data in order to be capable
of processing new blocks. We can denote the earliest height that the node will
@milosevic (Contributor) commented Mar 22, 2021:

I don't get this. Why do you need to verify historical data (what do you mean by historical data here?) to be able to process new blocks? Do you mean that before participating in consensus to order new blocks, you need to ensure you have enough historical data to be able to verify evidence?

@cmwaters (author):

Yup.

By historical data I mean Header and ValidatorSet. I explain this later on, but for now I was intentionally trying to keep this high level. I can state exactly what data is needed at the start if you think that is a better structure.

Contributor:

I mentioned that somewhere else. In general, we should do a closer review of where the current protocol assumes that full nodes have all past blocks. For instance, in Fastsync I can also ask peers for blocks. If they don't have historical data they cannot respond, and they might be removed as peers by the syncing node even though they are correct. I suspect this implicit assumption is present in several places.

`A.Height <= max_historical_height` and `A.Time <= max_historical_time`.

Upon successfully reverse syncing, a node can now safely continue. As this
feature is only used as part of state sync, one can think of this as merely an
Contributor:

Do we ensure that this property always holds, i.e., that retentionHeight does not lead to removing blocks needed to verify evidence?

@cmwaters (author):

Currently we don't, but this RFC would introduce one.

extension to it.

In the future we may want to extend this functionality to allow nodes to fetch
historical blocks for reasons of accountability or data accessibility.
Contributor:

Fork accountability (if this is what you mean by accountability) operates in the same security model, so the same lower bound on needed blocks also applies in that case. What do you mean by data accessibility here?

@cmwaters (author):

One of the original motivations for ReverseSync was to take a node with truncated history and turn it into an archive node: a node that has all the blocks from the genesis height.

When pruning and state sync were introduced, a chief concern was that they incentivise nodes to keep only what they need. The most alarming consequence is that if every node truncated their history, all data below that height would be lost forever. New nodes would not be able to fast sync from genesis, and no transactional records from that era would exist. Accountability and data accessibility refer to this problem - the network should keep a healthy amount of data available.


ReverseSync is used to fetch and verify the following data structures:
- `Header`
- `ValidatorSet`
@milosevic (Contributor) commented Mar 22, 2021:

I am not sure I understand this. Given a trusted block B, you can use backwards verification to verify previous blocks by following the hash links into the past. Why do you need to check valsets explicitly? Is it because of evidence handling, which does not require access to a complete block? However, you still need more than just the valset for evidence handling.

@cmwaters (author):

Checking the val sets has nothing to do with the backwards verification algorithm itself; it's more that we need validator sets as a data structure when verifying evidence. Thus, when a node sends us a validator set, we need to check that we can trust it.

@cmwaters (author):

> having a complete record (up to the trusting period in the past) of the headers is sufficient. Are there cases where we need the complete blocks?

An earlier revision of this RFC proposed retrieving the complete block in order to give nodes the option of going from a truncated history (i.e. when starting from state sync) to a full history. This idea was abandoned, and I simplified the RFC to look purely at the problem of a state-synced node not having all the headers and validator sets within the unbonding period.

We need both headers and validator sets. Headers are used for verification and are used as the trusted header in the event of LightClientAttackEvidence and validator sets are used to confirm that the malicious validator is still bonded (within the evidence time).

> More generally, I can imagine that several design decisions in the past are based on the assumption that full nodes have a complete history. Did someone do a systematic review of the protocols where such an implicit assumption may be used?

When pruning and starting at an initial height were introduced, we tried, as far as the implementation went, to sweep the entire code for assumptions that were now invalid. I believe we got most of them, but I still see some places where the assumption of full block history is made.

for fast sync that also extends to ReverseSync is termination. ReverseSync will
finish its task when one of the following conditions has been met:

1. It reaches a block `A` where `A.Height <= max_historical_height` and
Contributor:

What do we use as the upper bound of the unbonding period window: the current time, or the BFT time of the state-synced state? In the former case we have a moving target for termination.

@cmwaters (author):

The BFT time of the state-synced state.

@milosevic (Contributor) left a comment:

Good work! I believe there are still a few open points that should be clarified and agreed upon before we can merge this. In particular, it would be good to have more clarity on what exactly we want to achieve with reversesync (my understanding is that supporting evidence handling is the main use case); how we want to do it can be left for an ADR.

@cmwaters (author):

I just did another pass through the evidence logic. It seems that we will also need the Commit. This means that reverse sync needs to be able to fetch and verify:

  • Header
  • Commit
  • Validator Set

> it would be good if there is more clarity what exactly we want to achieve with reversesync (my understanding is supporting evidence handling is the main use case)

This is the only use case at the moment.

Previously we thought about adding it as a tool to turn a node with truncated history into an archive node with full history, but we decided to simplify the proposal. (See #224 (comment).)

Given that we have narrowed the scope of this RFC, perhaps we don't need to call it something like reverse sync, as it will just be an additional component of state sync.
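A minimal sketch of how the three fetched structures cross-check each other, assuming toy types and a one-validator-one-vote power model (the real verification is considerably more involved):

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// Toy stand-ins for the three structures reverse sync must fetch; real
// Tendermint types are richer and use Merkle hashing.
type ValidatorSet struct{ Validators []string }

// Hash is a simplified commitment over the validator set.
func (vs ValidatorSet) Hash() string {
	sum := sha256.Sum256([]byte(fmt.Sprint(vs.Validators)))
	return hex.EncodeToString(sum[:])
}

type Header struct {
	Height         int64
	ValidatorsHash string // commitment to the validator set for this height
}

type Commit struct {
	Height     int64
	Signatures int // toy model: number of validators that signed
}

// verifyLightBlock cross-checks the triple: the validator set must match the
// header's commitment, and the commit must be for the same height with
// signatures from more than 2/3 of the validators (one validator == one
// unit of voting power in this toy model).
func verifyLightBlock(h Header, c Commit, vs ValidatorSet) error {
	if vs.Hash() != h.ValidatorsHash {
		return fmt.Errorf("validator set does not match header commitment")
	}
	if c.Height != h.Height {
		return fmt.Errorf("commit height %d != header height %d", c.Height, h.Height)
	}
	if 3*c.Signatures <= 2*len(vs.Validators) {
		return fmt.Errorf("insufficient signatures")
	}
	return nil
}

func main() {
	vs := ValidatorSet{Validators: []string{"a", "b", "c"}}
	h := Header{Height: 7, ValidatorsHash: vs.Hash()}
	c := Commit{Height: 7, Signatures: 3}
	fmt.Println(verifyLightBlock(h, c, vs)) // <nil>
}
```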

@cmwaters (author):

I have polished up this RFC to include some of the discussions from the dev call a few weeks back, where we decided to use the p2p layer and thus introduce two new messages.

I will look to merge this by the end of the day in case anyone wants to look over it one last time

@cmwaters cmwaters merged commit b39af91 into master Apr 19, 2021
@cmwaters cmwaters deleted the callum/backfill-rfc branch April 19, 2021 15:02