Design block-by-block API #1245
Comments
Thanks for this proposal @evanlinjin. I don't agree with the requirements.
This is not an optimization we require. Even when you do every single block it's around 19mb a year on disk. In 10 years 190mb. Consumer devices like phones will likely have quadruple their current storage by then. If you want to have a constant size This keeps things simple.
I think you'll agree with restating this like this but I want to check: The sole responsibility of the block source should be to emit blocks in topological order along with a block that it is connected to that it has emitted before or that we have passed in on the creation of the block source. In the case of a block source that emits every single block, it can just emit the block since the I don't think there is a problem with making this work with block-by-block updating. Any re-org will contradict a block and will work. The problem is with scenarios like CBF where not every block is emitted. A re-org may take place and a non-contradicting block connected before the tip may be emitted that now matches the filter. This won't connect. However, rather than introducing complexity in the API I think the solution is just to do #1005 keeping this problem in mind. Making So in summary I think #1172 will be fine with things the way they are but #1005 should be done before CBF or scanblocks is attempted with
Okay I agree with this. This is a simpler approach.
This means we need to change `pub type ChangeSet = BTreeSet<(BlockId, BlockId)>;` I think making
@LLFourn This can only be done if The alternative for now (without the larger change of making `fn introduce_block(&mut self, block: &Block, cp: CheckPoint) -> Result<(), CannotConnectError> { todo!() }`
Just noting that @evanlinjin and I had a call about this and the claim made above is wrong. The
Yes, this is correct, thank you @LLFourn for this comment.
Implemented in #1172
Originally posted by @evanlinjin in #1172 (comment)
Requirements
- Ability to ONLY insert relevant checkpoints. Relevant means checkpoints to blocks which contain relevant transactions. This is important for doing block-by-block syncing, i.e. a full node without CBF, a CBF node (we still want to filter out false-positives), and silent payments (for the future).
- Ability for the block-source to handle reorgs (mid-sync) without requesting data from `bdk::Wallet`.
Why `apply_block_connected_to` and `apply_block_assume_connected` Do Not Satisfy
Both of these methods apply all block checkpoints, regardless of whether they contain relevant transactions. To solve this, we can have another method, `apply_block_relevant_assume_connected`, that does not apply checkpoints of blocks containing no relevant transactions. However, this cannot handle reorgs mid-sync in an elegant way.
Let's assume we have `apply_block_relevant_assume_connected`, which filters checkpoints, and the following scenario plays out:
`emitted_initially` are the checkpoints that the chain-source has emitted. `relevant` are the checkpoints that end up being stored in `LocalChain`. `emitted_post_reorg` are the checkpoints that the chain-source re-emits due to a reorg. As can be seen, `A, B'` (the update) cannot connect with `A, C`. We need `A, B', C'` as the update.
My proposal
The chain-source emitter is responsible for emitting full blocks and checkpoints (which connect the current block to previously emitted blocks).
The block is first processed by `introduce_block_txs`. This inserts relevant transactions and associated anchors into the wallet.
Then we call `introduce_tip`. If the block contains relevant transactions, the `LocalChain` is updated with this new tip (and only the tip, since we want to skip irrelevant checkpoints). If the tip is irrelevant, we only update the `last_synced_to_height: u32` value.
How does `introduce_tip` work?
Imagine a situation where the emitter has emitted block height 1 (with hash `A`) and height 2 (with hash `B`). `1:A` is considered relevant and `2:B` is not. The state of the wallet's `LocalChain` would be a single checkpoint `[1:A]`.
If the next emission is `3:C` and it contains relevant transactions, the `tip` input of `introduce_tip` may contain the update chain `[1:A, 2:B, 3:C]` (but of course, we only want to insert `3:C` and not `2:B`). The logic of `introduce_tip` will iterate the update chain backwards to determine whether `3:C` can connect with the wallet chain (in this case, it can via `1:A`). With this knowledge, `try_apply_tip` will create a trimmed update chain `[1:A, 3:C]`, which is then applied so that only the new tip is inserted.
`introduce_tip` also needs to keep checkpoints that are needed to invalidate original checkpoints. I.e. with an original chain `[1:A, 3:C, 4:D]` and an update chain `[1:A, 3:C, 4:D', 5:E]`, `apply_tip` still needs to keep height 4 in the update even though it is not the tip. The stripped update will be `[3:C, 4:D', 5:E]`.
Some tips cannot connect, but we don't return an error
Consider an original chain `[1:A, 3:C, 5:E]` (where the original tip is at height 5). If there is a 2-block reorg, the chain-source emitter will be tempted to emit the block at height 4 (as that is the earliest block reorged). The update chain may be `[1:A, 2:B, 3:C, 4:D']`. This cannot connect because we cannot know if `4:D'` and `5:E` belong in the same chain. However, it is not the end of the world, as the next block emitted will be of height 5. So we just ignore this and wait for the next emission instead of returning an error.
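The two cases above (keeping invalidating checkpoints, and ignoring a tip that cannot connect yet) could be sketched together like this. The names `strip_update` and `TipOutcome` are hypothetical, and the connection rule is a simplification assumed for illustration: the update connects only if the lowest wallet checkpoint above the point of agreement is contradicted by the update. It is not meant to reproduce the actual `LocalChain` logic.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-ins for this sketch (not BDK types).
pub type Height = u32;
pub type Hash = &'static str;
pub type Chain = BTreeMap<Height, Hash>;

#[derive(Debug, PartialEq)]
pub enum TipOutcome {
    /// Stripped update: point of agreement, invalidating checkpoints, new tip.
    Apply(Vec<(Height, Hash)>),
    /// The tip cannot be connected yet; wait for the next emission (not an error).
    Ignored,
}

pub fn strip_update(wallet: &Chain, update: &[(Height, Hash)]) -> TipOutcome {
    let (tip, rest) = match update.split_last() {
        Some((&tip, rest)) => (tip, rest),
        None => return TipOutcome::Ignored,
    };
    // A tip the wallet already stores connects trivially.
    if wallet.get(&tip.0) == Some(&tip.1) {
        return TipOutcome::Apply(vec![tip]);
    }
    let mut kept = vec![tip];
    let mut agreement = None;
    for &(height, hash) in rest.iter().rev() {
        match wallet.get(&height) {
            // Same height and hash: this is the point of agreement.
            Some(&known) if known == hash => {
                agreement = Some(height);
                kept.push((height, hash));
                break;
            }
            // Same height, different hash: needed to invalidate the wallet's checkpoint.
            Some(_) => kept.push((height, hash)),
            // Height unknown to the wallet: irrelevant, so it is dropped.
            None => {}
        }
    }
    let agreement = match agreement {
        Some(height) => height,
        None => return TipOutcome::Ignored,
    };
    kept.reverse();
    // Assumed rule: the lowest wallet checkpoint above the agreement point
    // must be contradicted by the update, otherwise the relation is unknown.
    let connects = match wallet.range(agreement + 1..).next() {
        None => true,
        Some((&height, &hash)) => update.iter().any(|&(h, x)| h == height && x != hash),
    };
    if connects { TipOutcome::Apply(kept) } else { TipOutcome::Ignored }
}
```

Under these assumptions, the original chain `[1:A, 3:C, 4:D]` with update `[1:A, 3:C, 4:D', 5:E]` gives `[3:C, 4:D', 5:E]`, while `[1:A, 3:C, 5:E]` with update `[1:A, 2:B, 3:C, 4:D']` is ignored until the block at height 5 arrives.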
Initiating syncs
Because we only include checkpoints which contain relevant transactions, we need somewhere else to track the last-synced-to-height. This value is used when creating a new instance of a chain-source-emitter. We need to track the last-synced-to-height in the wallet's changeset.
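As a rough sketch of what tracking this in a changeset might look like: `SyncChangeSet`, `append` and `emitter_start_height` are hypothetical names, and both the "later value wins" merge rule and the "resume one block above" rule are assumptions made here for illustration, not decisions taken in this issue.

```rust
use std::collections::BTreeMap;

/// Hypothetical changeset for this sketch (not the BDK `ChangeSet`).
#[derive(Debug, Default, PartialEq)]
pub struct SyncChangeSet {
    /// Relevant checkpoints only: `Some(hash)` inserts, `None` removes.
    pub checkpoints: BTreeMap<u32, Option<&'static str>>,
    /// Height of the last block processed, whether relevant or not.
    pub last_synced_to_height: Option<u32>,
}

impl SyncChangeSet {
    /// Merge a later changeset into this one.
    /// Assumption: the later height wins, so a reorg is able to lower it.
    pub fn append(&mut self, other: SyncChangeSet) {
        self.checkpoints.extend(other.checkpoints);
        if other.last_synced_to_height.is_some() {
            self.last_synced_to_height = other.last_synced_to_height;
        }
    }

    /// Height at which a new chain-source emitter would start.
    /// Assumption: resume one block above the last synced height.
    pub fn emitter_start_height(&self, fallback: u32) -> u32 {
        self.last_synced_to_height.map_or(fallback, |h| h + 1)
    }
}
```

The point of the separate field is that the emitter can resume from the right height even when the last several thousand blocks produced no checkpoints at all.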
Optimizing `introduce_tip`
Because we are only inserting checkpoints with relevant transactions, inserted checkpoints will be few and far between.
I.e. if the original chain is `[1:A, 4:D]` and the next relevant checkpoint is at height 4000, this means we need to do 3996 (4000-4) iterations just to find out if the checkpoint at height 4000 can connect to `4:D`.
A solution is to cache the most recent irrelevant checkpoints. For example, when we introduce the checkpoint at height 5 (which is irrelevant), we cache it and associate it with our highest relevant checkpoint `4:D`. We keep doing this so when we get to height 4000, we can iterate from the introduced `tip` and find out that the previous node (at height 3999) is the same as the height 3999 that is cached. We also know that the cached checkpoint of height 3999 is connected to `4:D`. We can safely create a trimmed update of `[4:D, 4000:X]`.
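A minimal sketch of such a cache, assuming it only needs to remember the single most recent irrelevant checkpoint. `TipCache` and its methods are hypothetical names, and each call is assumed to receive the checkpoint of the block's parent (`prev`) alongside the block's own checkpoint.

```rust
/// Hypothetical stand-ins for this sketch (not BDK types).
pub type Height = u32;
pub type Hash = &'static str;
pub type CheckPoint = (Height, Hash);

pub struct TipCache {
    /// Highest relevant checkpoint stored in the wallet chain (e.g. `4:D`).
    relevant_tip: CheckPoint,
    /// Most recent irrelevant checkpoint known to descend from `relevant_tip`.
    last_irrelevant: Option<CheckPoint>,
}

impl TipCache {
    pub fn new(relevant_tip: CheckPoint) -> Self {
        Self { relevant_tip, last_irrelevant: None }
    }

    /// A block connects if its parent is the latest checkpoint we have seen.
    fn connects(&self, prev: CheckPoint) -> bool {
        prev == self.last_irrelevant.unwrap_or(self.relevant_tip)
    }

    /// Cache an irrelevant block. Returns `false` if it does not extend the cached chain.
    pub fn introduce_irrelevant(&mut self, prev: CheckPoint, cp: CheckPoint) -> bool {
        if self.connects(prev) {
            self.last_irrelevant = Some(cp);
            true
        } else {
            false
        }
    }

    /// Introduce a relevant tip. A single comparison against the cache replaces
    /// the walk back to `relevant_tip`; returns the trimmed update on success.
    pub fn introduce_relevant(&mut self, prev: CheckPoint, tip: CheckPoint) -> Option<[CheckPoint; 2]> {
        if !self.connects(prev) {
            return None;
        }
        let update = [self.relevant_tip, tip];
        self.relevant_tip = tip;
        self.last_irrelevant = None;
        Some(update)
    }
}
```

In the example from the text, the cache would hold `3999` when `4000:X` arrives, so the trimmed update `[4:D, 4000:X]` is produced after one comparison rather than 3996 iterations. How the cache should react to a reorg is left open here.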
Changes to Bitcoind RPC chain source
We need to emit `CheckPoint`s alongside blocks.