Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[forward port] When validation is waiting for parent chain daemon, "stall" #1030

Merged

Conversation

apoelstra
Copy link
Member

Forward-port of #1022

@apoelstra apoelstra force-pushed the 2021-09--forward-port-1022 branch from fdf47da to 3549b8f Compare September 2, 2021 22:03
@apoelstra
Copy link
Member Author

Blocked on fix to #1022.

@apoelstra
Copy link
Member Author

Ready for review @gwillen .. can probably just check that it is a forward-port of #1022

@gwillen gwillen self-requested a review September 13, 2021 18:56
CHECK_NONFATAL(err == "Needs more confirmations.");
bool depth_failed = false;
if(txin.m_is_pegin && !IsValidPeginWitness(mtx.witness.vtxinwit[i].m_pegin_witness, fedpegscripts, txin.prevout, err, true, &depth_failed)) {
assert(depth_failed);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This check (the string error version) changed from assert to CHECK_NONFATAL since 0.18, apparently. I can't tell whether you changed it back to assert intentionally when rebasing my changes.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did consciously change this to an assert, though I didn't think much about the code history. Let me change it back because I don't remember my reasoning.

src/init.cpp Outdated
// Or gently warn the user, and continue
InitError(Untranslated(err_msg));
gArgs.SoftSetArg("-validatepegin", "0");
if (!MainchainRPCCheck()) { //Initial check only
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm having trouble reading the range-diff here, but I think you might have removed this (now outdated) comment yourself when rebasing my original PR, and then failed to carry that over to this PR.

@gwillen
Copy link
Contributor

gwillen commented Sep 13, 2021

utACK da11d7b. I ran:

git range-diff -w --color-moved 5b1f3cc00b3c279d94b29b4a1d8c41a7e2b25ad1..upstream/pr/1022 904054ffcb4a67905577a4eb8b256369111be387..upstream/pr/1030

and checked that this appeared to be a faithful rebase of my work, which it is, subject to comments above (which are all nits.)

(It turns out that -w and --color-moved don't actually work right with git range-diff, but that's a complaint for another day.)

apoelstra and others added 3 commits September 13, 2021 22:10
Currently, if -validatepegin is given, and block validation can't proceed
because the parent chain is not synced, we mark the block invalid and put
it in a queue to be "revalidated" later. Unfortunately, marking a block
invalid has downstream consequences, in particular causing descendant blocks
to be marked invalid, which are not currently fixed by the queue.

Instead, we'll use a different strategy: if the mainchain daemon isn't
sufficiently synced to validate a block, we will "stall" connecting that
block to the chain, and have ActivateBestChain simply keep the tip at the
previous block until we're ready.

We can still download and validate (partly) blocks past this point while
we're waiting. They will be connected once the parent chain daemon catches
up.
- Finish removing all references to 'recheckpeginblockinterval', including
  documentation and tests.
- Remove periodic calls to MainchainRPCCheck; use it only at startup (and
  refactor accordingly to simplify logic.)
- Move MainchainRPCCheck from validation.h/cpp (public) to an internal
  helper function of init.cpp.
- Comment out definition of 'revalidation queue' type in txdb, to suppress
  "unused variable" warning. (Leave it visible to avoid future reuse.)
Add a regression test for ElementsProject#891 .

This checks that we can sync successfully when making a bunch of new blocks
just as we have transient loss of parent daemon connectivity. This reliably
fails without the fix, and reliably succeeds with it. (It stands in for the
situation, more common in production, where we sync faster than the parent
daemon can keep up after a long outage.)
@apoelstra apoelstra force-pushed the 2021-09--forward-port-1022 branch from da11d7b to b2e1cc3 Compare September 14, 2021 20:31
@gwillen
Copy link
Contributor

gwillen commented Sep 14, 2021

utACK b2e1cc3, verified that it contains only the requested changes from da11d7b.

[nix-shell:~/src/blockstream/elements/elements]$ git range-diff 904054ffcb4a67905577a4eb8b256369111be387..da11d7b6fd  532d55d059181790eb3b39c734a20f643017b106..b2e1cc38de 
1:  8195a720cc ! 1:  313f73d5b2 When validation is waiting for parent chain daemon, "stall".
    @@ src/rpc/rawtransaction_util.cpp: bool ValidateTransactionPeginInputs(const CMuta
     -            CHECK_NONFATAL(err == "Needs more confirmations.");
     +        bool depth_failed = false;
     +        if(txin.m_is_pegin && !IsValidPeginWitness(mtx.witness.vtxinwit[i].m_pegin_witness, fedpegscripts, txin.prevout, err, true, &depth_failed)) {
    -+            assert(depth_failed);
    ++            CHECK_NONFATAL(depth_failed);
                  immature_pegin = true;
              }
          }
2:  f1764c1b14 ! 2:  d5042b41c8 Finish removing 'recheckpeginblockinterval'; move MainchainRPCCheck
    @@ src/init.cpp: bool AppInitMain(const util::Ref& context, NodeContext& node, inte
     -            // Or gently warn the user, and continue
     -            InitError(Untranslated(err_msg));
     -            gArgs.SoftSetArg("-validatepegin", "0");
    -+        if (!MainchainRPCCheck()) { //Initial check only
    ++        if (!MainchainRPCCheck()) {
     +            const std::string err_msg = "ERROR: elements is set to verify pegins but cannot get a valid response from the mainchain daemon. Please check debug.log for more information.\n\nIf you haven't setup a bitcoind please get the latest stable version from https://bitcoincore.org/en/download/ or if you do not need to validate pegins set in your elements configuration validatepegin=0";
     +            // We fail immediately if this node has RPC server enabled
     +            if (gArgs.GetBoolArg("-server", false)) {
3:  da11d7b6fd = 3:  b2e1cc38de Regression test for pegin validation issues during sync

@apoelstra apoelstra merged commit 250c8e5 into ElementsProject:master Sep 14, 2021
@apoelstra apoelstra deleted the 2021-09--forward-port-1022 branch September 14, 2021 20:48
gwillen pushed a commit that referenced this pull request Jun 1, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants