Unable to sync base-mainnet with op-reth || op-node hangs whilst op-reth is 'pruning' #7500
Comments
After these log events, op-reth spams messages such as
with the same set of payload IDs (i.e. 0x40c08df0a246e6bf) repeating over and over |
Possibly a duplicate of #7477 |
Facing a similar issue: Base archive node sync with no pruning config, but it still attempts to prune every 5 blocks anyway (nothing is actually pruned). For me the error seems to be due to block 50390. op-node runs into a critical error, so further derivation stops completely: https://github.com/ethereum-optimism/optimism/blob/8f516faf42da416c02355f9981add3137a3db190/op-node/rollup/derive/engine_queue.go#L639-L645. For blocks where the only txs are deposit txs, op-node expects that the NewPayload status will always be VALID; otherwise it crit-errors as above: https://github.com/ethereum-optimism/optimism/blob/8f516faf42da416c02355f9981add3137a3db190/op-node/rollup/derive/engine_update.go#L168-L170. Since reth returns FCU status "syncing" when there's an active db write hook, this seems to be causing the error: reth/crates/consensus/beacon/src/engine/mod.rs, lines 360 to 372 at c6857ef |
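To make the failure mode concrete, here is a toy sketch of the interaction described above. This is illustrative pseudologic only, not the actual reth or op-node source, and every name in it is made up:

```rust
// Toy model of the failure mode described above. Illustrative only:
// NOT the actual reth or op-node source; all names are hypothetical.

#[derive(Debug, PartialEq)]
enum PayloadStatus {
    Valid,
    Syncing, // what reth reports while the pruner holds the db write hook
}

// reth side: while a db write (e.g. pruning) is in progress, the engine
// cannot process the payload immediately, so it answers SYNCING.
fn reth_new_payload(pruner_holds_write_hook: bool) -> PayloadStatus {
    if pruner_holds_write_hook {
        PayloadStatus::Syncing
    } else {
        PayloadStatus::Valid
    }
}

// op-node side: a deposit-only block must come back VALID; any other
// status is escalated to a critical derivation error, halting derivation.
fn op_node_on_deposit_only_block(status: PayloadStatus) -> Result<(), String> {
    if status == PayloadStatus::Valid {
        Ok(())
    } else {
        Err(format!(
            "crit: failed to process block with only deposit transactions: status {:?}",
            status
        ))
    }
}

fn main() {
    // With the pruner ticking every 5 blocks, a deposit-only payload will
    // eventually arrive while the write hook is held, and derivation stops.
    let status = reth_new_payload(true);
    println!("{:?}", op_node_on_deposit_only_block(status));
}
```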
For me, the pruner starts and then takes 13 ms to finish (compared to the usual sub-200 µs prune durations). It seems that during the time the pruner has a write hook on the db, op-node errors out. I've seen a few people mention the op-node error in Telegram or in other issues. After all this, op-node loses all peers and further peer connections are refused. Reth then continually tries building new payloads, but since op-node is essentially dead, there's nothing else it can do.
op-node logs
reth logs from relevant time
|
related #6796 #6792 (comment) |
Yep, subscribing to this issue; yours is more thorough than mine. |
Alexey requested further logs. On this run, op-node crashed at 18:24:23.701Z with...
op-reth logs from this period:
|
This occurs running both v0.2.0-beta.5 and HEAD |
I appear unable to reopen |
related #7505 |
I can verify this still happens on latest HEAD.
Logs op-node:
Logs reth:
|
I ran into this problem too. Didn't have time to look into it in detail, but can confirm that after patching my archive node to disable the pruner, syncing is running without any issues. |
@0xZerohero thanks, good to know syncing is smooth with the pruner disabled. |
|
Would you mind sharing how you patched reth to disable the pruning? I'm having the same issue I think. |
+1 |
Noticed that if I restart the op-node after it gets stuck, it's able to resume the sync. |
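(For anyone running under systemd, as the journalctl-style logs in this thread suggest, that restart is just the following one-liner; the unit name op-node is an assumption about your setup.)

```sh
# Assumes op-node runs as a systemd unit named "op-node" (hypothetical name).
sudo systemctl restart op-node
```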
Thank you @dallonby for sharing the tip on how to solve it. What I did was make a copy of reth.toml and add in
Then I started op-geth with
|
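(The reth.toml snippet above didn't survive the page export. Judging by the block_interval workaround described in the original report below, the addition was presumably something along these lines; treat it as a hypothetical reconstruction, not the commenter's verbatim config.)

```toml
# Hypothetical reconstruction: raise the prune tick interval so the pruner
# effectively never fires while the node is live.
[prune]
block_interval = 5000000000
```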
I have tested op-reth. The node syncs faster compared with op-geth and op-erigon; however, the following issues were observed on both Base mainnet and Sepolia:
Configuration used:
Errors

As described above, these errors were observed multiple times, using the main branch and also using the latest release [v0.2.0-beta.6](https://github.com/paradigmxyz/reth/tree/v0.2.0-beta.6).

op-reth

May 29 07:42:51 m-base-02 op-reth[2595147]: 2024-05-29T07:42:51.216258Z INFO Canonical chain committed number=0 hash=0xf712aa9241cc24369b143cf6dce85f0902a9731e70d66818a3a5845b296c73dd elapsed=124.414µs
May 29 07:42:51 m-base-02 op-reth[2595147]: 2024-05-29T07:42:51.216352Z INFO New payload job created id=0x2544c58091e5276b parent=0xf712aa9241cc24369b143cf6dce85f0902a9731e70d66818a3a5845b296c73dd
May 29 07:42:51 m-base-02 op-reth[2595147]: 2024-05-29T07:42:51.264295Z WARN Error while processing payload error=Failed to insert block (hash=0x07eee59d552f482e0f350933b8e96f824aa0bfa28267de53dcb0ddbe868a4bf9, number=1, parent_hash=0xf712aa9241cc24369b143cf6dce85f0902a9731e70d66818a3a5845b296c73dd): receipt root mismatch: got 0x99d7563fd44dac89d8516b5ba56e109cf073641e9b58a1b8601f80757b2b6d74, expected 0x637d9e6847368bb52b56e90dd608facc34122c153872e8733cb0b01caa581f85
May 29 07:42:51 m-base-02 op-reth[2595147]: 2024-05-29T07:42:51.264339Z WARN Invalid block error on new payload invalid_hash=0x07eee59d552f482e0f350933b8e96f824aa0bfa28267de53dcb0ddbe868a4bf9 invalid_number=1 error=receipt root mismatch: got 0x99d7563fd44dac89d8516b5ba56e109cf073641e9b58a1b8601f80757b2b6d74, expected 0x637d9e6847368bb52b56e90dd608facc34122c153872e8733cb0b01caa581f85
May 29 07:42:51 m-base-02 op-reth[2595147]: 2024-05-29T07:42:51.264366Z WARN Bad block with hash hash=0x07eee59d552f482e0f350933b8e96f824aa0bfa28267de53dcb0ddbe868a4bf9 header=Header { parent_hash: 0xf712aa9241cc24369b143cf6dce85f0902a9731e70d66818a3a5845b296c73dd, ommers_hash: 0x1dcc4de8dec75d7aab85b567b6ccd41ad312451b948a7413f0a142fd40d49347, beneficiary: 0x4200000000000000000000000000000000000011, state_root: 0xe3e10cb3963f6b88b7ba14cc76c34de65308fc0524dca48410ee53df2a43700b, transactions_root: 0xee45a6283d626ebed087ac705c71740d7ecb276d81b745284a68b88925dc91ae, receipts_root: 0x637d9e6847368bb52b56e90dd608facc34122c153872e8733cb0b01caa581f85, withdrawals_root: None, logs_bloom: 0x00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000, difficulty: 0x0_U256, number: 1, gas_limit: 30000000, gas_used: 64013, timestamp: 1686789349, mix_hash: 0xae36673a68b9370a513a06698115fd86cc7faf4d6c248a982fbc3ae29c442da2, nonce: 0, base_fee_per_gas: Some(980000000), blob_gas_used: None, excess_blob_gas: None, parent_beacon_block_root: None, requests_root: None, extra_data: 0x }
May 29 07:42:58 m-base-02 op-reth[2595147]: 2024-05-29T07:42:58.476573Z INFO Status connected_peers=0 freelist=2 latest_block=0
May 29 07:43:23 m-base-02 op-reth[2595147]: 2024-05-29T07:43:23.476462Z INFO Status connected_peers=0 freelist=2 latest_block=0
May 29 07:43:48 m-base-02 op-reth[2595147]: 2024-05-29T07:43:48.477067Z INFO Status connected_peers=0 freelist=2 latest_block=0
May 29 07:44:13 m-base-02 op-reth[2595147]: 2024-05-29T07:44:13.477023Z INFO Status connected_peers=0 freelist=2 latest_block=0
May 29 07:44:38 m-base-02 op-reth[2595147]: 2024-05-29T07:44:38.476535Z INFO Status connected_peers=0 freelist=2 latest_block=0
May 29 07:45:03 m-base-02 op-reth[2595147]: 2024-05-29T07:45:03.477192Z INFO Status connected_peers=0 freelist=2 latest_block=0
May 29 07:45:28 m-base-02 op-reth[2595147]: 2024-05-29T07:45:28.477243Z INFO Status connected_peers=0 freelist=2 latest_block=0
May 29 07:45:53 m-base-02 op-reth[2595147]: 2024-05-29T07:45:53.477076Z INFO Status connected_peers=0 freelist=2 latest_block=0
May 29 07:46:18 m-base-02 op-reth[2595147]: 2024-05-29T07:46:18.477294Z INFO Status connected_peers=0 freelist=2 latest_block=0
May 29 07:46:43 m-base-02 op-reth[2595147]: 2024-05-29T07:46:43.477112Z INFO Status connected_peers=0 freelist=2 latest_block=0
May 29 07:47:05 m-base-02 op-reth[2595147]: 2024-05-29T07:47:05.477104Z WARN Beacon client online, but no consensus updates received for a while. This may be because of a reth error, or an error in the beacon client! Please investigate reth and beacon client logs! period=254.261006858s
May 29 07:47:08 m-base-02 op-reth[2595147]: 2024-05-29T07:47:08.476428Z INFO Status connected_peers=0 freelist=2 latest_block=0

op-node

May 29 07:42:51 m-base-02 op-node[2595204]: t=2024-05-29T07:42:51+0000 lvl=info msg="created new channel" origin=0xb0beed7a7e12eaf620fcf9cd6f6674ff08d3bc1e4851978ba0824fd43509ddf1:17482370 channel=47d44b9235706648a325ead11fa5121a length=105637 frame_number=0 is_last=true
May 29 07:42:51 m-base-02 op-node[2595204]: t=2024-05-29T07:42:51+0000 lvl=info msg="Reading channel" channel=47d44b9235706648a325ead11fa5121a frames=1
May 29 07:42:51 m-base-02 op-node[2595204]: t=2024-05-29T07:42:51+0000 lvl=info msg="Found next batch" batch_type=SingularBatch batch_timestamp=1686789349 parent_hash=0xf712aa9241cc24369b143cf6dce85f0902a9731e70d66818a3a5845b296c73dd batch_epoch=0x5c13d307623a926cd31415036c8b7fa14572f9dac64528e857a470511fc30771:17481768 txs=0
May 29 07:42:51 m-base-02 op-node[2595204]: t=2024-05-29T07:42:51+0000 lvl=info msg="generated attributes in payload queue" txs=1 timestamp=1686789349
May 29 07:42:51 m-base-02 op-node[2595204]: t=2024-05-29T07:42:51+0000 lvl=error msg="cancelling old block sealing job" payload=0x2544c58091e5276b
May 29 07:42:51 m-base-02 op-node[2595204]: t=2024-05-29T07:42:51+0000 lvl=warn msg="could not process payload derived from L1 data, dropping batch" err="failed to complete building on top of L2 chain 0xf712aa9241cc24369b143cf6dce85f0902a9731e70d66818a3a5845b296c73dd:0, id: 0x2544c58091e5276b, error (3): execution payload 0x07eee59d552f482e0f350933b8e96f824aa0bfa28267de53dcb0ddbe868a4bf9:1 was INVALID! Latest valid hash is 0xf712aa9241cc24369b143cf6dce85f0902a9731e70d66818a3a5845b296c73dd, ignoring bad block: 0xc0198e7d90"
May 29 07:42:51 m-base-02 op-node[2595204]: t=2024-05-29T07:42:51+0000 lvl=error msg="deposit only block was invalid" parent=0xf712aa9241cc24369b143cf6dce85f0902a9731e70d66818a3a5845b296c73dd:0 err="failed to complete building on top of L2 chain 0xf712aa9241cc24369b143cf6dce85f0902a9731e70d66818a3a5845b296c73dd:0, id: 0x2544c58091e5276b, error (3): execution payload 0x07eee59d552f482e0f350933b8e96f824aa0bfa28267de53dcb0ddbe868a4bf9:1 was INVALID! Latest valid hash is 0xf712aa9241cc24369b143cf6dce85f0902a9731e70d66818a3a5845b296c73dd, ignoring bad block: 0xc0198e7d90"
May 29 07:42:51 m-base-02 op-node[2595204]: t=2024-05-29T07:42:51+0000 lvl=error msg="Derivation process critical error" err="engine stage failed: crit: failed to process block with only deposit transactions: failed to complete building on top of L2 chain 0xf712aa9241cc24369b143cf6dce85f0902a9731e70d66818a3a5845b296c73dd:0, id: 0x2544c58091e5276b, error (3): execution payload 0x07eee59d552f482e0f350933b8e96f824aa0bfa28267de53dcb0ddbe868a4bf9:1 was INVALID! Latest valid hash is 0xf712aa9241cc24369b143cf6dce85f0902a9731e70d66818a3a5845b296c73dd, ignoring bad block: 0xc0198e7d90"
May 29 07:42:51 m-base-02 op-node[2595204]: t=2024-05-29T07:42:51+0000 lvl=info msg="State loop returned"
May 29 07:45:13 m-base-02 op-node[2595204]: t=2024-05-29T07:45:13+0000 lvl=warn msg="failed to notify engine driver of L1 head change" err="context deadline exceeded"
May 29 07:45:25 m-base-02 op-node[2595204]: t=2024-05-29T07:45:25+0000 lvl=warn msg="failed to notify engine driver of L1 head change" err="context deadline exceeded"
May 29 07:45:37 m-base-02 op-node[2595204]: t=2024-05-29T07:45:37+0000 lvl=warn msg="failed to notify engine driver of L1 head change" err="context deadline exceeded" |
I would advise syncing with |
Try this op-node flag to improve speed, then
|
This issue is stale because it has been open for 21 days with no activity. |
This issue was closed because it has been inactive for 7 days since being marked as stale. |
Seeing the same issue on v0.10.2, with and without --l1.trustrpc; the node gradually slows down because of this (after getting in sync to within a 1-block delay). |
@jun0tpyrc do the error logs show that it's due to pruning as well? |
What I found seems to be not quite the same thing, although some of the errors/warnings may be similar.
Maybe it's too IO-heavy for mainnets like base-mainnet.
I'd suggest including some of those tuning suggestions in the docs, or as tweak tips in the repo. I still can't consistently stay in sync for a long enough period (it seems to slow down after some time). |
could you please open a new issue @jun0tpyrc ? |
Describe the bug
Trying to sync op-reth on base-mainnet using op-node. op-node often hangs when op-reth is 'pruning' (despite this being an archive node sync).
op-node shows the following:
N.B. Changing
block_interval = 5
under the [prune] setting in reth.toml to something obscene like 5000000000 seems to reduce the frequency of the error. When set at 5, I see the error every 30 seconds or so and have to restart op-node. With it set at 5000000000, I've not seen the error after ~30 mins of testing.

N.N.B. Even setting the log level to -vvvv in op-reth made it take much longer for the error to appear in op-node: maybe every 5 minutes instead of every 30 seconds.
Steps to reproduce
start op-node
```sh
./bin/op-node --network="base-mainnet" --l1=http://127.0.0.1:8545 --l1.beacon=http://127.0.0.1:5052 --l2=http://localhost:9551 --l2.jwt-secret=/data/mev-geth/secret.jwt --rpc.addr=0.0.0.0 --rpc.port=9545
```
start op-reth
```sh
./target/release/op-reth node --chain base --rollup.sequencer-http https://sequencer.base.org --http --ws --authrpc.port 9551 --authrpc.jwtsecret /data/mev-geth/secret.jwt --port 30333 --discovery.port 30333 --ws.port 18546 --http.port 18545 --datadir /data/op-reth
```
Node logs
Platform(s)
Linux (x86)
What version/commit are you on?
```
./target/release/op-reth --version
reth Version: 0.2.0-beta.4
Commit SHA: c04dbe6e9
Build Timestamp: 2024-04-07T07:19:12.388844682Z
Build Features: jemalloc,optimism
Build Profile: release
```
What database version are you on?
NA
What type of node are you running?
Archive (default)
What prune config do you use, if any?
```toml
[prune]
block_interval = 5
```
If you've built Reth from source, provide the full command you used
No response