"Previous epoch attestation(s)" errors and warnings—how to remedy or at least reduce? #3413

Closed
JamesCropcho opened this issue Aug 2, 2022 · 2 comments


JamesCropcho commented Aug 2, 2022

Description

The log of lighthouse beacon_node (when restricted to ERRO and WARN) has clumps of entries like:

13:01:08.219 ERRO Previous epoch attestation(s) missing   validators: ["val1", "val2"], epoch: 137070, service: val_mon
13:01:08.219 WARN Previous epoch attestation(s) failed to match head, validators: ["val1", "val3", "val2"], epoch: 137070, service: val_mon
13:01:08.219 WARN Previous epoch attestation(s) failed to match target, validators: ["val1", "val2"], epoch: 137070, service: val_mon
13:01:08.219 WARN Previous epoch attestation(s) had sub-optimal inclusion delay, validators: ["val3"], epoch: 137070, service: val_mon

These clumps appear perhaps once every two hours on a beacon node whose validator client has ~100 validators. Notable configuration includes:

--validator-monitor-auto
--target-peers 18
--http-disable-legacy-spec

--block-cache-size is left to the default.
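For context, the full invocation is roughly along these lines (a sketch rather than the exact command; the network, datadir and HTTP values are placeholders):

  lighthouse beacon_node \
    --network mainnet \
    --datadir /path/to/datadir \
    --http \
    --http-disable-legacy-spec \
    --validator-monitor-auto \
    --target-peers 18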

--target-peers was lowered in the interest of reducing compute costs by reducing outbound traffic (as an aside, the change does not appear to have noticeably affected the volume of outbound traffic).

The clumps do not always include the ERRO entry.

Version

https://github.com/sigp/lighthouse/releases/download/v2.5.0/lighthouse-v2.5.0-aarch64-unknown-linux-gnu.tar.gz

Expected Behaviour

I have a lot of experience administering decentralized systems, and I understand that a certain frequency and variety of warnings is a normal part of healthy operation. However, 1) I want to optimize this staking infrastructure both to maximize rewards and to minimize compute costs; and 2) errors are generally more severe than warnings, so if the same one occurs regularly I at least need to understand it.

Steps to Resolve

If anyone can tell me whether any of the tactics below (or others) might reduce the frequency or severity of these events (e.g. remedy the ERRO) and thereby resolve the issue, or, just as valuable, whether any or all of them are likely to have no helpful effect, I would appreciate it:

  1. Increase/decrease target-peers
  2. Increase/decrease block-cache-size
  3. Increase the IOPS (I/O operations per second) available to the SSD storage backing the data-dir
  4. Increase the throughput (MB per second) available to the SSD storage backing the data-dir
  5. Increase the number of CPU/vCPU cores of the cloud instance running lighthouse beacon_node
  6. Increase the allotted network performance capacity of the cloud instance running lighthouse beacon_node

Thanks for reading.

@pawanjay176
Member

I think the first thing to do is increase your target peers back to the default value (80). Keeping your target peers at such a low value will lead to your node not finding enough peers on the required attestation subnets, which in turn leads to your attestations not reaching enough aggregators and eventually not being included in blocks.

Increasing the target peers will not have that big an effect on bandwidth; that depends on the gossipsub protocol configuration. See this comment for more details: #3005 (comment)
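If you want to confirm the change once the node has had time to find peers, the standard beacon node HTTP API exposes a peer count endpoint; assuming the HTTP API is enabled on Lighthouse's default port (5052), something like this should show the connected peer count:

  curl -s http://localhost:5052/eth/v1/node/peer_count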

@JamesCropcho
Author

Thank you for your insight and helpful reply, Pawan, as well as the link to the earlier comment.

Okay, I have increased target-peers to the default.

Also, "for fun" I have increased block-cache-size to 15—my lighthouse beacon_node seems to require 8 cores and I have a suplus of RAM. If you believe I should not mess with block-cache-size kindly let me know and I'll restore the default value. My thinking here is along the lines of, "eh, couldn't hurt," but I may be mistaken.

I have nothing to report yet, as I'll have to wait at least 24 hours to see what happens, but I can report back upon request.

Thank you again.
