Investigate burst of invalid messages right at bootstrap time #761

Closed
masih opened this issue Nov 28, 2024 · 4 comments

Comments

Member

masih commented Nov 28, 2024

Immediately after bootstrap there seems to be a burst of invalid messages. We should understand why; e.g. is there a timing issue where we may receive QUALITY messages for an instance that we have not started yet and drop them as invalid? In that case the very first bootstrap instance will suffer, since those errors will prohibit propagation of QUALITY messages.

image

Grafana data to analyze: https://grafana.f3.eng.filoz.org/d/edsu1k5s7gtfkb/f3-passive-testing?orgId=1&from=1732801290780&to=1732802842780&var-network=mainnet&var-instance=ida.f3.eng.filoz.org:80

Member Author

masih commented Nov 28, 2024

Note that this is happening right at the very beginning, i.e. the start of instance 0. Occurrences in successive instances are expected due to lingering non-DECIDE messages from the previous instance.

Member Author

masih commented Nov 29, 2024

Adjustments made in networks 20, 21 and 22 seem to have removed the burst of invalid messages at the initial instance. The screenshot below is from the bootstrap of network 22 on mainnet:

image

We need to get to the bottom of it.

Member

BigLep commented Jan 21, 2025

@masih: thoughts on whether to keep this issue open? I'm inclined to close it since we don't have anything immediate to action. If it turns out to be a problem during nv25 passive testing we can reopen. (I'll defer to you though.)

Member Author

masih commented Jan 21, 2025

I believe the root cause of this was a missing metric tag for an error type that was added later on: ErrValidationNotRelevant. This tagging was added in #766.
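
To illustrate how a missing metric tag can skew a dashboard, here is a minimal Go sketch. It is not the go-f3 code. Only the name ErrValidationNotRelevant comes from this thread; the other sentinel error, the `resultTag` function, and the label strings are hypothetical. The idea is that each validation error type gets its own label, so messages that are merely not relevant are counted separately from genuinely invalid ones.

```go
package main

import (
	"errors"
	"fmt"
)

// Hypothetical sentinel errors for illustration. Only the name
// ErrValidationNotRelevant is taken from the issue thread.
var (
	ErrValidationInvalid     = errors.New("validation: invalid message")
	ErrValidationNotRelevant = errors.New("validation: message not relevant")
)

// resultTag maps a validation error to a metric label value.
// Without the ErrValidationNotRelevant case, such errors would land in
// the fallback bucket and could not be told apart on a dashboard.
func resultTag(err error) string {
	switch {
	case err == nil:
		return "ok"
	case errors.Is(err, ErrValidationNotRelevant):
		return "not_relevant"
	case errors.Is(err, ErrValidationInvalid):
		return "invalid"
	default:
		return "unknown"
	}
}

func main() {
	// Count outcomes per label, standing in for a real metrics counter.
	counts := map[string]int{}
	outcomes := []error{
		nil,
		fmt.Errorf("instance not started: %w", ErrValidationNotRelevant),
		ErrValidationInvalid,
	}
	for _, err := range outcomes {
		counts[resultTag(err)]++
	}
	fmt.Println(counts)
}
```

Using `errors.Is` rather than direct comparison keeps the tagging correct when an error is wrapped with extra context.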

I'll close this and we can always reopen if the issue persists in the next round of testing.

masih closed this as completed Jan 21, 2025
github-project-automation bot moved this from Todo to Done in F3 Jan 21, 2025