[Merged by Bors] - Fix ATX syncer hangs #6137

@poszu poszu commented Jul 15, 2024

Motivation

The ATX syncer was observed to hang when a peer serves an invalid ATX. It counts only specific errors as failed requests rather than every failure, so it never reaches the configured request limit.

Description

Change atxsyncer to count every failure to get an ATX as a failed request. It now gives up on an invalid ATX once the failure count reaches the limit.
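To illustrate the difference, here is a minimal sketch of the two counting policies. It is not the actual atxsyncer code: `syncOne`, `fetchFunc`, `errNotFound`, `errInvalidATX`, and the `maxAttempts` guard are hypothetical names introduced only for this example.

```go
package main

import "errors"

// Hypothetical sentinel errors for this sketch; the real syncer's error
// types are not shown in this PR description.
var (
	errNotFound   = errors.New("atx not found")
	errInvalidATX = errors.New("invalid atx")
)

// fetchFunc is a hypothetical stand-in for a single request for one ATX.
type fetchFunc func(id string) error

// syncOne keeps requesting an ATX until it succeeds, the number of counted
// failures reaches limit, or maxAttempts is exhausted. maxAttempts exists
// only so the sketch terminates; without it the old policy loops forever.
//
// countAll=false models the old behaviour (only errNotFound is counted).
// countAll=true models the fix (every failure is counted).
func syncOne(id string, fetch fetchFunc, limit int, countAll bool, maxAttempts int) (attempts int, gaveUp bool) {
	failures := 0
	for attempts < maxAttempts {
		attempts++
		err := fetch(id)
		if err == nil {
			return attempts, false
		}
		if countAll || errors.Is(err, errNotFound) {
			failures++
		}
		if failures >= limit {
			return attempts, true
		}
	}
	return attempts, false
}
```

With a peer that always answers with an invalid ATX, the old policy never increments the counter and only stops at the artificial `maxAttempts` guard, while the new policy gives up after exactly `limit` requests.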

💡 Additionally, decreased the default number of retries from 20 to 10 for the ATX syncer.

💡 There is a related problem: the ATX handler should return pubsub.ErrValidationReject for invalid ATXs so that the malicious peer is dropped. It should also be smart enough to bubble up such an error from fetcher.GetAtxs(<deps>) if a dependency of an ATX (e.g. the previous or positioning ATX) turns out to be malicious and cannot be fetched.

Test Plan

I updated a test to check that the syncer retries on any error.

TODO

  • Explain motivation or link existing issue(s)
  • Test changes and document test plan
  • Update documentation as needed
  • Update changelog as needed


poszu commented Jul 15, 2024

Bors merge

spacemesh-bors bot pushed a commit that referenced this pull request Jul 15, 2024
@spacemesh-bors

Pull request successfully merged into develop.

Build succeeded:

@spacemesh-bors spacemesh-bors bot changed the title Fix ATX syncer hangs [Merged by Bors] - Fix ATX syncer hangs Jul 15, 2024
@spacemesh-bors spacemesh-bors bot closed this Jul 15, 2024
@spacemesh-bors spacemesh-bors bot deleted the fix/hanging-atx-syncer branch July 15, 2024 16:43
poszu added a commit that referenced this pull request Jul 16, 2024
poszu added a commit that referenced this pull request Jul 16, 2024