Add retries for rmq publish #11

Merged
merged 11 commits into from
May 9, 2024

rob-looksrare commented May 8, 2024

Publishing
* graceful restart - OK as is. The commit finishes once the last event has been published.
* kill - OK as is. ord reverts to the previous savepoint, which takes a long time to repair on startup, but once it is up it re-emits the blocks that were not committed, so there is no event loss.
* rmq failed ack - we retry by creating a new connection; if that fails, we gracefully shut down the app and have event loss.
* rmq connection killed - we retry by creating a new connection; if that fails, we gracefully shut down the app and have event loss.
* rmq down - we retry by creating a new connection; if that fails, we gracefully shut down the app and have event loss.
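
The reconnect-then-give-up behaviour in the three rmq cases can be sketched roughly as below. This is an illustration in Python, not the code from this PR: `Publisher`, `PublishFailed`, `connect`, and the retry/backoff parameters are hypothetical names and values, and the channel is reduced to a single `publish` method.

```python
import time
from typing import Callable, Optional, Protocol


class Channel(Protocol):
    """Hypothetical stand-in for a broker channel."""

    def publish(self, payload: bytes) -> None: ...


class PublishFailed(Exception):
    """Raised once every reconnect attempt has failed."""


class Publisher:
    def __init__(
        self,
        connect: Callable[[], Channel],
        max_retries: int = 3,
        base_delay: float = 0.5,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self._connect = connect
        self._max_retries = max_retries
        self._base_delay = base_delay
        self._sleep = sleep
        self._channel: Optional[Channel] = None  # opened lazily

    def publish(self, payload: bytes) -> None:
        last_error: Optional[Exception] = None
        # One initial attempt plus up to max_retries retries.
        for attempt in range(self._max_retries + 1):
            try:
                if self._channel is None:
                    self._channel = self._connect()
                self._channel.publish(payload)
                return
            except ConnectionError as err:
                # Drop the broken channel so the next attempt reconnects.
                self._channel = None
                last_error = err
                if attempt < self._max_retries:
                    # Exponential backoff between attempts (an assumption).
                    self._sleep(self._base_delay * 2 ** attempt)
        raise PublishFailed(
            f"gave up after {self._max_retries + 1} attempts"
        ) from last_error
```

In this sketch the caller would catch `PublishFailed` and start the graceful shutdown, which is the point at which events are lost.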

This should cover most of the cases where we have infra issues.

Reindexing from scratch on the light version takes ~8 hours, so we want to avoid it. In the worst case, when rmq goes down, we need to spin up a new instance and reindex up to the point of failure.

Consumers

* graceful restart - OK as is, the event goes back to the queue.
* kill - OK as is, the event goes back to the queue.
* processing error - we requeue the message up to 3 times, then reject it.
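
The requeue limit for processing errors could be enforced along these lines. This is a sketch, not the consumer code from this PR: `handle_delivery`, `Outcome`, and `MAX_REQUEUES` are hypothetical names, and it assumes the number of earlier delivery attempts is available in an `x-delivery-count` header, which RabbitMQ maintains for quorum queues.

```python
from enum import Enum
from typing import Any, Callable, Mapping

MAX_REQUEUES = 3  # assumed to match the "3x" above


class Outcome(Enum):
    ACK = "ack"          # processed successfully
    REQUEUE = "requeue"  # nack and put back on the queue
    REJECT = "reject"    # nack without requeue (dropped or dead-lettered)


def handle_delivery(
    body: bytes,
    headers: Mapping[str, Any],
    process: Callable[[bytes], None],
    max_requeues: int = MAX_REQUEUES,
) -> Outcome:
    """Run `process` and decide what to tell the broker."""
    try:
        process(body)
    except Exception:
        # A missing header is treated as a first delivery (assumption).
        previous_attempts = int(headers.get("x-delivery-count", 0))
        if previous_attempts < max_requeues:
            return Outcome.REQUEUE
        return Outcome.REJECT
    return Outcome.ACK
```

The function only returns a decision; mapping `REQUEUE` and `REJECT` onto the client library's nack/reject calls is left to the caller.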

rob-looksrare (Author) commented

Before merging this we need to stop mainnet and let the queues drain, so that they can be recreated with "x-queue-type": "quorum".
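
For context, RabbitMQ fixes the queue type when a queue is created, and redeclaring an existing queue with different arguments fails, which is why the queues have to be drained and recreated. A minimal sketch of the declaration arguments, assuming a Python client such as pika; the helper name and the queue name in the usage comment are hypothetical:

```python
from typing import Any, Dict


def quorum_queue_declare_kwargs(name: str) -> Dict[str, Any]:
    """Build keyword arguments for declaring `name` as a quorum queue."""
    return {
        "queue": name,
        # Quorum queues must be durable and cannot be exclusive.
        "durable": True,
        "exclusive": False,
        "auto_delete": False,
        "arguments": {"x-queue-type": "quorum"},
    }


# Hypothetical usage with a pika BlockingChannel:
#   channel.queue_declare(**quorum_queue_declare_kwargs("events"))
```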

@RareBodhi RareBodhi merged commit e06a30b into feat/publish-events-to-rmq May 9, 2024
3 checks passed