Fix duplicate service messages during failover/restart when using multiple services #1703

Merged: 2 commits, Jan 7, 2025


JPWatson (Collaborator)

There is a bug in the Consensus Module: all service messages are appended to the same pending service message queue. The service container sends its service ID as the session ID, whereas the Consensus Module interprets that value as a mask.

When service messages are sent from multiple services, they can be enqueued in different orders.
As a result, pending messages can be skipped or duplicated when a new leader is elected during failover or restart.
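
To illustrate why a single shared queue is fragile, here is a minimal, self-contained sketch. It is not Aeron's code: `PendingQueueSketch`, `Msg`, `sharedQueue`, and `perServiceQueues` are hypothetical names, and it assumes that "different orders" means two cluster members can see different interleavings of messages from two services. Under that assumption, the shared queues disagree between members, while queues keyed by service ID agree.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration only; none of these types exist in Aeron.
final class PendingQueueSketch {
    /** Stand-in for a pending service message: the sending service and its per-service sequence. */
    static final class Msg {
        final int serviceId;
        final int seq;

        Msg(int serviceId, int seq) {
            this.serviceId = serviceId;
            this.seq = seq;
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof Msg && ((Msg) o).serviceId == serviceId && ((Msg) o).seq == seq;
        }

        @Override
        public int hashCode() {
            return 31 * serviceId + seq;
        }
    }

    /** Shape of the bug: every message lands in one queue, in arrival order. */
    static List<Msg> sharedQueue(List<Msg> arrivals) {
        return new ArrayList<>(arrivals);
    }

    /** Shape of the fix (assumed): one queue per service ID, each in that service's own order. */
    static Map<Integer, List<Msg>> perServiceQueues(List<Msg> arrivals) {
        Map<Integer, List<Msg>> queues = new TreeMap<>();
        for (Msg m : arrivals) {
            queues.computeIfAbsent(m.serviceId, id -> new ArrayList<>()).add(m);
        }
        return queues;
    }

    public static void main(String[] args) {
        // Two members observe the same messages, interleaved differently across services.
        List<Msg> memberA = List.of(new Msg(0, 1), new Msg(1, 1), new Msg(0, 2), new Msg(1, 2));
        List<Msg> memberB = List.of(new Msg(1, 1), new Msg(0, 1), new Msg(1, 2), new Msg(0, 2));

        System.out.println("shared queues equal: "
            + sharedQueue(memberA).equals(sharedQueue(memberB)));           // false
        System.out.println("per-service queues equal: "
            + perServiceQueues(memberA).equals(perServiceQueues(memberB))); // true
    }
}
```

If a newly elected leader resumes from a shared queue whose order differs from the previous leader's, the entry it treats as "next" may already have been committed, or may sit behind one that has not, which is consistent with the duplicated and skipped messages described above.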

This change affects users who call cluster.offer() from multiple services.

Upgrade procedure
Those affected will need to do a clean shutdown (with a snapshot) and restart the whole cluster with the fix.

Is this upgrade procedure reasonable? How many people who haven't already worked around the duplicate/skipped message behaviour will be affected by this issue?

…g over with uncommitted pending service messages when running with multiple services.
vyazelenko merged commit 80a93bb into master on Jan 7, 2025
34 checks passed
JPWatson deleted the service-msg-dups branch on January 7, 2025 at 13:46
3 participants