feat(prom): expose cluster id in identity #3554

Merged
merged 1 commit into rabbitmq:master from expose-cluster-id
Oct 12, 2021

Conversation

johanneswuerbach
Contributor

@johanneswuerbach johanneswuerbach commented Oct 11, 2021

Proposed Changes

Expose the persistent cluster id as a label under rabbitmq_identity_info to reliably detect how many nodes are part of a cluster. Unlike the cluster_name, which can be the same across multiple clusters, the persistent cluster id is unique per cluster and makes it possible to reliably detect split-brain scenarios.
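As a rough illustration of how a consumer might read the new label, here is a minimal Python sketch that extracts the labels of `rabbitmq_identity_info` from a Prometheus text exposition. The label name `rabbitmq_cluster_permanent_id`, the other label names, and all sample values are assumptions made for this example; the final field name is still under discussion in this PR.

```python
import re

# Hypothetical scrape output. The label names and values below are
# assumptions for illustration, not taken from this PR.
SAMPLE = (
    'rabbitmq_identity_info{rabbitmq_node="rabbit@node-0",'
    'rabbitmq_cluster="prod",'
    'rabbitmq_cluster_permanent_id="rabbitmq-cluster-id-aaaa"} 1\n'
)

# Matches name="value" pairs, allowing backslash-escaped characters in values.
LABEL_RE = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')


def parse_identity_labels(metrics_text):
    """Return the labels of the first rabbitmq_identity_info sample, or None."""
    for line in metrics_text.splitlines():
        if line.startswith("rabbitmq_identity_info{"):
            return dict(LABEL_RE.findall(line))
    return None


labels = parse_identity_labels(SAMPLE)
```

A production setup would more likely use a Prometheus client parser or a query against the Prometheus server; the regular expression here only keeps the sketch dependency-free.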

Types of Changes

What types of changes does your code introduce to this project?
Put an x in the boxes that apply

  • Bug fix (non-breaking change which fixes issue #NNNN)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause an observable behavior change in existing systems)
  • Documentation improvements (corrections, new content, etc)
  • Cosmetic change (whitespace, formatting, etc)
  • Build system and/or CI

Checklist

Put an x in the boxes that apply.
You can also fill these out after creating the PR.
If you're unsure about any of them, don't hesitate to ask on the mailing list.
We're here to help!
This is simply a reminder of what we are going to look for before merging your code.

  • I have read the CONTRIBUTING.md document
  • I have signed the CA (see https://cla.pivotal.io/sign/rabbitmq)
  • I have added tests that prove my fix is effective or that my feature works
  • All tests pass locally with my changes
  • If relevant, I have added necessary documentation to https://github.com/rabbitmq/rabbitmq-website
  • If relevant, I have added this change to the first version(s) in release-notes that I expect to introduce it

Further Comments

If this is a relatively large or complex change, kick off the discussion by explaining why you chose the solution you did and what alternatives you considered, etc.

@michaelklishin
Member

That ID was meant to only be used internally. I don't mind it being exposed too much as it is the only non-volatile value of this kind.

@johanneswuerbach
Contributor Author

johanneswuerbach commented Oct 11, 2021

We recently suffered a case where a 3-node cluster split into 3 single-node clusters, and there currently seems to be no easy way to alert on this using Prometheus.

I'm aware of #2508, but having the cluster id exposed would have been a really good indicator for us that we were in a broken state. In our case each node had a different one despite having the same configured cluster name.
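The check described here (same cluster name, different cluster ids) could be sketched in Python roughly as follows. The function name `find_split_clusters` is hypothetical, and the label names `rabbitmq_cluster` and `rabbitmq_cluster_permanent_id` are assumptions for illustration; the input is one label dict per scraped node.

```python
from collections import defaultdict


def find_split_clusters(identities):
    """Return {cluster_name: set_of_ids} for names that report more than one id.

    `identities` is an iterable of label dicts, one per scraped node.
    The label names used here are assumptions, not confirmed by this PR.
    """
    ids_by_name = defaultdict(set)
    for labels in identities:
        name = labels["rabbitmq_cluster"]
        ids_by_name[name].add(labels["rabbitmq_cluster_permanent_id"])
    # A healthy cluster reports exactly one id across all of its nodes.
    return {name: ids for name, ids in ids_by_name.items() if len(ids) > 1}


# Hypothetical data: three nodes that share a name but each formed its own cluster.
nodes = [
    {"rabbitmq_cluster": "prod", "rabbitmq_cluster_permanent_id": "id-a"},
    {"rabbitmq_cluster": "prod", "rabbitmq_cluster_permanent_id": "id-b"},
    {"rabbitmq_cluster": "prod", "rabbitmq_cluster_permanent_id": "id-c"},
]
split = find_split_clusters(nodes)
```

With the sample data, `split` maps `"prod"` to the three distinct ids, which is the condition one would alert on. Note that the check assumes a cluster name is not intentionally shared by separate clusters; if it is, grouping by name would raise false alarms.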

Member

@michaelklishin michaelklishin left a comment

Many team members see no issue in adopting this but I'd like to improve field naming before we do.

@gerhard
Contributor

gerhard commented Oct 12, 2021

This is a good one, thank you for contributing 🚀

@michaelklishin michaelklishin merged commit e087403 into rabbitmq:master Oct 12, 2021
michaelklishin added a commit that referenced this pull request Oct 12, 2021
feat(prom): expose cluster id in identity (backport #3554)
michaelklishin added a commit that referenced this pull request Oct 12, 2021
feat(prom): expose cluster id in identity (backport #3554) (backport #3561)
@johanneswuerbach johanneswuerbach deleted the expose-cluster-id branch October 12, 2021 19:05
3 participants