High cardinality metric #11988

Open
Lusitaniae opened this issue Aug 22, 2024 · 4 comments
Labels
community Issues created by community

Comments

@Lusitaniae

Lusitaniae commented Aug 22, 2024

Describe the bug
In Prometheus-based monitoring systems, metrics with high cardinality (a large number of unique label combinations) cause problems.

To Reproduce
When we pull metrics we'll get something like this for each validator:

near_current_validator_stake{account_id="01node.poolv1.near", instance="", job="near", num_expected_blocks="112", num_expected_chunks="704", num_produced_blocks="112", num_produced_chunks="703", public_key="ed25519:5xz7EbcnPqabwoFezdJBxieK8S7XLsdHHuLwM4vLLhFt", shards="1", slashed="false"}

This metric is highlighted in the screenshot below as having high cardinality (manageable for now).
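
To illustrate why this happens, here is a minimal sketch (not nearcore code) that counts distinct series when a changing count is stored in a label versus as the sample value. The helper names `series_key` and `simulate_scrapes` are hypothetical, and the label set is trimmed to two labels for brevity.

```python
def series_key(name, labels):
    """A time series is identified by its metric name plus its full label set."""
    return (name, tuple(sorted(labels.items())))


def simulate_scrapes(num_scrapes):
    """Count the distinct series seen when a count changes on every scrape."""
    as_label = set()
    as_value = set()
    for produced in range(1, num_scrapes + 1):
        # Count stored in a label: the label set differs on every scrape,
        # so each scrape creates a brand-new series.
        as_label.add(series_key(
            "near_current_validator_stake",
            {"account_id": "01node.poolv1.near",
             "num_produced_blocks": str(produced)},
        ))
        # Count stored as the sample value: the label set never changes,
        # so every scrape appends to the same series.
        as_value.add(series_key(
            "near_validator_produced_blocks",
            {"account_id": "01node.poolv1.near"},
        ))
    return len(as_label), len(as_value)


label_series, value_series = simulate_scrapes(112)
print(label_series, value_series)
```

In this simulation, one validator produces as many series as there were distinct count values when the count is a label, and a single series when the count is the value.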

Expected behavior
num_expected_blocks, num_expected_chunks, num_produced_blocks, and num_produced_chunks should each be their own metric instead of a label:

near_validator_expected_blocks{account_id="01node.poolv1.near"} 112
near_validator_expected_chunks{account_id="01node.poolv1.near"} 704
near_validator_produced_blocks{account_id="01node.poolv1.near"} 112
near_validator_produced_chunks{account_id="01node.poolv1.near"} 703
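
As a rough sketch of the proposed shape, the snippet below turns the count labels from the sample above into separate metric lines. It is an illustration only, not how neard builds its metrics; `COUNT_LABELS` and `split_counts` are hypothetical names, and `near_validator_expected_blocks` is assumed by analogy with the other three metric names.

```python
# Mapping from the current label name to the proposed metric name.
# The metric names follow the proposal in this issue; they are not
# existing nearcore metrics.
COUNT_LABELS = {
    "num_expected_blocks": "near_validator_expected_blocks",
    "num_expected_chunks": "near_validator_expected_chunks",
    "num_produced_blocks": "near_validator_produced_blocks",
    "num_produced_chunks": "near_validator_produced_chunks",
}


def split_counts(labels):
    """Emit one exposition line per count, keyed only by account_id."""
    account = labels["account_id"]
    lines = []
    for label, metric in COUNT_LABELS.items():
        value = int(labels[label])  # counts arrive as label strings
        lines.append(f'{metric}{{account_id="{account}"}} {value}')
    return lines


sample = {
    "account_id": "01node.poolv1.near",
    "num_expected_blocks": "112",
    "num_expected_chunks": "704",
    "num_produced_blocks": "112",
    "num_produced_chunks": "703",
}
for line in split_counts(sample):
    print(line)
```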

Screenshots
[Screenshot: cardinality view highlighting near_current_validator_stake]

Version (please complete the following information):

  • nearcore
  • mainnet

Additional context
https://docs.victoriametrics.com/faq/#what-is-high-cardinality

@Lusitaniae
Author

Lusitaniae commented Aug 23, 2024

Alternatively, this could be moved into an external exporter that gathers network-wide metrics from a single place.

Right now every NEAR node we run exposes the same duplicate metrics (if we had 100 nodes, we'd have 100x the exact same metrics everywhere).
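
A very rough sketch of what such an exporter could look like, using only the Python standard library. Everything here is an assumption for illustration: `fetch_validators` is a hypothetical placeholder that returns static data (a real exporter would query one node), the dictionary field names mirror the labels shown above, the metric names follow the proposal in this issue, and the gauge type is a guess.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

COUNT_FIELDS = ("expected_blocks", "expected_chunks",
                "produced_blocks", "produced_chunks")


def fetch_validators():
    """Hypothetical placeholder: a real exporter would query a single node."""
    return [{
        "account_id": "01node.poolv1.near",
        "num_expected_blocks": 112, "num_expected_chunks": 704,
        "num_produced_blocks": 112, "num_produced_chunks": 703,
    }]


def render_network_metrics(validators):
    """Render one gauge per count, labelled only by account_id."""
    lines = []
    for field in COUNT_FIELDS:
        metric = f"near_validator_{field}"
        lines.append(f"# TYPE {metric} gauge")
        for v in sorted(validators, key=lambda v: v["account_id"]):
            account = v["account_id"]
            value = v["num_" + field]
            lines.append(f'{metric}{{account_id="{account}"}} {value}')
    return "\n".join(lines) + "\n"


class MetricsHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path != "/metrics":
            self.send_error(404)
            return
        body = render_network_metrics(fetch_validators()).encode()
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port):
    """Start the exporter; the port is chosen by the caller."""
    HTTPServer(("", port), MetricsHandler).serve_forever()
```

With this layout Prometheus would scrape the network-wide series once from the exporter, rather than once per node.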

@nagisa
Collaborator

nagisa commented Aug 26, 2024

Where are you getting the cardinality screenshot from? It might be useful to keep a reference handy for this.

Though I imagine we could also implement a cardinality check in neard itself, e.g. at the time when those metrics are gathered together in order to respond to a GET /metrics.
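
One possible shape for such a check, sketched in Python against the text exposition format rather than against neard's internals: count sample lines per metric name in a single /metrics response and flag names above a threshold. The function names and the limit are hypothetical, and the sketch only sees cardinality within one scrape, not series churn across scrapes.

```python
from collections import Counter


def series_per_metric(exposition_text):
    """Count sample lines (one per series) for each metric name."""
    counts = Counter()
    for line in exposition_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        # The metric name ends at the first "{" (labels) or space (value).
        name = line.split("{", 1)[0].split(" ", 1)[0]
        counts[name] += 1
    return counts


def over_limit(exposition_text, limit):
    """Return the metric names whose series count exceeds the limit."""
    counts = series_per_metric(exposition_text)
    return sorted(name for name, n in counts.items() if n > limit)


scrape = """\
# TYPE near_current_validator_stake gauge
near_current_validator_stake{account_id="a.near",num_produced_blocks="1"} 10
near_current_validator_stake{account_id="b.near",num_produced_blocks="2"} 20
near_current_validator_stake{account_id="c.near",num_produced_blocks="3"} 30
near_block_height 42
"""
print(over_limit(scrape, limit=2))
```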

@Lusitaniae
Author

The dashboard is from vmui, https://docs.victoriametrics.com/#vmui (part of VictoriaMetrics, a Prometheus-compatible monitoring system).

There are also projects like https://github.com/thought-machine/prometheus-cardinality-exporter that can monitor this.

@Lusitaniae
Author

I think in the end a near_exporter that provides network-wide metrics is probably best.

@telezhnaya telezhnaya added the community Issues created by community label Sep 27, 2024