feat: Add an mr_receive_batch_latency_seconds metric #2417

Merged
merged 2 commits into from
Nov 5, 2024

Conversation

alin-at-dfinity

The metric tracks the latency of the `Receiver::recv()` call that `MessageRoutingImpl` makes to obtain the next batch from Consensus.

On a subnet with a block rate of 2.5 blocks/s (one block every 400ms) whose DSM does zero work and finishes immediately, we should consistently observe values around 400ms, since every `recv()` call waits a full block interval. If, on the other hand, the DSM is backlogged and always takes more than 400ms to run, then the observed values should be consistently around zero (because the DSM would be consistently behind Consensus and the next batch would already be waiting).
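
To make the two regimes concrete, here is a minimal sketch using only the Rust standard library. It is not the code from this PR: `timed_recv` and `simulate` are hypothetical names, a plain `std::sync::mpsc` channel stands in for the channel between Consensus and `MessageRoutingImpl`, and the sketch returns the measured durations instead of recording them in a metrics histogram.

```rust
use std::sync::mpsc::{channel, Receiver, RecvError};
use std::thread;
use std::time::{Duration, Instant};

/// Hypothetical helper: blocks on `recv()` and reports how long the call took.
/// A real implementation would presumably observe this duration in a histogram.
fn timed_recv<T>(rx: &Receiver<T>) -> Result<(T, Duration), RecvError> {
    let start = Instant::now();
    let item = rx.recv()?;
    Ok((item, start.elapsed()))
}

/// Hypothetical simulation: a producer thread (standing in for Consensus) sends
/// one batch every `block_interval`, and the consumer (standing in for the DSM)
/// spends `processing` on each batch. Returns the latency of every `recv()` call.
fn simulate(block_interval: Duration, processing: Duration, batches: u32) -> Vec<Duration> {
    let (tx, rx) = channel();
    let producer = thread::spawn(move || {
        for height in 0..batches {
            thread::sleep(block_interval);
            if tx.send(height).is_err() {
                break; // Consumer went away; stop producing.
            }
        }
        // `tx` is dropped here, which ends the consumer loop below.
    });

    let mut latencies = Vec::new();
    while let Ok((_height, latency)) = timed_recv(&rx) {
        latencies.push(latency);
        thread::sleep(processing); // Simulated batch processing time.
    }
    producer.join().unwrap();
    latencies
}
```

Calling `simulate` with zero processing time should yield latencies close to the block interval, while a processing time well above the block interval should yield latencies near zero for every call after the first, mirroring the 400ms and near-zero cases described above.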

ShuoWangNSL left a comment

LGTM

alin-at-dfinity added this pull request to the merge queue Nov 5, 2024
Merged via the queue into master with commit 4991a57 Nov 5, 2024
24 checks passed
alin-at-dfinity deleted the alin/receive_batch_duration-metric branch November 5, 2024 23:26
alin-at-dfinity added a commit that referenced this pull request Nov 7, 2024