-
Notifications
You must be signed in to change notification settings - Fork 784
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Merged by Bors] - collect bandwidth metrics per transport #4805
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A bit of a hack, but neat!
Long term, I would consider a proper solution in rust-libp2p the cleaner solution, especially in case you ever introduce a third transport.
That said, for now, ...
beacon_node/network/src/metrics.rs
Outdated
pub static ref INBOUND_LIBP2P_QUIC_BYTES: Result<IntGauge> = | ||
try_create_int_gauge("libp2p_inbound_quic_bytes", "The inbound quic bandwidth over libp2p"); | ||
|
||
pub static ref INBOUND_LIBP2P_TCP_BYTES: Result<IntGauge> = | ||
try_create_int_gauge("libp2p_inbound_tcp_bytes", "The inbound tcp bandwidth over libp2p"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pub static ref INBOUND_LIBP2P_QUIC_BYTES: Result<IntGauge> = | |
try_create_int_gauge("libp2p_inbound_quic_bytes", "The inbound quic bandwidth over libp2p"); | |
pub static ref INBOUND_LIBP2P_TCP_BYTES: Result<IntGauge> = | |
try_create_int_gauge("libp2p_inbound_tcp_bytes", "The inbound tcp bandwidth over libp2p"); | |
pub static ref INBOUND_LIBP2P_QUIC_BYTES: Result<IntGauge> = | |
try_create_int_gauge("libp2p_inbound_bytes", "The inbound bandwidth over libp2p"); |
Best practice would be to use a metric family for all bandwidth, differentiated by a Prometheus label, e.g. transport="tcp|quic"
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah makes sense thanks, went ahead and also put the direction into the same metric
beacon_node/network/src/metrics.rs
Outdated
@@ -218,13 +218,30 @@ lazy_static! { | |||
/* | |||
* Bandwidth metrics | |||
*/ | |||
pub static ref INBOUND_LIBP2P_QUIC_BYTES: Result<IntGauge> = | |||
try_create_int_gauge("libp2p_inbound_quic_bytes", "The inbound quic bandwidth over libp2p"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This metrics only ever goes up, right? Shouldn't it be a counter instead of a gauge?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think with counters we have to increase the counter, with gauges we can set the value. I think @jxs has to set the value based on the bandwidth at regular intervals. I'll let him comment tho.
Agree that a counter would be better if we can set it easily.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You may use Counter::inner
. Note that using a Gauge in a place where a Counter is correct might result in problems e.g. around Grafana's query suggestions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yup it was because of @AgeManning's arguments ehe, managed to switch to a Counter
and reset
and inc_by_value
thanks Max, ptal
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice, this is very helpful.
Agree with Max's comments
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: typos
Co-authored-by: Jack McPherson <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cool. Again, long term I think a solution in rust-libp2p would be much cleaner. That said, this should provide us with the numbers we need to evaluate QUIC. 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, looks good to me. Lets go
Just waiting for imminent deneb merge |
let's go! |
## Issue Addressed Following the conversation on libp2p/rust-libp2p#3666 the changes introduced in this PR will allow us to give more insights if the bandwidth limitations happen at the transport level, namely if quic helps vs yamux and it's [window size limitation](libp2p/rust-yamux#162) or if the bottleneck is at the gossipsub level. ## Proposed Changes introduce new quic and tcp bandwidth metric gauges. cc @mxinden (turned out to be easier, Thomas gave me a hint)
Pull request successfully merged into unstable. Build succeeded! The publicly hosted instance of bors-ng is deprecated and will go away soon. If you want to self-host your own instance, instructions are here. If you want to switch to GitHub's built-in merge queue, visit their help page.
|
Issue Addressed
Following the conversation on libp2p/rust-libp2p#3666 the changes introduced in this PR will allow us to give more insights if the bandwidth limitations happen at the transport level, namely if quic helps vs yamux and it's window size limitation or if the bottleneck is at the gossipsub level.
Proposed Changes
introduce new quic and tcp bandwidth metric gauges.
cc @mxinden (turned out to be easier, Thomas gave me a hint)