Use `tracing` for improved diagnostics #1533
Comments
I'm personally not a fan of forcing this kind of addition into the code. For example, I already strongly dislike the fact that we're using the […] But what I dislike is the "forcing" part. I'm in favour of adding (optional) wrappers around […]
Would a disabled-by-default cargo feature also fulfill this requirement? (Assuming it doesn't add too much of a maintenance burden to the codebase.)
From what I understand, the main additions of tracing spans and events would have to happen in […] I'll try to get up a draft PR to show what I mean.
It would be totally awesome if I could somehow mark the whole execution flow with some generated ID, and then figure out which request triggered what, and how long it spent where. But I imagine it would require a big overhaul of the libp2p-core codebase, which is unfortunate at best. @thomaseizinger what do you have (did? :)) in mind, mark all tracing events with […]
Back when I created this ticket, I was primarily interested in the […] Unfortunately, I never got around to following up with a draft PR 😅 I think it might be possible to achieve this with wrappers around components that are passed to the […] Conceptually, what we need is: […]
Hello, I listened in on the maintainers call today and heard you guys mention this issue. From what I understood, adding tracing should be relatively straightforward and can be done in chunks. I think it was also mentioned that it would be nice to have this tracing specifically for Kademlia, so maybe that's a good place to start? Thomas also provided this PR firezone/firezone#1741 as an example of how he implemented tracing in a different repo, so this could serve as a nice blueprint to work from. I'd really like to work on this feature. I have virtually no experience with the inner workings of libp2p, but maybe a task like this would be a good introduction?
Great! Thank you :) I think a useful place to start is here: https://github.com/libp2p/rust-libp2p/blob/master/swarm/src/connection.rs This is the main state machine that drives each connection. It has the context of which peer we are connected to and what connection ID we assigned to it. I would start by capturing this in an "error" span. Following from that, you can change one of the protocols from […]
We replace `log` with `tracing` across the codebase. Where possible, we now use structured logging instead of templating strings.

`tracing` can also record "spans". A span lasts until it is dropped and describes the entire duration for which it is active. All logs ("events" in `tracing` terms) are hierarchically embedded in all of their parent spans.

We introduce several spans:

- On debug level: one each for `new_outgoing_connection`, `new_incoming_connection` and `new_established_connection`
- On debug level: `Connection::poll`, `Swarm::poll` and `Pool::poll`
- On trace level: `NetworkBehaviour::poll` for each implementation of `NetworkBehaviour`
- On trace level: `ConnectionHandler::poll` for each implementation of (protocol) `ConnectionHandler`s

The idea here is that logging on debug level gives you a decent overview of what the system is doing. You get spans for the duration of connections and can see how often each connection gets polled.

Dropping down to trace level gives you an extremely detailed view of how long each individual `ConnectionHandler` was executed as part of `poll`, which could be used for detailed analysis of how busy certain handlers are.

Most importantly, simply logging on `info` does not give you any spans. We consider `info` to be a good default that should be reasonably quiet.

Resolves #1533.

Pull-Request: #4282.
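The level scheme described in the commit message can be modelled with a small, std-only sketch. It does not use the `tracing` crate or reproduce how a real subscriber filters spans; `Level`, `ActiveSpan`, `visible_spans` and `example_stack` are hypothetical names, and the nesting of the three spans in `example_stack` is an assumption made for illustration.

```rust
/// Verbosity levels, ordered from least to most verbose.
/// The derived ordering follows declaration order: Info < Debug < Trace.
#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord)]
pub enum Level {
    Info,
    Debug,
    Trace,
}

/// A span that is active while an event is emitted (hypothetical model type).
pub struct ActiveSpan {
    pub level: Level,
    pub name: &'static str,
}

/// Names of the spans that stay attached to an event when output is
/// limited to `max_level`. Spans more verbose than `max_level` are dropped.
pub fn visible_spans(stack: &[ActiveSpan], max_level: Level) -> Vec<&'static str> {
    stack
        .iter()
        .filter(|span| span.level <= max_level)
        .map(|span| span.name)
        .collect()
}

/// An assumed nesting, outermost first: a connection span, the poll of that
/// connection, and the poll of one protocol handler inside it.
pub fn example_stack() -> Vec<ActiveSpan> {
    vec![
        ActiveSpan { level: Level::Debug, name: "new_established_connection" },
        ActiveSpan { level: Level::Debug, name: "Connection::poll" },
        ActiveSpan { level: Level::Trace, name: "ConnectionHandler::poll" },
    ]
}
```

In this model, calling `visible_spans(&example_stack(), Level::Info)` yields no spans, `Level::Debug` yields the two connection-level spans, and `Level::Trace` yields all three, which mirrors the intent that `info` stays quiet while lower levels add progressively more context.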
Tracing: A scoped, structured logging and diagnostics system.
rust-libp2p is designed as an async, event-based system. This makes it fairly hard to add rich logs in user land because peer information is not always available. For example, the `InboundUpgrade` trait doesn't know which peer it actually connected to.

It might be worth adding a dependency on the `tracing` crate and creating `Span`s around the calls to these methods that set certain variables in the context.

A downstream crate that also uses tracing would then automatically pick those scopes up, and messages printed to the log file would automatically include this context. For example, logging a message like "Upgrading inbound substream to /foo/bar/1.0.0" would show up as:

In this example, `libp2p` is a tracing scope and `peer_id` is a variable inside that scope.

Scopes also have log levels. A "trace" level scope will not show up as part of an "info" message etc.
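To illustrate how a scope can attach a variable such as `peer_id` to every message logged inside it, here is a minimal, std-only sketch. It deliberately does not use the `tracing` crate's API: `enter_scope`, `ScopeGuard` and `render_event` are hypothetical names, and the rendered line format is an assumption rather than the output of any real subscriber.

```rust
use std::cell::RefCell;

thread_local! {
    // Stack of the scopes currently entered on this thread, outermost first.
    static SCOPES: RefCell<Vec<String>> = RefCell::new(Vec::new());
}

/// Guard for an entered scope; the scope ends when the guard is dropped.
pub struct ScopeGuard(());

/// Enter a scope called `name` that carries `fields` as key/value context.
pub fn enter_scope(name: &str, fields: &[(&str, &str)]) -> ScopeGuard {
    let rendered: Vec<String> = fields
        .iter()
        .map(|(key, value)| format!("{}={}", key, value))
        .collect();
    // Store the scope in the form `name{key=value ...}`.
    let scope = format!("{}{{{}}}", name, rendered.join(" "));
    SCOPES.with(|scopes| scopes.borrow_mut().push(scope));
    ScopeGuard(())
}

impl Drop for ScopeGuard {
    fn drop(&mut self) {
        // Leaving the scope removes its context from later messages.
        SCOPES.with(|scopes| {
            scopes.borrow_mut().pop();
        });
    }
}

/// Render a message prefixed with every scope that is active right now.
pub fn render_event(message: &str) -> String {
    SCOPES.with(|scopes| {
        let scopes = scopes.borrow();
        if scopes.is_empty() {
            message.to_string()
        } else {
            format!("{}: {}", scopes.join(":"), message)
        }
    })
}
```

The code that calls `render_event` never mentions the peer. It only needs to run while a guard returned by `enter_scope("libp2p", &[("peer_id", ...)])` is alive, which is the property that would let a trait like `InboundUpgrade` log with peer context it does not itself hold.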
Note: `tracing` is maintained by the folks from `tokio` but works totally independently of the tokio runtime.