Write documentation about how to use tracing filters #970
Comments
This PR contains three major code changes:

- Update all of our tracing dependencies to the latest versions.
- Update `tracing-limit` to use `tracing_subscriber`'s `Layer` trait.
- Instrument the main code base with rate limited tracing events.

This change introduces the ability for tracing events to be rate limited. The primary reason for including this is for events that may be important for a user to see but may arrive in large bursts. Such bursts can saturate IO and, in general, make it difficult to diagnose the issue.

The solution is to allow tracing events to provide a `rate_limit_secs` attribute that specifies a window during which we will only see one log and the rest will be counted. This lets users know that the event is being recorded without flooding their view into Vector.

```rust
INFO basic: hello, world! count=0 rate_limit_secs=5
TRACE basic: this field is not rate limited!
INFO basic: "hello, world!" is being rate limited. rate_limit_secs=5
TRACE basic: this field is not rate limited!
TRACE basic: this field is not rate limited!
TRACE basic: this field is not rate limited!
TRACE basic: this field is not rate limited!
INFO basic: 5 "hello, world!" events were rate limited. rate_limit_secs=5
```

Closes #806
Related to #970

Signed-off-by: Lucio Franco <[email protected]>
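The windowing behavior described in this commit message can be sketched roughly as follows. This is a minimal, standard-library-only illustration and not the actual `tracing-limit` code: the `RateLimiter`, `Window`, and `Outcome` names are hypothetical, the event is keyed by a plain string, and the current time is passed in as whole seconds so the logic is easy to follow.

```rust
use std::collections::HashMap;

/// What the caller should do with an incoming event (hypothetical type).
#[derive(Debug, PartialEq)]
enum Outcome {
    /// Log the event normally; it opens a new window.
    Emit,
    /// First suppressed event in a window: log an "is being rate limited" notice.
    Notice,
    /// Later suppressed events in the same window: count them silently.
    Suppressed,
    /// The window has elapsed: report how many events were suppressed.
    Summary(u64),
}

/// Per-event bookkeeping for the current window.
struct Window {
    started_at: u64,
    suppressed: u64,
}

/// Assumed shape of a limiter; one window is tracked per event key.
struct RateLimiter {
    rate_limit_secs: u64,
    windows: HashMap<String, Window>,
}

impl RateLimiter {
    fn new(rate_limit_secs: u64) -> Self {
        Self { rate_limit_secs, windows: HashMap::new() }
    }

    /// `key` identifies the event; `now` is the current time in seconds.
    fn check(&mut self, key: &str, now: u64) -> Outcome {
        let limit = self.rate_limit_secs;
        match self.windows.get_mut(key) {
            // First time this event is seen: let it through and open a window.
            None => {
                self.windows.insert(
                    key.to_string(),
                    Window { started_at: now, suppressed: 0 },
                );
                Outcome::Emit
            }
            Some(window) => {
                if now.saturating_sub(window.started_at) >= limit {
                    // Window elapsed: start a new one and report the old count.
                    let suppressed = window.suppressed;
                    window.started_at = now;
                    window.suppressed = 0;
                    if suppressed > 0 {
                        Outcome::Summary(suppressed)
                    } else {
                        Outcome::Emit
                    }
                } else {
                    // Still inside the window: count instead of logging.
                    window.suppressed += 1;
                    if window.suppressed == 1 {
                        Outcome::Notice
                    } else {
                        Outcome::Suppressed
                    }
                }
            }
        }
    }
}
```

With `RateLimiter::new(5)`, the first `check` for a key yields `Outcome::Emit`, the next call inside the window yields `Outcome::Notice`, further calls yield `Outcome::Suppressed`, and the first call at or after five seconds yields `Outcome::Summary` with the suppressed count. Whether the event that triggers the summary is itself logged is left to the caller in this sketch.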
Note: this should be placed in https://docs.vector.dev/usage/administration/monitoring and possibly in https://docs.vector.dev/usage/guides/troubleshooting.
* feat(observability): Add rate limited debug messages
Update on this issue, I have been working with the other |
We could also update https://vector.dev/docs/reference/cli/#vector_log |
Motivation
We should provide a good experience when users are trying to inspect why Vector may have failed. Vector should provide methods by which developers/users can extract extensive debug information on specific components.
Proposal
Write extensive documentation on how users can use the `LOG` environment variable to dig deeper into the logs of Vector to diagnose certain issues. These issues can range from high level topology issues to low level tokio task issues.
Using the `LOG` environment variable should follow the way directives are set up within `tracing-subscriber`. These filters follow roughly what https://docs.rs/env_logger/0.7.0/env_logger/ provides but with additional features to support spans. This additional support is important to allow us to dig deeper into specific components. Possible questions that can be answered are: Why are all my sinks failing? Why did my http sink fail to gzip my body? Why didn't my source task wake up the inner task? All of these questions can be answered by providing the correct filter.
Example
This will only provide trace logs for events created within the vector crate and at the trace level.
This will provide debug logs for just the "http" typed sinks.
This will provide debug logs for our HTTP client from the `hyper` crate for just the "http" typed sinks.
Blockers
Only one issue blocks support for all of these features: tokio-rs/tracing#367.
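To make the directive shape from the Proposal concrete: `tracing-subscriber`'s filter directives take the general form `target[span{field=value}]=level`. The sketch below is a toy, standard-library-only parser for that shape, included only to show how the pieces of a directive fit together. It is not `tracing-subscriber`'s implementation, and the `Directive` struct, the `parse_directive` and `non_empty` functions, the `sink` span name, and the `type` field are all assumptions made for illustration. The three strings in `EXAMPLES` are guesses at directives matching the three cases in the Example section, not confirmed Vector filters.

```rust
/// Hypothetical directives for the three cases in the Example section:
/// trace logs for the vector crate, debug logs for "http" typed sinks,
/// and debug logs from hyper inside "http" typed sinks.
const EXAMPLES: [&str; 3] = [
    "vector=trace",
    "vector[sink{type=http}]=debug",
    "hyper[sink{type=http}]=debug",
];

/// The parts of one `target[span{field=value}]=level` directive.
#[derive(Debug, PartialEq)]
struct Directive {
    target: Option<String>,
    span: Option<String>,
    field: Option<(String, String)>,
    level: String,
}

/// Turn an empty string into `None`, anything else into an owned `Some`.
fn non_empty(s: &str) -> Option<String> {
    if s.is_empty() { None } else { Some(s.to_string()) }
}

/// Toy parser: accepts only directives that end in `=level`.
fn parse_directive(input: &str) -> Option<Directive> {
    const LEVELS: [&str; 6] = ["trace", "debug", "info", "warn", "error", "off"];

    // The level follows the last '='; anything else there is rejected.
    let (selector, level) = input.rsplit_once('=')?;
    if !LEVELS.contains(&level) {
        return None;
    }

    // Optional `[span{field=value}]` section after the target.
    let (target, span_part) = match selector.split_once('[') {
        Some((target, rest)) => (target, Some(rest.strip_suffix(']')?)),
        None => (selector, None),
    };

    let (span, field) = match span_part {
        None => (None, None),
        Some(part) => match part.split_once('{') {
            Some((name, rest)) => {
                let (key, value) = rest.strip_suffix('}')?.split_once('=')?;
                (non_empty(name), Some((key.to_string(), value.to_string())))
            }
            None => (non_empty(part), None),
        },
    };

    Some(Directive {
        target: non_empty(target),
        span,
        field,
        level: level.to_string(),
    })
}
```

Under these assumptions, parsing `hyper[sink{type=http}]=debug` splits into the target `hyper`, the span `sink`, the field pair `type`/`http`, and the level `debug`, which mirrors how such a filter would narrow output to one crate inside one kind of component.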