-
Notifications
You must be signed in to change notification settings - Fork 2.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[processor/tailsampling] Late span age histogram should include sampled traces #37180
Conversation
…decision time) Signed-off-by: Sean Porter <[email protected]>
Signed-off-by: Sean Porter <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it possible to add a unit test for this fix?
I'll give it a go 👍 |
Signed-off-by: Sean Porter <[email protected]>
@fatsheep9146 I adjusted the existing histogram test to touch sampled traces 👍 I confirmed the test fails on main, only half of the late spans (not sampled) are counted:
|
Looks like an unrelated lint error. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This LGTM, but I really, really think the tail-sampling processor is due to a refactor. "Late span" used to refer to spans that arrived after a sampling decision had been made and dropped, so that people could see why traces are incomplete. With the decision caches, the whole idea of late spans need to be reconsidered. Is it still valuable to track it? Why? If so, it should be better documented.
I agree, it is due for a refactor. This PR, along with #37212 and #37189 are immediate improvements in preparation for such a refactor. Next, we can require the decision caches (breaking change) and optimize the current architecture (e.g. batcher and decision tick loop). It's a very small leap which will result in considerable resource savings. From there, we can gather enough data to direct more significant rework. |
…ed traces (open-telemetry#37180) Currently, the [late span age histogram](https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/981e9f5cc58149ea37c8909a4a6eb30faee96351/processor/tailsamplingprocessor/internal/metadata/generated_telemetry.go#L106-L110) only includes non-sampled traces. The histogram should also include sampled traces. This pull request includes anything with a decision time. _Note: The decision cache still wrecks it 😅_ --------- Signed-off-by: Sean Porter <[email protected]> Co-authored-by: Juraci Paixão Kröhling <[email protected]>
Currently, the late span age histogram only includes non-sampled traces. The histogram should also include sampled traces. This pull request includes anything with a decision time.
Note: The decision cache still wrecks it 😅