-
Notifications
You must be signed in to change notification settings - Fork 578
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support Micrometer observation #1017
Conversation
...ain/java/com/rabbitmq/client/observation/micrometer/DefaultConsumeObservationConvention.java
Outdated
Show resolved
Hide resolved
Please note this OT integration - rabbitmq/rabbitmq-dotnet-client#1261. We should aim for same semantic and tags. |
Awesome! Don't hesitate to ping me in the .NET PR as the spec has probably changed a few times (although I've tried to keep up) since I started this work, and let's keep things in sync if you feel the .NET version should name things differently or vice/versa. Would also be great to get some feedback on tags and operation names from @lmolkova @trask and hopefully more people with OTel knowledge if they have the time. |
Hi, I created an md file to synchronize and track - https://github.com/rabbitmq/contribute/blob/main/observability/clients.md. Please take a look and change/add missing parts |
hi! @lmolkova and I would of course recommend instrumenting with OpenTelemetry API 😄 |
@stebet Thanks for chiming in. I had a look at the .NET client PR. It's similar in concepts to this one: we are using an abstraction (ActivitySource / Micrometer) and we are using OTel naming conventions. I just want to check something with you: is the |
@stebet I had a look at the OTel Java instrumentation agent and it uses |
the OpenTelemetry and .NET community aligned and ActivitySource is (part of) the OpenTelemetry API for .NET: https://opentelemetry.io/docs/instrumentation/net/manual/#add-tags-to-an-activity (while Micrometer Tracing is Spring-specific) |
I might miss something, but my understanding was that micrometer-tracing is an abstraction layer over otel or brave. With brave slowly becoming legacy, what's the benefit of having an extra layer of abstraction? |
The
It does not contradict current spec (which is indeed quite vague), so can be done right away. And I hope the OTEP above gives more clarity on implementation |
Yes it is
Where this assumption comes from?
That we control the api and can replace one tracer into another. Micrometer is very serious about not breaking backward compatibility and the history of current tracing projects such as open tracing or open census warns about what can happen when you rely directly on an api. Of course you can use micrometer observation with otels api directly - you don't have to use micrometer tracing |
@@ -111,6 +120,17 @@ private abstract static class IntegrationTest extends SampleTestRunner { | |||
public TracingSetup[] getTracingSetup() { | |||
return new TracingSetup[] {TracingSetup.IN_MEMORY_BRAVE, TracingSetup.ZIPKIN_BRAVE}; | |||
} | |||
|
|||
void runWithNullingObservation(ObservationRegistry registry, Tracer tracer, Observation.CheckedRunnable<?> runnable) { | |||
Observation noParentObservation = Observation.createNotStarted("null_observation", registry); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With Micrometer 1.10.8 I think we can change this whole method to become new NullObservation(registry).observeChecked(runnable)
aebc325
to
c4d656f
Compare
headers, | ||
response.getBody() == null ? 0 : response.getBody().length); | ||
Observation observation = | ||
RabbitMqObservationDocumentation.RECEIVE_OBSERVATION.observation( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is its really intentional to start another observation over here?
You already got that Observation.createNotStarted("rabbitmq.receive")
and its observe()
.
Why do we need this child one, please?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So we have 2 observations here.
rabbitmq.receive
is the observation that is responsible for polling. Some trace (e.g. with id X) might have been initiated earlier and suddenly someone callsbasicGet
therabbitmq.receive
observation will warp an operation of polling a message from the brokerRECEIVE_OBSERVATION
is theReceiverContext
based operation that will continue a trace that got propagated through the messaging headers. That means that in the RabbitMq message there could be another trace (e.g. with id Y)
Let's look at it in the following way:
- Application A initiates trace with id X -> calls some logic -> polls for the message from RabbitMq (rabbitmq.receive)
- Application B initiates trace with id Y -> calls some logic -> sends a message to RabbitMQ with tracing header containing Y -> Application A takes the message from the queue and then continues trace id Y (RECEIVE_OBSERVATION)
Since basicGet
is a synchronous operation we were thinking of giving the user a switch to leave the RECEIVE_OBSERVATION
started, and a scope would be opened and then it's up to the user to stop it and stop the scope. That way they will be able to continue the trace id Y
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But you do that manually here just like this a bit bellow:
observation.start();
observation.stop();
So, yeah, you connect this RECEIVE_OBSERVATION
to the propagated one, but it is closed immediately.
Whatever end-user would do with the propagated trace over header afterward would be just another, parallel child span, not connected to this one. Plus looks this one is going to be a child of that rabbitmq.receive
in the beginning of this method. No? Is the parent overridden from the value from headers?
I may guess that it is OK to have a simple span connected to that propagated trace just to fulfill the the whole picture of services interaction, but is it really an intention? Wouldn't it be better to leave such a propagation up to end-user with respective API?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, yeah, you connect this RECEIVE_OBSERVATION to the propagated one, but it is closed immediately.
Whatever end-user would do with the propagated trace over header afterward would be just another, parallel child span, not connected to this one.
As I said this code
observation.start();
observation.stop();
would have to change to sth like
observation.start();
if (someOptionIsTurnedOn) {
observation.openScope();
} else {
observation.stop();
}
and then the user will have to do sth like this after some listener code gets executed
Scope scope = observationRegistry.getCurrentObservationScope();
if (scope != null) {
scope.close();
scope.getCurrentObservation().stop();
}
Plus looks this one is going to be a child of that rabbitmq.receive in the beginning of this method. No? Is the parent overridden from the value from headers?
Yes, the child will be overridden. So if you started trace X then due to the fact that the message contains X headers, the trace id will continue.
I may guess that it is OK to have a simple span connected to that propagated trace just to fulfill the the whole picture of services interaction, but is it really an intention? Wouldn't it be better to leave such a propagation up to end-user with respective API?
I don't follow, what do you mean exactly?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't follow, what do you mean exactly?
When we propagate a trace over the broker simple start()-stop()
would just add extra span into a trace, but that wouldn't give a chance for end-user to continue that trace downstream when we step into our business logic.
I think your if (someOptionIsTurnedOn) {
answers to my question.
Thanks
Bumps [jackson-databind](https://github.com/FasterXML/jackson) from 2.14.1 to 2.15.0. - [Release notes](https://github.com/FasterXML/jackson/releases) - [Commits](https://github.com/FasterXML/jackson/commits) --- updated-dependencies: - dependency-name: com.fasterxml.jackson.core:jackson-databind dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <[email protected]>
If observation is enabled.
Could be used for tag values.
Based on the OpenTelemetry attributes. https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/trace/semantic_conventions/messaging.md
For Micrometer Observation collector implementation. This is based on what the OTel Java agent exports. The guidelines for attributes may change, but that is a good starting point for now. https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/trace/semantic_conventions/messaging.md
Due to the fact that in the tests we have both the producer and the consumer and they have the same parent, the zipkin graph looks bizarre. With these changes we're changing the test sending code to simulate sending a message from a different service that will result in a creation of a new trace identifier. Due to this we will have 2 sets of trace ids created, one for sending and one for polling. Sending: null_observation -> send -> receive message Polling: test_span -> receive (very short)
a2e7dc5
to
afb39ff
Compare
This way it is possible to keep the second observation for the application processing of the message. The observation will have to be closed by the application in this case.
No description provided.