-
Notifications
You must be signed in to change notification settings - Fork 2.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix OpenTracing trace context propagation with Fault Tolerance @Asynchronous methods #7987
Conversation
c599c08
to
568e411
Compare
Hmm, the failing test passes for me... I'll dig into it. |
Hmm, it seems the test is quite flaky :-( |
I can reproduce the test failure locally when I let the test repeat 1000 times. It typically takes cca 400 attempts for the failure to occur. Not terribly fun to debug and I suspect there's some thread safety issue buried deep in ... I don't know where :-) |
Not sure if it makes sense to backport that one, given the tests are flaky? |
I would love to have this backported, but the test issue indeed needs to be resolved first. |
I tracked this down to a SmallRye Fault Tolerance threading issue: smallrye/smallrye-fault-tolerance#208 Will update this PR when new SmallRye Fault Tolerance is released. |
568e411
to
7235832
Compare
This is now ready. The failing test is unrelated: The test that was previously failing ( |
@jmartisk could you have a look? Thanks! |
JaegerGauge gauge = registry.register(meta(name, MetricType.GAUGE), new JaegerGauge(), toTagArray(tags)); | ||
Tag[] tagArray = toTagArray(tags); | ||
JaegerGauge gauge = (JaegerGauge) registry.getGauges().get(new MetricID(name, tagArray)); | ||
if (gauge == null) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After #7018 we automatically drop and re-create all metric registries during shutdown before a hot reload, so it should not happen anymore that the gauge is still registered. Are you sure we still need this? That would mean the registry dropping is wrong?!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've seen an exception during hot reload on this. I'll try again.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, this is still required. When I revert this change, I get this on hot reload:
org.jboss.resteasy.spi.UnhandledException: java.lang.RuntimeException: No reflection exceptions should be thrown unless there is a fundamental error in your code set up.
at org.jboss.resteasy.core.ExceptionHandler.handleException(ExceptionHandler.java:381)
at org.jboss.resteasy.core.SynchronousDispatcher.writeException(SynchronousDispatcher.java:216)
at org.jboss.resteasy.core.SynchronousDispatcher.invoke(SynchronousDispatcher.java:515)
at org.jboss.resteasy.core.SynchronousDispatcher.lambda$invoke$4(SynchronousDispatcher.java:259)
at org.jboss.resteasy.core.SynchronousDispatcher.lambda$preprocess$0(SynchronousDispatcher.java:160)
at org.jboss.resteasy.core.interception.jaxrs.PreMatchContainerRequestContext.filter(PreMatchContainerRequestContext.java:362)
at org.jboss.resteasy.core.SynchronousDispatcher.preprocess(SynchronousDispatcher.java:163)
at org.jboss.resteasy.core.SynchronousDispatcher.invoke(SynchronousDispatcher.java:245)
at io.quarkus.resteasy.runtime.standalone.RequestDispatcher.service(RequestDispatcher.java:73)
at io.quarkus.resteasy.runtime.standalone.VertxRequestHandler.dispatch(VertxRequestHandler.java:122)
at io.quarkus.resteasy.runtime.standalone.VertxRequestHandler.access$000(VertxRequestHandler.java:36)
at io.quarkus.resteasy.runtime.standalone.VertxRequestHandler$1.run(VertxRequestHandler.java:87)
at io.quarkus.runtime.CleanableExecutor$CleaningRunnable.run(CleanableExecutor.java:231)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at org.jboss.threads.ContextClassLoaderSavingRunnable.run(ContextClassLoaderSavingRunnable.java:35)
at org.jboss.threads.EnhancedQueueExecutor.safeRun(EnhancedQueueExecutor.java:2027)
at org.jboss.threads.EnhancedQueueExecutor$ThreadBody.doRunTask(EnhancedQueueExecutor.java:1551)
at org.jboss.threads.EnhancedQueueExecutor$ThreadBody.run(EnhancedQueueExecutor.java:1442)
at org.jboss.threads.DelegatingRunnable.run(DelegatingRunnable.java:29)
at org.jboss.threads.ThreadLocalResettingRunnable.run(ThreadLocalResettingRunnable.java:29)
at java.lang.Thread.run(Thread.java:748)
at org.jboss.threads.JBossThread.run(JBossThread.java:479)
Caused by: java.lang.RuntimeException: No reflection exceptions should be thrown unless there is a fundamental error in your code set up.
at io.jaegertracing.internal.metrics.Metrics.createMetrics(Metrics.java:76)
at io.jaegertracing.internal.metrics.Metrics.<init>(Metrics.java:32)
at io.jaegertracing.internal.metrics.Metrics.<init>(Metrics.java:28)
at io.jaegertracing.Configuration.getTracerBuilder(Configuration.java:221)
at io.quarkus.jaeger.runtime.QuarkusJaegerTracer.tracer(QuarkusJaegerTracer.java:40)
at io.quarkus.jaeger.runtime.QuarkusJaegerTracer.buildSpan(QuarkusJaegerTracer.java:59)
at io.opentracing.util.GlobalTracer.buildSpan(GlobalTracer.java:133)
at io.quarkus.smallrye.opentracing.runtime.TracerProducer_ProducerMethod_tracer_96dadb3d6afa0cccadfe742c3e06ad433737c844_ClientProxy.buildSpan(TracerProducer_ProducerMethod_tracer_96dadb3d6afa0cccadfe742c3e06ad433737c844_ClientProxy.zig:238)
at io.opentracing.contrib.jaxrs2.server.ServerTracingFilter.filter(ServerTracingFilter.java:69)
at org.jboss.resteasy.core.interception.jaxrs.PreMatchContainerRequestContext.filter(PreMatchContainerRequestContext.java:310)
at org.jboss.resteasy.core.ResourceMethodInvoker.invokeOnTarget(ResourceMethodInvoker.java:439)
at org.jboss.resteasy.core.ResourceMethodInvoker.invoke(ResourceMethodInvoker.java:400)
at org.jboss.resteasy.core.ResourceMethodInvoker.invoke(ResourceMethodInvoker.java:374)
at org.jboss.resteasy.core.ResourceMethodInvoker.invoke(ResourceMethodInvoker.java:67)
at org.jboss.resteasy.core.SynchronousDispatcher.invoke(SynchronousDispatcher.java:488)
... 20 more
Caused by: java.lang.IllegalArgumentException: A metric with metricID MetricID{name='jaeger_tracer_reporter_queue_length', tags=[]} already exists
at io.smallrye.metrics.MetricsRegistryImpl.register(MetricsRegistryImpl.java:131)
at io.quarkus.jaeger.runtime.QuarkusJaegerMetricsFactory.createGauge(QuarkusJaegerMetricsFactory.java:48)
at io.jaegertracing.internal.metrics.Metrics.createMetrics(Metrics.java:66)
... 34 more
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jmartisk Any idea? Thanks! :-)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's fishy, I'll try to reproduce this and let you know
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seriously? 😮 😁
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I mean, I would probably be able to do all that, but that would make this PR pretty large. I just wanted to fix the trace context propagation, found a dev mode issue, included a fix which was tiny, and now this :-/
Here's my proposal: I already have finished the conditional inclusion/exclusion of metrics based on the capability. I have also changed quarkus-jaeger
to no longer depend on smallrye-metrics
and microprofile-metrics-api
; instead, it uses an optional
dependency on quarkus-smallrye-metrics
as you suggest. It seems to work (didn't try native 😁). What's is not done is addressing your 2nd comment, the one starting with "And what would be absolutely optimal" -- I'd leave that for future. Agree?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, that's clearly not the goal of this PR.
I created #8502 for that
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice, thanks! I'll update this PR in a few minutes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PR rebased and updated per discussion above.
…hronous methods There used to be a test for that, but didn't involve `@Asynchronous` methods, so was in fact useless. This commit changes the test to actually call an `@Asynchronous` method, which revealed an issue. The `smallrye-fault-tolerance-tracing-propagation` dependency was missing, so trace context propagation didn't actually work. Then, the test also uncovered an issue in SmallRye Fault Tolerance (see smallrye/smallrye-fault-tolerance#208), which is why this commit also brings a new version of it. In addition to that, this commit also adds a test for tracing JDBC.
The Jaeger extension used to collect metrics unconditionally, even when the SmallRye Metrics extension wasn't present. With this commit, the Jaeger extension uses `NoopMetricsFactory` if SmallRye Metrics are missing, and only uses the real metrics factory if SmallRye Metrics are present. This is not totally consistent with how metrics are enabled/disabled in all other extensions. Fixing that is left for future.
7235832
to
1b676f3
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Approving, per Jan.
There used to be a test for that, but didn't involve
@Asynchronous
methods, so was in fact useless. This commit changes the test to
actually call an
@Asynchronous
method, which revealed an issue.The
smallrye-fault-tolerance-tracing-propagation
dependency wasmissing, so trace context propagation didn't actually work.
Then, the test also uncovered an issue in SmallRye Fault Tolerance
(see smallrye/smallrye-fault-tolerance#208),
which is why this commit also brings a new version of it.
In addition to that, this commit also fixes a devmode issue where
on hot reload, the Jaeger extension tried to register a gauge
that was already present. That ended up with an exception.
And finally, this commit also adds a test for tracing JDBC.
Resolves #7981.
Resolves #7982.
Resolves #7984.