why jaegerthrifthttpexporter got removed? #4667

Closed
chengtangzheng2021 opened this issue Aug 16, 2021 · 31 comments
Labels
comp:jaeger Jaeger related issues

Comments

@chengtangzheng2021

Is your feature request related to a problem? Please describe.
I used version 29 with jaegerthrifthttpexporter and it works well. When I used the Jaeger gRPC exporter instead, it crashed the Jaeger collector under a high volume of test requests.

Describe the solution you'd like
I tried to upgrade to version 31 and got: unknown exporters type "jaeger_thrift" for jaeger_thrift.
Looking into the code, I found that jaegerthrifthttpexporter is no longer there.

@mx-psi
Member

mx-psi commented Aug 17, 2021

See #4083, cc @jpkrohling

@chengtangzheng2021
Author

#4083 said "Jaeger Agents do not offer the ability to send data to the Jaeger Collector using HTTP Thrift for more than a year now".
Regarding that comment: the otel collector plays the same role as the Jaeger agent, acting as a tracing buffer. Good practice is for the otel collector to send traces directly to the Jaeger collector; with the otel collector in place, there is no need for a Jaeger agent.
So whether the Jaeger agent supports Thrift doesn't matter; only the Jaeger collector needs to support it.

@chengtangzheng2021
Author

{"level":"info","ts":1629211743.3203762,"caller":"server/grpc.go:76","msg":"Starting jaeger-collector gRPC server","grpc.host-port":":14250"}
{"level":"info","ts":1629211743.3203762,"caller":"server/http.go:48","msg":"Starting jaeger-collector HTTP server","http host-port":":14268"}
I just downloaded Jaeger and started it: it starts a Thrift-over-HTTP server on port 14268 and a gRPC server on port 14250.
So the current newest version of the Jaeger collector supports Thrift over HTTP.
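
To check from the host running the otel collector which of the two listeners is actually reachable, a quick TCP probe can help separate "nothing is listening" from "a listener is there but not answering in time". The following is a minimal Python sketch, not part of Jaeger or the collector; the `probe` helper and the `localhost` target are illustrative assumptions, and ports 14250 and 14268 are the ones from the log lines above.

```python
import socket


def probe(host: str, port: int, timeout: float = 2.0) -> str:
    """Classify a single TCP connect attempt.

    Returns "open", "refused", "timeout", or "error: <details>".
    This only checks TCP reachability; it says nothing about whether
    HTTP/2 or gRPC actually works end to end.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # Nothing is accepting connections on that port.
        return "refused"
    except socket.timeout:
        # No answer within the timeout (for example a filtered port).
        return "timeout"
    except OSError as exc:
        # DNS failures, unreachable networks, and similar problems.
        return f"error: {exc}"


if __name__ == "__main__":
    # Hypothetical target: replace "localhost" with the Jaeger collector host.
    for name, port in (("gRPC", 14250), ("Thrift over HTTP", 14268)):
        print(f"{name} port {port}: {probe('localhost', port, timeout=1.0)}")
```

A `refused` result on 14250 next to `open` on 14268 would suggest looking at the gRPC listener or the network path to it rather than at the exporter.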

@jpkrohling
Member

> I used version 29 with jaegerthrifthttpexporter, it is working well, I used jaeger gRPC exporter, it is crashing the jaeger collector at big test request volume.

Could you please define "crashing"? We should fix that instead of reinstating an exporter that isn't the default option in Jaeger itself.

> So whether Jaeger agent support thrift or not, doesn't matter, only need jaeger collector support it

The point was that the gRPC transport between the agent and the collector has been the default for a long time and is tried and tested.

@chengtangzheng2021
Author

The Jaeger collector starts two servers by default: a gRPC server and a Thrift-over-HTTP server.
Having two exporters is much better: if one is not working, we can use the other.
For the gRPC exporter to work well, both the exporter and the Jaeger collector have to be configured well, and there are lots of configuration parameters.
My use of the gRPC exporter is not stable: sometimes it works and sometimes it doesn't. Running a single request works; running a JMeter test does not.

@jpkrohling
Member

> Having two exporters there is much better

I agree with that, but having two exporters means twice the number of exporters to support.

> My use of gRPC exporter is not stable, some time is working, some time is not, run single request is working, run jmeter test is not working.

I'm really more interested in fixing whatever might be broken instead. Can you please share the steps I have to take to reproduce the issues you are having?

@chengtangzheng2021
Author

chengtangzheng2021 commented Aug 19, 2021

2021-08-19T14:25:24.56-0400 [APP/PROC/WEB/0] ERR io.grpc.StatusRuntimeException: DEADLINE_EXCEEDED: deadline exceeded after 9.999074503s. [remote_addr=proxy-aus.xxx.com/10.170.20.117:80]
2021-08-19T14:25:24.56-0400 [APP/PROC/WEB/0] ERR at io.grpc.Status.asRuntimeException(Status.java:535)
2021-08-19T14:25:24.56-0400 [APP/PROC/WEB/0] ERR at io.grpc.stub.ClientCalls$UnaryStreamToFuture.onClose(ClientCalls.java:533)
2021-08-19T14:25:24.56-0400 [APP/PROC/WEB/0] ERR at io.grpc.internal.ClientCallImpl.closeObserver(ClientCallImpl.java:553)
2021-08-19T14:25:24.56-0400 [APP/PROC/WEB/0] ERR at io.grpc.internal.ClientCallImpl.access$300(ClientCallImpl.java:68)
2021-08-19T14:25:24.56-0400 [APP/PROC/WEB/0] ERR at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1StreamClosed.runInternal(ClientCallImpl.java:739)
2021-08-19T14:25:24.56-0400 [APP/PROC/WEB/0] ERR at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1StreamClosed.runInContext(ClientCallImpl.java:718)
2021-08-19T14:25:24.56-0400 [APP/PROC/WEB/0] ERR at io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37)
2021-08-19T14:25:24.56-0400 [APP/PROC/WEB/0] ERR at io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:123)
2021-08-19T14:25:24.56-0400 [APP/PROC/WEB/0] ERR at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
2021-08-19T14:25:24.56-0400 [APP/PROC/WEB/0] ERR at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
2021-08-19T14:25:24.56-0400 [APP/PROC/WEB/0] ERR at java.lang.Thread.run(Thread.java:748)

@chengtangzheng2021
Author

chengtangzheng2021 commented Aug 19, 2021

I got this error when running a simple test request.

@chengtangzheng2021
Author

I now get this error both with and without the batch processor config.
Before, only the JMeter test produced errors and brought the Jaeger collector down.
Some apps using gRPC work, but some do not.

@chengtangzheng2021
Author

2021-08-19T14:35:12.31-0400 [APP/PROC/WEB/1] ERR [otel.javaagent 2021-08-19 18:35:12:311 +0000] [grpc-default-executor-5] WARN io.opentelemetry.exporter.jaeger.JaegerGrpcSpanExporter - Failed to export spans

@chengtangzheng2021
Author

Your comment: "I agree with that, but having two exporters means twice the number of exporters to support."
More exporters make the product more powerful. Without the Jaeger Thrift exporter, our apps will not be able to use the otel collector.

@chengtangzheng2021
Author

The above error is from the javaagent gRPC Jaeger exporter.
Let me try to retrieve the otel collector gRPC exporter error.

@chengtangzheng2021
Author

chengtangzheng2021 commented Aug 19, 2021

this is the error from otel collector gRPC exporter:
2021-06-22T15:36:52.38-0400 [APP/PROC/WEB/0] ERR 2021-06-22T19:36:52.380Z debug jaegerexporter/exporter.go:131 failed to push trace data to Jaeger {"kind": "exporter", "name": "jaeger", "error": "rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing dial tcp 10.176.195.3:14250: connect: connection refused""}
2021-06-22T15:36:52.38-0400 [APP/PROC/WEB/0] ERR 2021-06-22T19:36:52.380Z info exporterhelper/queued_retry.go:314 Exporting failed. Will retry the request after interval. {"kind": "exporter", "name": "jaeger", "error": "failed to push trace data via Jaeger exporter: rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing dial tcp 10.176.195.3:14250: connect: connection refused"", "interval": "11.369891452s"}

This error is sometimes intermittent in the test; it looks like the exporter is making lots of reconnects.
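
The "Will retry the request after interval" log line shows the exporter waiting before it tries again. As a rough illustration of how exponential backoff with jitter spaces out such retries, here is a generic Python sketch. It is not the collector's code; the function name and all parameter values are assumptions chosen for the example.

```python
import random


def backoff_intervals(initial=5.0, multiplier=1.5, max_interval=30.0,
                      randomization=0.5, attempts=6, rng=random.random):
    """Return the wait time in seconds before each retry attempt.

    The base interval grows by `multiplier` after every attempt and is
    capped at `max_interval`. Each returned wait is drawn uniformly from
    [base * (1 - randomization), base * (1 + randomization)] so that many
    clients retrying at once do not all reconnect at the same instant.
    """
    intervals = []
    current = initial
    for _ in range(attempts):
        low = current * (1 - randomization)
        high = current * (1 + randomization)
        intervals.append(low + rng() * (high - low))
        current = min(current * multiplier, max_interval)
    return intervals


if __name__ == "__main__":
    for attempt, wait in enumerate(backoff_intervals(), start=1):
        print(f"retry {attempt}: wait {wait:.2f}s")
```

With a scheme like this, a burst of "connection refused" errors leads to progressively longer pauses rather than a tight reconnect loop, which is one reason intermittent failures can look like long gaps in exported traces.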

@chengtangzheng2021
Author

Our apps use the otel Jaeger Thrift exporter and have passed performance testing; they are going to be deployed to prod within days, so we hope new versions of the otel collector still include it. We are using otel collector version 29.

@chengtangzheng2021
Author

chengtangzheng2021 commented Aug 19, 2021

When I run a JMeter test for a while, the "connect: connection refused" error occurs. Sometimes it brings the Jaeger collector down; other times, after I stop the test, the Jaeger collector is still working.

@chengtangzheng2021
Author

Also, the Node.js OpenTelemetry Jaeger exporter only supports the Thrift protocol. With the otel collector's Jaeger Thrift exporter, no protocol conversion is needed in the collector, which is more efficient.

@chengtangzheng2021
Author

I got errors from both the javaagent gRPC exporter and the otel collector gRPC exporter, so it looks like the issue is that the Jaeger gRPC server is not mature.

@jpkrohling
Member

There's quite a lot to process here, but the TL;DR is: please provide step-by-step instructions on what you are doing, with source code and commands, that reproduce the performance issue. Without that, I can't help you much.

> new version of otel collector still have it there. we are using otel collector version 29

We are not reinstating the Jaeger Thrift exporter unless we have a good reason. So far, this ticket isn't providing a good reason.

> Run Jmeter test, run for a while, the connect: connection refused error will occur. Sometime down the jaeger collector, sometime, stopped the test, the jaeger collector was still working.

Are you saying that the Jaeger Collector is down? Did it crash? Was it killed? How are you running the Jaeger Collector, and with which database? If you are using the in-memory storage, are you setting the --memory.max-traces flag? Are you able to watch the metrics during the test, until the collector shuts down? Please provide information like this to make the ticket actionable.

> And of node js openTelemetry's jaeger export, it only support thrift protocol

This has nothing to do with the Jaeger Thrift exporter in the collector. The collector still has a Jaeger Thrift receiver, so that it can receive data from a Jaeger Agent or OpenTelemetry NodeJS Jaeger exporter.

> looks the issue is that the jaeger gRPC server is not mature.

I'm repeating myself, but please provide a runnable test case demonstrating that. At this point, I'm assuming your Jaeger Collector is misconfigured for your use case.

@chengtangzheng2021
Author

The Jaeger collector is owned by another team; my team just uses their Jaeger collectors. The gRPC exporter issue blocked our development for some weeks. They said their Jaeger collectors are fine and have been running without issues for around a year, but none of their other clients use the otel collector.
We tried the Jaeger Thrift exporter and it works well, so we adopted the otel collector as our solution; otherwise we would not have chosen it.

@jpkrohling
Member

That's fair. In any case, there's still very little that I can use in this ticket. I can try to use tracegen to generate spans to otelcol and export them via gRPC to Jaeger, but I'm sure I've done it in the past.

My request to you once again: please provide a runnable test case demonstrating the problem.

@chengtangzheng2021
Author

My JMeter test also brought the Jaeger GUI dashboard down, so I'm not allowed to run a JMeter test that might bring down their collector again. Previously, when my JMeter test made their Jaeger collector stop working, I was not allowed to use their collector for some time. They said they would set up a new Jaeger collector for me, but they have not done so for months.
We also don't have a story to work on this issue, because the Jaeger Thrift exporter works well: the otel collector using Thrift and all 7 apps are merged into the master branch.

@chengtangzheng2021
Author

chengtangzheng2021 commented Aug 20, 2021

receivers:
  jaeger:
    protocols:
      grpc:
        endpoint: 0.0.0.0:8080
        write_buffer_size: 128000
        read_buffer_size: 128000
  jaeger/2:
    protocols:
      thrift_http:
        endpoint: 0.0.0.0:8081
processors:
  batch:
    send_batch_size: 100000
    timeout: 5s
    send_batch_max_size: 0
  batch/2:
    send_batch_size: 100000
    timeout: 5s
    send_batch_max_size: 0
exporters:
  jaeger_thrift:
    url: http://DTINWR2CSVC01.xxx.com:14268/api/traces
    timeout: 5s
  jaeger_thrift/2:
    url: http://DTINWR2CSVC04.xxx.com:14268/api/traces
    timeout: 5s
service:
  pipelines:
    traces:
      receivers: [jaeger]
      processors: [batch]
      exporters: [jaeger_thrift]
    traces/2:
      receivers: [jaeger/2]
      processors: [batch/2]
      exporters: [jaeger_thrift/2]

This is my config file. With gRPC exporters in place of the Thrift exporters, the JMeter test hit connection refused errors; I tried many parameters, but none had any effect on getting rid of the issue.

@chengtangzheng2021
Author

chengtangzheng2021 commented Aug 23, 2021

There are lots of real-world cases, some that can be managed and some that cannot.
A good product like the otel collector should be usable in all of these cases, not just the simple ones.
Our Jaeger collector may not be configured well, but its owner has worked on it for months and still has not been able to make it work.
The Jaeger Thrift exporter works well and came to the rescue.
I think Jaeger's Thrift HTTP server is more mature than its gRPC server.
With the Jaeger Thrift exporter deleted, we may be forced to redevelop our tracing setup, perhaps looking for a solution that uses the Jaeger agent instead of the otel collector, or just not using otel tracing at all.

@jpkrohling
Member

Please provide a runnable test case demonstrating the problem.

@chengtangzheng2021
Author

The Jaeger collector automatically starts a Thrift HTTP server and a gRPC server.
Even when using the gRPC exporter to send traces to the Jaeger collector, having the Jaeger Thrift exporter available is a very good fallback: any time the gRPC protocol is not working, we can switch to the Jaeger Thrift exporter with just an otel collector config file change.

@chengtangzheng2021
Author

chengtangzheng2021 commented Aug 24, 2021

exporters:
  jaeger:
    endpoint: DTINWR2CSVC01.xxx.com:14250
    insecure: true
  jaeger/2:
    endpoint: DTINWR2CSVC04.xxx:14250
    insecure: true

I changed the config to use the gRPC exporter and ran the JMeter test; the issue did not occur in this test.
It has not been stable; the issue has been intermittent.
The cause may not be the otel gRPC exporter; it may be the Jaeger gRPC server.
The javaagent's gRPC exporter also has the issue, and it is also intermittent.
But the otel Jaeger Thrift exporter, with the same Jaeger collector, works stably and has never produced an error.

@alolita added the comp:jaeger Jaeger related issues label Sep 8, 2021

@ctreatma
Contributor

ctreatma commented Oct 1, 2021

I found this issue while digging into options for the Jaeger exporter. In my case, I need to ship traces from the OpenTelemetry collector to a Jaeger collector, not an agent. Due to limitations of the platform on which we're running our collectors, the Jaeger collector cannot receive HTTP/2 connections, which means it cannot receive gRPC connections, so the removal of the Jaeger Thrift exporter has removed my ability to use the OpenTelemetry collector.

Does the OpenTelemetry collector provide a path for us to create our own Jaeger Thrift exporter and pull it in at runtime?

@jpkrohling
Member

> cannot receive HTTP/2 connections

HTTP/2 is compatible with HTTP 1.1. Can you provide more details about your scenario? What kind of limitations are you encountering?

@ctreatma
Contributor

ctreatma commented Oct 5, 2021

Our Jaeger collectors are running in Cloud Foundry, which does not support HTTP/2 or gRPC. While there is work in progress to add that support, there will likely be a fairly long delay between when a Cloud Foundry version exists that can serve gRPC traffic and when our platforms are upgraded to that version.

cloudfoundry/routing-release#200

@jpkrohling
Member

@ctreatma, that's unfortunate. I'm convinced that we need to reinstate that exporter, with a big warning that it should only be used in corner cases like yours. Would you be open to creating a PR with the code we had?

@ctreatma
Contributor

ctreatma commented Oct 6, 2021

Sounds good. I'll open a PR to restore the exporter & add guidelines for the rare cases when it can be used.

ctreatma added a commit to ctreatma/opentelemetry-collector-contrib that referenced this issue Oct 20, 2021
In some cases, a Jaeger collector may be deployed to a platform
that does not support gRPC.  This restores the jaegerthrifthttpexporter
so that OpenTelemetry users who need to ship traces to a Jaeger
collector that cannot support gRPC are able to ship traces using
the Jaeger collector's [Thrift over HTTP API](https://www.jaegertracing.io/docs/1.27/apis/#thrift-over-http-stable).
ctreatma added a commit to ctreatma/opentelemetry-collector-contrib that referenced this issue Oct 29, 2021
hex1848 pushed a commit to hex1848/opentelemetry-collector-contrib that referenced this issue Jun 2, 2022

5 participants