Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(gateway): trace context header support #256

Merged
merged 4 commits into from
Apr 11, 2023
Merged

Conversation

hacdias
Copy link
Member

@hacdias hacdias commented Apr 4, 2023

Ref.

This PR:

  • Adds tracing sub-package based on Kubo's in order to support OTEL_TRACES_EXPORTER easily across the stack. This may be removed in the future, once opentelemetry-go supports it by default.
  • Add documentation in docs/tracing.md on how to use tracing including a script on how to generate a traceparent header.
    • In this document, I show how to generate a traceparent header that can be passed onto a request. Then, the trace id from this header can be used to find the request in Jaeger. We may want to open an issue to investigate a way of "back-propagating" a tracing id to the user, such that the user can get the header without having to generate it. Perhaps trace-id. I would avoid using the exact same headers in Trace Context spec as per https://www.w3.org/TR/trace-context/#other-risks
  • Updates the gateway examples to support tracing.
  • Kubo PR: feat: boxo tracing and traceparent support kubo#9811
    • Changes Kubo to use the helper provided here, instead of duplicating code.
  • Bifrost Gateway: feat: trace context traceparent header support ipfs-inactive/bifrost-gateway#74
    • Adds tracing for the first time in Bifrost Gateway, using the functions defined here.

@codecov
Copy link

codecov bot commented Apr 4, 2023

Codecov Report

Merging #256 (e572aaf) into main (999d939) will decrease coverage by 0.79%.
The diff coverage is 23.56%.

Impacted file tree graph

@@            Coverage Diff             @@
##             main     #256      +/-   ##
==========================================
- Coverage   48.13%   47.35%   -0.79%     
==========================================
  Files         270      270              
  Lines       33034    33115      +81     
==========================================
- Hits        15901    15680     -221     
- Misses      15470    15757     +287     
- Partials     1663     1678      +15     
Impacted Files Coverage Δ
examples/gateway/car/main.go 11.53% <0.00%> (-3.10%) ⬇️
examples/gateway/proxy/main.go 0.00% <0.00%> (ø)
tracing/exporters.go 0.00% <0.00%> (ø)
tracing/file_exporter.go 0.00% <0.00%> (ø)
examples/gateway/proxy/routing.go 11.49% <25.00%> (+1.97%) ⬆️
examples/gateway/common/tracing.go 66.66% <66.66%> (ø)
examples/gateway/common/handler.go 94.11% <100.00%> (+0.56%) ⬆️
examples/gateway/proxy/blockstore.go 47.88% <100.00%> (-1.43%) ⬇️

... and 16 files with indirect coverage changes

@hacdias hacdias force-pushed the issue/bifrost-gateway/68 branch 2 times, most recently from 4614ea9 to 53a2de0 Compare April 4, 2023 10:49
gateway/metrics.go Outdated Show resolved Hide resolved
@hacdias hacdias marked this pull request as ready for review April 4, 2023 11:15
@hacdias hacdias requested review from lidel and a team as code owners April 4, 2023 11:15
@hacdias hacdias self-assigned this Apr 4, 2023
Jorropo
Jorropo previously requested changes Apr 4, 2023
Copy link
Contributor

@Jorropo Jorropo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Simpler and clearer.

examples/gateway/common/tracing.go Outdated Show resolved Hide resolved
examples/gateway/proxy/main.go Outdated Show resolved Hide resolved
gateway/metrics.go Outdated Show resolved Hide resolved
examples/gateway/common/handler.go Show resolved Hide resolved
examples/gateway/common/handler.go Show resolved Hide resolved
@Jorropo
Copy link
Contributor

Jorropo commented Apr 4, 2023

I think we are getting otel wrong, we set globals and force tracing onto everyone.

I belive we should have options where consumers must pass a tracer or propagator or ... in order to enable tracing, else it does nothing. Doing dependency injection instead of setting globals.

@hacdias
Copy link
Member Author

hacdias commented Apr 4, 2023

@Jorropo did you run the example tests with your suggestions? The header is then not present. If you manage to make it work, I will apply your suggestions.

I tried exactly what you did first, and it didn't work. That's why I did it the way I did.

@Jorropo
Copy link
Contributor

Jorropo commented Apr 4, 2023

@hacdias no but we dont expect to see a header.

We want the gateway to be a children of Saturn, Nginx front end, ... this needs to accept not emit traceparent headers.

fwiw this already work for cmds in Kubo.

@hacdias
Copy link
Member Author

hacdias commented Apr 4, 2023

@Jorropo hmm... how can we then test if it's working properly? Afaik, that's the reason it wasn't added in ipfs/kubo#9169, but then the tests from ipfs/kubo#9180 are also invalid.

@Jorropo
Copy link
Contributor

Jorropo commented Apr 4, 2023

@hacdias you can test it yourself and send a screenshot i dont think this require ci.
Ive left it out previously because i wanted to known if that is compatible with the Nginx otel tracing.

@hacdias
Copy link
Member Author

hacdias commented Apr 4, 2023

Okay, I had to read more about this. Indeed, my impression is that we should not emit headers, only accept them and pass them down.

When vendors include traceparent and tracestate headers in responses, these values may inadvertently be passed to cross-origin callers. Vendors should ensure that they include only these response headers when responding to systems that participated in the trace. - https://www.w3.org/TR/trace-context/#other-risks

(This is also the only section where responses are mentioned, hence my conclusion.)

But also, "Vendors should ensure that they include only these response headers when responding to systems that participated in the trace." is not very clear to me. If I send a traceparent to this PR with your suggestions, I don't get the header back. I think that we need a mix of your changes and my changes. I am going to try something.

--

Then, I don't understand why the initial issue for this subject was to add traceparent in the response (ipfs/kubo#9168 and ipfs/kubo#9180). CC @iand

@hacdias hacdias force-pushed the issue/bifrost-gateway/68 branch from 53a2de0 to e74d6ea Compare April 5, 2023 09:14
@hacdias hacdias changed the base branch from main to gateway/consistent-metrics April 5, 2023 09:14
@hacdias
Copy link
Member Author

hacdias commented Apr 5, 2023

After reading the specification, I think @Jorropo is right. The specification clearly states that the traceparent header should be sent top-down on the HTTP requests, that is, the gateway should accept it and pass it down to further requests. However, the gateway is not supposed to send it back in the HTTP response.

When vendors include traceparent and tracestate headers in responses, these values may inadvertently be passed to cross-origin callers. Vendors should ensure that they include only these response headers when responding to systems that participated in the trace. - https://www.w3.org/TR/trace-context/#other-risks

Therefore, there isn't much to do. This PR adds OTel tracing to the example to show how it should be added, as an example. It also changes the default HTTP Client for the examples such that we use the otelhttp transport to ensure the tracing headers are passed down. I am going to update the relevant Bifrost Gateway and Kubo PRs. However, those do not depend on this.

I want to add that we already have tracing spans in all handlers and in the IPFSBackend. I made the naming consistent in #261. The data is there. It is now the responsibility of the user of boxo/gateway whether or not they want to enable tracing or not.

@hacdias hacdias force-pushed the issue/bifrost-gateway/68 branch from 941be78 to a0392dc Compare April 5, 2023 09:42
@hacdias hacdias changed the title feat: add Trace Context HTTP headers feat(examples): add Trace Context HTTP acceptance to gateway handler Apr 5, 2023
@hacdias hacdias requested a review from Jorropo April 5, 2023 09:43
@hacdias hacdias force-pushed the issue/bifrost-gateway/68 branch from a0392dc to 6bde37f Compare April 5, 2023 10:00
Copy link
Member

@lidel lidel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to add a test for this, to protect from refactor-driven regressions in the future.

Perhaps examples/gateway/proxy? It provides good setup for this.
Enable a tracing reporter and test with and without traceparent in request, and then make assert if it was also passed to the proxy backend.

@hacdias hacdias force-pushed the gateway/consistent-metrics branch from c76cb01 to 8e43931 Compare April 6, 2023 08:56
@hacdias hacdias force-pushed the issue/bifrost-gateway/68 branch from 6bde37f to 3b5b682 Compare April 6, 2023 10:02
@hacdias hacdias requested a review from lidel April 6, 2023 10:02
@hacdias hacdias changed the base branch from gateway/consistent-metrics to main April 6, 2023 10:02
@hacdias
Copy link
Member Author

hacdias commented Apr 6, 2023

@lidel I added two tests:

  1. If the parent request includes traceparent, the down the line requests include {version}-{trace id}-. Please note that according to the spec the remaining parts can change.
  2. If there is no traceheader, one is created for requests down the line.

An important thing to note is that for all of this to work, we have to use http.NewRequestWithContext to ensure that the context with the tracing information is preserved.

@hacdias hacdias force-pushed the issue/bifrost-gateway/68 branch from 9b5e696 to 9f4acdf Compare April 11, 2023 09:21
@hacdias hacdias marked this pull request as draft April 11, 2023 10:48
@hacdias hacdias force-pushed the issue/bifrost-gateway/68 branch from b5b6989 to 0ca7e90 Compare April 11, 2023 11:05
@hacdias hacdias changed the title feat(examples): add Trace Context HTTP acceptance to gateway handler feat: add tracing sub-package and gateway examples Apr 11, 2023
@hacdias hacdias force-pushed the issue/bifrost-gateway/68 branch 2 times, most recently from c2ae3fc to 2127127 Compare April 11, 2023 11:29
@hacdias hacdias force-pushed the issue/bifrost-gateway/68 branch from 2127127 to a56c841 Compare April 11, 2023 11:34
@hacdias hacdias marked this pull request as ready for review April 11, 2023 11:47
@hacdias
Copy link
Member Author

hacdias commented Apr 11, 2023

I updated the main comment #256 (comment) with the current information.

docs/tracing.md Outdated Show resolved Hide resolved
docs/tracing.md Outdated Show resolved Hide resolved
@hacdias hacdias requested a review from guseggert April 11, 2023 12:48
@hacdias hacdias force-pushed the issue/bifrost-gateway/68 branch from bf988f2 to a1d9f98 Compare April 11, 2023 12:53
Copy link
Member

@lidel lidel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I took this for a spin with Jaeger UI in both Kubo (ipfs/kubo#9811) and bifrost-gateway (ipfs-inactive/bifrost-gateway#74) and as far i could tell works as expected.

Kubo bifrost-gw with GRAPH_BACKEND=true
Screenshot 2023-04-12 at 00-52-42 Jaeger UI Screenshot 2023-04-12 at 01-02-57 Jaeger UI

Merging so we can leverage traceparent in bifrost-gateway for Rhea, but also can start leveraging it in places like IPNI HTTP client at https://github.com/ipfs/boxo/tree/main/routing/http/client

docs/tracing.md Outdated Show resolved Hide resolved
@lidel lidel merged commit 5d6c73c into main Apr 11, 2023
@lidel lidel deleted the issue/bifrost-gateway/68 branch April 11, 2023 23:29
@lidel lidel changed the title feat: add tracing sub-package and gateway examples feat(gateway): trace context header support Apr 11, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
No open projects
Archived in project
Development

Successfully merging this pull request may close these issues.

5 participants