-
Notifications
You must be signed in to change notification settings - Fork 65
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Trace query API #252
Trace query API #252
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looking good, thanks @esnible - just a few suggestions/questions from my side
I think would be nice to also add a traces_test.go
?
|
||
// The <base href=> tag generated by Jaeger to tell the UI where to fetch static | ||
// assets and query /api | ||
const expectedBaseTag = `<base href="/" data-inject-target="BASE_URL"/>` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is Jaeger used by Obs API in a specific version? or would we expect the <base href=..>
tag to be changed at some point?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Jaeger API is versioned, but the UI is not. This handler is for the UI.
I want to bring @pavolloffay into this discussion. Jaeger does something weird, which was introduced in jaegertracing/jaeger-ui#42 . The introduced <base>
to support reverse proxies. I don't like their solution and I am doing something of the opposite.
To support some reverse proxy, Jaeger introduced a mechanism where the path known to the proxy is told to Jaeger. So if I have two tenants, tenant1
and tenant2
, I can start a Jaeger instance that is told the base URL is /api/traces/v1/tenant1
and another that is told the base URL is /api/traces/v1/tenant2
. The reverse proxy is responsible for remembering the host (or IP) of each tenant and forwarding properly.
What Observatorium does with this PR is offer the endpoint {host}/api/traces/v1/{tenant}/{suffix} and proxy adding header x-tenant: ...
to {jaeger}/{suffix}. All queries are sent to the same --traces.read.endpoint={host:port}
The approach I am taking will be great when Jaeger has a multi-tenant back-end. This is in-plan, see jaegertracing/jaeger#3427 . (We might need a temporary work-around to get to that approach, and I plan to start on that work next).
The other approach is that when we add multitenancy to Jaeger we have Jaeger itself alter the BASE_URL instead of doing it in jaegerUIResponseModifier()
. If that approach is taken we still don't need to version the <base>
tag, we merely delete jaegerUIResponseModifier() from the Observatorium/API code.
@pavolloffay , Does the Jaeger community want to pass the tenant to the UI as a segment in the path or in a header?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so we don't have control if this tag gets updated in Jaeger UI right? I think then the response won't get properly modified?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@pavolloffay , Does the Jaeger community want to pass the tenant to the UI as a segment in the path or in a header?
There was no decision on this so far. But I think it will be a header to avoid changing the current APIs.
What Observatorium does with this PR is offer the endpoint {host}/api/traces/v1/{tenant}/{suffix} and proxy adding header x-tenant: ... to {jaeger}/{suffix}. All queries are sent to the same --traces.read.endpoint={host:port}
As you mentioned this is the final goal but it does not work with the Jaeger at the moment as there is no (soft) multitenancy support in Jaeger. As a temporary solution the API service could use tenant name/id as part (prefix) of the endpoint for the read endpoint.
@@ -18,7 +18,7 @@ type contextKey string | |||
const ( | |||
prefixKey contextKey = "prefix" | |||
|
|||
prefixHeader string = "X-Forwarded-Prefix" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any reason to export this in here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What I'm trying to do is have Observatorium replace <base href="/">
with <base href="/api/traces/v1/{tenant}/">
for the reasons I explain at #252 (comment)
In the place I am doing this inside of httputil.ReverseProxy.ModifyResponse, and at this point the HTTP request Context doesn't have the prefix that it used. I could try to reconstruct the value applied by MiddlewareSetPrefixHeader()
, but instead of doing that I just lookup that value Observatorium supplied for X-Forwarded-Prefix
which happens to be exactly the value /api/traces/v1/{tenant}/ that I need to make the UI work.
I don't need to get the value from here, I could just use the literal "X-Forwarded-Prefix"
myself. Or we can talk about the whole approach. What is your preference?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I might have misunderstood. I only meant if we need have that constant accessible from other packages (i.e. the constant name is capitalized), if we don't use the constant outside of here.
main.go
Outdated
), | ||
) | ||
|
||
// Static UI (.js and .css), not protected by RBAC -- any user can see UI |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm wondering if there's any reason we do not want to enforce authorization specifically on UI?
Also I'm wondering if following the convention we have with other signals would make sense here as well, i.e. having everything organize under one handler instead of splitting it into API, UI etc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I go to http://localhost:8080/api/traces/v1/test-oidc/search/ I am redirected to http://localhost:5556/dex/auth/local/login?back=&state=vcqh2yns46maemaiyfh2k33xm where I am presented with a Dex login screen. So we do enforce authentication through the r.Use(authentication.WithAccessToken())
earlier in the function.
I felt like the RBAC permission for traces for for reading and writing traces, not UI. By not doing RBAC the UI comes up the UI tries to fetch trace "service names", and the user sees the Unauthorized message within a nice-looking Jaeger UI. Do you want me to add RBAC here to block the UI from coming up at all?
I originally used a single handler until I chose to use httputil.ReverseProxy.ModifyResponse
. I didn't want that logic to run over the /static resources of the Jaeger UI, only the generated HTML. I split into two handlers to make that logic more clear.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the explanation Ed. I get splitting the handlers, and yes, we do authentication, but I'm still wondering about the authorization. Can you elaborate on what you mean by the user will see Jaeger UI with unauthorized message? Where will this authorization check happen? I'm still kind of leaning more towards covering UI with authorization fully. I can think of theoretical example, where a tenant has e.g. only metrics allowed, but some other tenants might have both metrics and traces allowed. In such cases, what is the value of having that UI accessible to unauthorized tenant? For me it would be cleaner to cut off the tenant completely and prevent any kind of potential 'snooping', if they don't have authorization in the first place. What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
By not doing RBAC for the UI resources we allow the UI to load and make a failing API query for traces.
If I remove RBAC permission for traces the UI for http://localhost:8080/api/traces/v1/test-oidc/search/ looks like this:
I thought it was cute to allow it to work this way. I am happy to secure the UI resources behind RBAC permission to read traces if that is your request.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm just trying to think what's the most reasonable solution here, but I'm open to other opinions. I'm currently leaning more towards authorizing on all endpoints, since if the tenants is not authorized to read / write trace signals, there's no reason for the tenant to even be accessing the UI.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it makes sense what @matej-g proposes - also it may avoid unnecessary loading of the UI + trial to fetch traces. So maybe is better to consistently to add authorization on all endpoints
NewHandler(labels prometheus.Labels, handler http.Handler) http.HandlerFunc | ||
} | ||
|
||
type nopInstrumentHandler struct{} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it is "mandatory" to have this type? I would avoid it if it is optional.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@pavolloffay I didn't write this code. Exactly the same code is in the log handler api/logs/v1/http.go and the metrics handler api/metrics/v1/http.go. I just copied it from there so that this code would be as close as possible to the rest of Observatorium.
@matej-g can you address this? Keep in or remove?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well how would the code look like if it was removed? I guess it's simply the default no-op option for instrumentation handler.
I'd say we could, again, improve on this, as all 3 signals use virtually the same code, but that's a question for a refactoring PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
main.go
Outdated
r.Use(authentication.WithTenantHeader(cfg.traces.tenantHeader, tenantIDs)) | ||
|
||
// V2 query API, protected by RBAC | ||
r.Mount("/api/traces/v1/{tenant}/api", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After looking closely, I'm wondering why we're departing here from how the paths are mounted for metrics and traces. If I'm looking at the code correctly, we're mounting these paths, but we're handling everything on them via /*
. Why aren't we handling only specific paths?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for this comment. I have restricted things.
Now the API handler only proxies GET for /api/traces, /api/dependencies, and /api/services. The UI and static asset handlers still take /* but only proxy GET.
I am a bit confused about the code that I needed to do this. I am doing stripTenantPrefix("/api/traces/v1"()
then
r.Get("/traces*", c.instrument.NewHandler(
prometheus.Labels{"group": "tracesv1api", "handler": "api"},
proxyRead))
I do not understand why it works when I use r.Get("/traces*"...)
. I had expected to be doing `r.Get("/api/traces*"...)
Signed-off-by: Ed Snible <[email protected]>
Signed-off-by: Ed Snible <[email protected]>
Signed-off-by: Ed Snible <[email protected]>
Signed-off-by: Ed Snible <[email protected]>
Signed-off-by: Ed Snible <[email protected]>
Signed-off-by: Ed Snible <[email protected]>
Signed-off-by: Ed Snible <[email protected]>
Signed-off-by: Ed Snible <[email protected]>
Signed-off-by: Ed Snible <[email protected]>
Signed-off-by: Ed Snible <[email protected]>
Signed-off-by: Ed Snible <[email protected]>
Let me give testing instructions. There are two ways to test this. One way is to run The fun, interactive way that I test is by running three programs in three windows:
(The config files needed for the above can be copied out of a |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is looking good to me now, stellar job @esnible! ⭐
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This PR adds proxying of the Jaeger V2 HTTP query API and the Jaeger UI assets.
With this change, the Jaeger UI is available at http://localhost:8080/api/traces/v1/test-oidc/search/
Note that the first time logging in, it redirects away from traces. Not sure if that is correct or not.
Note that there are currently no multi-tenant backends for Jaeger, so the UI currently sees all traces. The work to divide the tenants will be done in Jaeger itself or another front-end, probably not in observatorium/api.