Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Identify uninstrumented services #659

Merged
merged 1 commit into from
Nov 11, 2020

Conversation

rubenvp8510
Copy link
Collaborator

@rubenvp8510 rubenvp8510 commented Nov 11, 2020

Signed-off-by: Ruben Vargas [email protected]

Which problem is this PR solving?

Short description of the changes

Implement proposed visualization of external/uninstrumented spans mentioned on #594, this is work in progress I'm not sure if the logic to detect such cases is right. This PR identifies an uninstrumented service checking spans that meets the following criteria:

  • Leaf span
  • span.kind == client
  • peer.service tag present

Is this the correct way? or may be I'm missing something? Once we agreed on the logic I can add tests.

Copy link
Member

@yurishkuro yurishkuro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm. Could you add a screenshot?

@@ -117,4 +117,7 @@ export function findServerChildSpan(spans: Span[]) {
return null;
}

export const isKindClient = (span: Span) =>
span.tags.find(({ key, value }) => key === 'span.kind' && value === 'client');
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit:

Suggested change
span.tags.find(({ key, value }) => key === 'span.kind' && value === 'client');
span.tags.some(({ key, value }) => key === 'span.kind' && value === 'client');

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

@codecov
Copy link

codecov bot commented Nov 11, 2020

Codecov Report

Merging #659 (726b86e) into master (b07aab8) will increase coverage by 0.07%.
The diff coverage is 100.00%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #659      +/-   ##
==========================================
+ Coverage   94.20%   94.28%   +0.07%     
==========================================
  Files         228      228              
  Lines        5923     5930       +7     
  Branches     1489     1492       +3     
==========================================
+ Hits         5580     5591      +11     
+ Misses        304      302       -2     
+ Partials       39       37       -2     
Impacted Files Coverage Δ
...nents/TracePage/TraceTimelineViewer/SpanBarRow.tsx 79.31% <100.00%> (+15.02%) ⬆️
...ePage/TraceTimelineViewer/VirtualizedTraceView.tsx 95.23% <100.00%> (+0.15%) ⬆️
...components/TracePage/TraceTimelineViewer/utils.tsx 100.00% <100.00%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update b07aab8...726b86e. Read the comment docs.

@@ -117,4 +117,7 @@ export function findServerChildSpan(spans: Span[]) {
return null;
}

export const isKindClient = (span: Span) =>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
export const isKindClient = (span: Span) =>
export function isKindClient(span: Span): Boolean =>

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this should be

export const isKindClient = (span: Span):Boolean =>
  span.tags.some(({ key, value }) => key === 'span.kind' && value === 'client');

or

export function isKindClient(span: Span):Boolean {
  return span.tags.some(({ key, value }) => key === 'span.kind' && value === 'client');
}

it's a matter of taste, I would prefer the first option.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's a matter of taste

I know :-) Joe and Everett also use const a lot - I don't get it. When defining a function why not use the function syntax? Lambdas are harder to read. I guess you're saving on typing return.

Anyway, I just mostly wanted the return type to be declared.

@rubenvp8510
Copy link
Collaborator Author

Screenshot from 2020-11-10 20-10-30

I took this screen shot form hotrod example, but it looks a little bit weird, because mysql service calling uninstrumented mysql service?

@yurishkuro
Copy link
Member

it looks a little bit weird, because mysql service calling uninstrumented mysql service?

yeah, the hotrod fakes the leaves. But it's ok, thanks for the s/s.

@objectiser
Copy link
Contributor

@rubenvp8510 Difficult to tell as the client/server spans are collapsed, but if that wasn't the case, would the 'fake' server span be visually represented in a different way (to indicate it is added)? Also, is the duration of that span the same as the client? If visually clear that is not real, then maybe that won't matter, otherwise might cause confusion (i.e. zero network latency).

@rubenvp8510
Copy link
Collaborator Author

rubenvp8510 commented Nov 11, 2020

@rubenvp8510 Difficult to tell as the client/server spans are collapsed, but if that wasn't the case, would the 'fake' server span be visually represented in a different way (to indicate it is added)? Also, is the duration of that span the same as the client? If visually clear that is not real, then maybe that won't matter, otherwise might cause confusion (i.e. zero network latency).

@objectiser Maybe there is confusion here, but the client/server spans are not collapsed, there is no server span in this case "service1 -> service2" is just an indicator that the span doing a request to service (in this case service2) out of the scope of the current jaeger instrumentation. If you see the screen-shot there is no left arrow on MySQL service span that indicates that span has a child.

May be we need a better way to differentiate this case vs collapsed client/server spans? We can put an icon indicator in some place.

@objectiser
Copy link
Contributor

@rubenvp8510 After seeing @yurishkuro 's suggest I realised that was probably the case - but I think my assumption that it was a collapsed will probably be the same as others, and they will report a bug that they are not able to expand it 😄 - so think it might be good to have some differentiation from the normal case.

@yurishkuro
Copy link
Member

I am not sure many people even know about the original "->" display mode. I am not too concerned that non-collapsible leaf will be confusing.

@rubenvp8510
Copy link
Collaborator Author

We can try this approach and if we see there is a confusion we can add more emphasis (with an icon or different arrow) on the fact that this is the case of an external service.

@yurishkuro
Copy link
Member

yes, let's complete this PR

@rubenvp8510 rubenvp8510 changed the title [WIP] Identify uninstrumented services Identify uninstrumented services Nov 11, 2020
Copy link
Member

@yurishkuro yurishkuro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🎉

@yurishkuro yurishkuro merged commit df17e8d into jaegertracing:master Nov 11, 2020
@CodeBlanch
Copy link

@yurishkuro @rubenvp8510 Thanks for this! Should we expand the rule to cover span.kind == producer? In OpenTelemetry .NET we make sure peer.service is set on client & producer spans:

https://github.com/open-telemetry/opentelemetry-dotnet/blob/1a308a6bd21a86255c43bd09ab8aaf00bf691aa5/src/OpenTelemetry.Exporter.Jaeger/Implementation/JaegerActivityExtensions.cs#L56

@yurishkuro
Copy link
Member

@CodeBlanch yes, good call, both "client" and "producer" can be treated the same.

@yurishkuro
Copy link
Member

btw, @CodeBlanch , could you test this change with the types of instrumentation you have? Ruben was only able to test with HotROD, which does not really have non-instrumented leaves in the graph (it fakes them as instrumented).

@CodeBlanch
Copy link

@yurishkuro Yes, happy to. How do I get the latest bits? Normally I use the "all-in-one" image to test Jaeger.

@yurishkuro
Copy link
Member

I think it can only be done by building all-in-one from source. Let me do this instead: jaegertracing/jaeger#2626

@yurishkuro
Copy link
Member

@CodeBlanch the latest all-in-one contains this fix (make sure to remove whatever goes as latest in your local Docker first).

@CodeBlanch
Copy link

Thanks @yurishkuro! Here's how it looks...

  • Previous version sending API-compliant data (1 process/batch):
    image

  • Previous version sending non-API-compliant data (1 process/batch per service):
    image

  • New version sending API-compliant data (1 process/batch):
    image

Note: Test service calls itself for some of the cases.

Question: Could we color the bars to match the peer.service? I personally think the middle one is still the best presentation, but if we can color the bars I think the new one is good. Definately much better than the first for uninstrumented services.

Idea: It shows "OpenTelemetry Exporter" a lot. What if instead of showing the color stripe, process service name, and the arrow on all the children we just had the color stripe and the arrow? Might save some real estate and convey the same info? Or perhaps just the arrow? 🤷

/cc @ndrwrbgs @cijothomas

@ndrwrbgs
Copy link

I missed that the change didn't do the bars, oops! My customers say are the most useful part for their consumption. Is this something we can do?

@ndrwrbgs
Copy link

@CodeBlanch tangent, but its resolution may impact the definition of "correct" for solutions -- what's going on with this span?
image

If the external dependency isn't traced, how can something (a call back to us) nest inside the call to it?

@CodeBlanch
Copy link

@ndrwrbgs That's a quirk of the test I'm using. I have a service that does a few things: 1) Calls google, 2) Calls itself with invalid routes, 3) Calls itself and returns responses, 4) Calls itself and then calls Google, returning the response. Any call to itself is reported to Jaeger, any call to Google is not reported.

Code is here: https://github.com/open-telemetry/opentelemetry-dotnet/blob/master/examples/AspNet/Controllers/WeatherForecastController.cs

vvvprabhakar pushed a commit to vvvprabhakar/jaeger-ui that referenced this pull request Jul 4, 2021
Signed-off-by: Ruben Vargas <[email protected]>
Signed-off-by: vvvprabhakar <[email protected]>
vvvprabhakar pushed a commit to vvvprabhakar/jaeger-ui that referenced this pull request Jul 4, 2021
Signed-off-by: Ruben Vargas <[email protected]>
Signed-off-by: vvvprabhakar <[email protected]>
vvvprabhakar pushed a commit to vvvprabhakar/jaeger-ui that referenced this pull request Jul 5, 2021
Signed-off-by: Ruben Vargas <[email protected]>
Signed-off-by: vvvprabhakar <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Identify dependencies in the UI when remote is not instrumented
5 participants