-
Notifications
You must be signed in to change notification settings - Fork 4.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add request metrics to System.Net.Http #85447
Conversation
Note regarding the This serves as a reminder for when your PR is modifying a ref *.cs file and adding/modifying public APIs, please make sure the API implementation in the src *.cs file is documented with triple slash comments, so the PR reviewers can sign off that change. |
Tagging subscribers to this area: @dotnet/ncl Issue DetailsAddresses #84978. Work in progress. Adds:
|
src/libraries/System.Net.Http/src/System/Net/Http/HttpMetrics.cs
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I also wonder if @MihaZupan has any thoughts?
HttpTelemetry.Log.RequestStart(request); | ||
metrics.RequestStart(request); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With this, we will face all the overhead of both HttpTelemetry
and Metrics
if any of them is enabled.
I would do a second check for HttpTelemetry.Log.IsEnabled()
, and fine-grained checks of <instrument>.Enabled
within HttpMetrics
methods.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we make sure these checks exist in HttpTelemetry
as well?
That is, change places like RequestStart/Stop
in HttpTelemetry
to
Interlocked.Increment(ref _startedRequests);
if (IsEnabled(...))
{
RequestStart(
request.RequestUri.Scheme,
request.RequestUri.IdnHost...);
}
The counter fields should continue to be updated, but we don't have to read the Uri properties/serialize the event if the EventSource isn't enabled.
|
||
if (response is not null) | ||
{ | ||
tags.Add("status-code", (int)response.StatusCode); // Boxing? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TagList
looks like a boxing factory. I wounder if this concern was considered during the design of #56936. If it's concerning enough we may consider caching the object
s, but that would come with its' own overhead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One option is having a cache of boxed commonly used status codes.
if (response is not null) | ||
{ | ||
tags.Add("status-code", (int)response.StatusCode); // Boxing? | ||
tags.Add("protocol", $"HTTP/{response.Version}"); // Hacky |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hacky because it relies on formatting implementation? Shall we do something like this to avoid the formatting allocation?
static string GetProtocolName6(Version httpVersion) => (httpVersion.Major, httpVersion.Minor) switch
{
(1, 1) => "HTTP/1.1",
(2, 0) => "HTTP/2.0",
(3, 0) => "HTTP/3.0",
_ => "unknown"
};
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hacky because it allocates, some values aren't right*, and it's simple to improve such as your snippet.
*HTTP/2
would be better than HTTP/2.0
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should avoid allocation. We wouldn't want our services to pay the allocation cost for something which our library will anyways drop and not emit. Protocol is one of them
#pragma warning disable SA1129 // Do not use default value type constructor | ||
var tags = new TagList(); | ||
#pragma warning restore SA1129 // Do not use default value type constructor | ||
InitializeCommonTags(ref tags, request); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because of redirects you may log different tags here than at RequestStart
, we need to snapshot the initial values.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Properties on HttpRequestMessage
are updated with a response redirect?
In that case, what request values should be used?
Options:
- Don't use the initial request values anywhere. That means we don't include tags on
current-requests
and have a dimensionless counter. Therequest-duration
counter uses theHttpRequestMessage
state based on whenSendAsync
returns. - Use initial request values with
current-requests
andrequest-duration
. Ignore request message state afterSendAsync
returns. Maybe have an extra counter to check if the URL changes between sending the request and receiving the response to record redirect counts. - A mix. Initial values on
current-requests
and potentially updated values onrequest-duration
.
No perfect choice. I like option 2, but curious about what client networking experts think and what current diagnostics does.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Properties on
HttpRequestMessage
are updated with a response redirect?
Yes, the request is being reused to save allocations, but I think this is a fundamental problem, regardless of our implementation.
what current diagnostics does.
I don't think this concern exists in case of existing counters. The only ones where we have something similar to tags are the <protocol>-connections-current-total
and the <protocol>-requests-queue-duration
counters. In both cases the protocol version is clear at the moment we increment the counter.
Use initial request values with
current-requests
andrequest-duration
. Ignore request message state afterSendAsync
returns.
Wouldn't this be unexpected to users? For example by using HttpVersionPolicy.RequestVersionOrLower/RequestVersionOrHigher
and negotiating lower/higher versions, a user would not log the actual HTTP versions being used.
No perfect choice.
Isn't it also an option to record multiple requests on redirects? (Even though it means that the implementation would be bound to SocketsHttpHandler
and other handlers would need to implement the metrics separately.)
I have opened open-telemetry/opentelemetry-specification#3462, hoping that OTel folks can help clarifying this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this concern exists in case of existing counters.
Yeah existing counters don't have request-specific info on them, but I was thinking about what is in logs and the activity.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with Anton that relying on the request message's version can be quite misleading.
YARP for example uses a default version of 2.0 with a RequestVersionOrLower
policy.
If we only use the request version, the user would only see 2.0 counters even if they're actually only talking 1.1 to the server.
The unfortunate thing is that we'll only know what version is actually used when we try sending the request to the wire internally, or from a higher-level perspective, once you get back the response message object.
It may make sense to consider moving this logic further down into SocketsHttpHandler
, especially if you will also want to track when the response content has been read completely.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because the version isn't known when a request starts, it's not included as a tag on current-requests
. And request-duration
uses HttpResponseMessage.Version
for this information. This PR already has that behavior.
{ | ||
// TODO: Should the Meter and HttpMetrics be static and shared by default? | ||
get => _meter ??= new Meter("System.Net.Http"); | ||
set |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@tarekgh @JamesNK - I mentioned in an earlier email but it may have gotten missed, what do you guys think about consuming an IMeterFactory instead of a Meter? Then the Meter can be produced:
_meter = _factory == null ? new Meter("System.Net.Http") : factory.CreateMeter("System.Net.Http");
This eliminates the need for the caller to understand what meter name is expected.
This implies that IMeterFactory gets defined in System.Diagnostics.DiagnosticSource.dll which I don't think would be an issue but I might be overlooking something.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was expecting IMeterFactory to be in Microsoft.Extensions.Metrics and so not available here. That's part of why I chose this approach.
The other reason for exposing the meter is it makes consuming the metrics for just one client quite simple:
var invoker = new HttpMessageInvoker();
var currentRequestsRecorder = new InstrumentRecorder<double>(invoker.Meter, "request-duration");
But that only works if each client/invoker has its own Meter and counters copy, which might not be desirable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was expecting IMeterFactory to be in Microsoft.Extensions.Metrics and so not available here
Initially I imagined it there too. Do you feel like it is important for IMeterFactory to not be in System.Diagostics.Metrics?
The other reason for exposing the meter is it makes consuming the metrics for just one client quite simple:
If users were injecting a factory instead the pattern wouldn't be much different:
var IMeterFactory factory = ...
var invoker = new HttpMessageInvoker();
invoker.Factory = factory; // separately, maybe we should be using a constructor overload rather than a property setter?
var currentRequestsRecorder = new InstrumentRecorder<double>(factory, "System.Net.Http", "request-duration");
If the app developer wasn't injecting the factory and using the default Meter then I wouldn't expect them to write tests for that as it would effectively just be testing runtime behavior and not their own code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It can also be a Func<Meter>
on HttpMessageInvoker
, so then we don't have to worry about the dependencies, and IMeterFactory
(if an interface is needed at all) can live in Microsoft.Extensions.*
where the DI-friendly things go usually.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Logically IMeterFactory
naturally belongs to the Microsoft.extensions more because we'll have another extension method to register this interface to the DI. But technically nothing can stop us having it in DianosticSource library. I am thinking in the future when exposing more stuff like MeterBuilder or anything else, at that time it will look strange having IMeterFactory in one place and other related interfaces in other place.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It can also be a Func<Meter> on HttpMessageInvoker
Func<string,Meter>
is the signature we'd need for a similar effect as IMeterFactory, but thats no obstacle. My other motivation for IMeterFactory instead of something like Func is so that .NET types are following the same guidance we are planning to give to everyone else. Its not written up in our official docs yet, but the basic pattern is:
- Define a parameter on your constructor of type IMeterFactory
- Use the factory to create the Meter.
While other patterns are certainly possible (such as passing in a Meter or passing in a Func<string,Meter>), many of them aren't as easy to use with DI. I don't expect folks will be creating HttpMessageInvokers from a DI container so its fair to argue it doesn't matter if we pick a pattern that is DI-friendly. I think the counter-argument is that if the DI-friendly pattern becomes a convention then devs will anticipate that pattern regardless whether the type is created from a DI-container and APIs that don't follow the pattern feel unexpected and awkward. For example most people expect that if your type logs via ILogger then your constructor has an ILogger constructor parameter. Writing a class that has a method SetLogger(ILogger)
or a constructor parameter Func<ILogger>
functionally would work too, but it looks unusual because it doesn't follow the convention.
Logically IMeterFactory naturally belongs to the Microsoft.extensions
Tarek and I chatted earlier. I agree that it felt intuitive for me as well initially for IMeterFactory to be in the Microsoft.Extensions namespace, but I don't think its the only reasonable association. The general guidance with DI is that the types being created from the DI container should not have any tight coupling to the DI container (for example don't have your constructor use IServiceProvider as input argument). This allows using those types without DI because a caller could always manually instantiate the same arguments. This perspective suggests that any of the types that show up in the constructor parameters, such as IMeterFactory, should not be treated as tightly coupled to DI. If the type isn't coupled to DI the fact that it would land in Microsoft.Extensions when all the other Meter types are in System.Diagnostics didn't feel so intuitive any more.
Ultimately I assume its rare for .NET developers to manually create an HttpMessageInvoker so if folks think its not helpful to follow the IMeterFactory pattern on this one I'll be fine with it. If the pattern is also going to show up on HttpClient I might push a little harder due to more exposure, but still I expect the main usage for injecting a Meter will come from HttpClientFactory internally rather than .NET devs manually injecting one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ultimately I assume its rare for .NET developers to manually create an HttpMessageInvoker [...]. If the pattern is also going to show up on HttpClient [...]
HttpMessageInvoker
is the base class for HttpClient
so everything that goes into HttpMessageInvoker
goes into HttpClient
. I'm treating them as synonyms in this discussion.
Define a parameter on your constructor of type IMeterFactory
The pattern with HttpClient
is to consume configuration/dependencies through properties; the only parameter being passed to the constructor is the HttpMessageHandler
. It would look very weird if new ctr. overloads would appear taking an IMeterFactory
, especially if most users don't need it. (In general, System.Net.Http
wasn't designed to be DI-friendly, so making a occasional intrusive decisions following a different design approach would create too much disharmony IMO.)
I expect the main usage for injecting a Meter will come from HttpClientFactory internally rather than .NET devs manually injecting one
In that case, wouldn't it be straightforward to make HttpClientFactory
's responsibility to wrap IMeterFactory
into a Func<string, Meter>
when initializing HttpClient
?
If the type isn't coupled to DI the fact that it would land in Microsoft.Extensions when all the other Meter types are in System.Diagnostics didn't feel so intuitive any more.
Even if it's not tightly coupled to DI, the question is if there are any other reasons to make it land in System.Diagnostics
other than making DI use-cases easier? Also: are there any other System.*
libraries that may end up consuming IMeterFactory
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ultimately I assume its rare for .NET developers to manually create an HttpMessageInvoker so if folks think its not helpful to follow the IMeterFactory pattern on this one I'll be fine with it. If the pattern is also going to show up on HttpClient I might push a little harder due to more exposure, but still I expect the main usage for injecting a Meter will come from HttpClientFactory internally rather than .NET devs manually injecting one.
As @antonfirsov mentioned, HttpMessageInvoker
= HttpClient
. The client inherits from the invoker. The design needs to work for both.
Manually creating HttpClient
is very common. I expect customizing the meter on a client will be much rarer. The primary case for using it will be Microsoft.Extensions.Http
.
The way I imagined this hanging together is Microsoft.Extensions.Http
has a dependency on Microsoft.Extensions.Metrics
. IMeterFactory
is injected into HttpClientFactory
and sets a meter on the created HttpClient from the factory.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
HttpMessageInvoker is the base class for HttpClient
Ah! I didn't catch that relationship. Thanks for the important info.
The pattern with HttpClient is to consume configuration/dependencies through properties
OK. If HttpClient already has that pattern then I guess we have clashing patterns, but it probably makes sense for HttpClient's pattern to take priority.
In that case, wouldn't it be straightforward to make HttpClientFactory's responsibility to wrap IMeterFactory into a Func<string, Meter> when initializing HttpClient
I agree it doesn't take a lot of code to do that wrapping, but assume it was already the plan that IMeterFactory was in System.Diagnostics.DiagnosticSource.dll - would there be any reason to wrap it in the first place?
are there any other System.* libraries that may end up consuming IMeterFactory?
I don't have any concrete plan to point to, but other forms of IO like files or pipes seems like potential candidates for metrics in the System.* namespaces. Its also possible we'd want System.Runtime metrics reported via a factory-produced Meter.
the question is if there are any other reasons to make it land in System.Diagnostics other than making DI use-cases easier
I don't think we need more reasons to meet the bar, but these are the pros/cons as I see it:
- [Pro]: More libraries will have IMeterFactory accessible to them if they aren't able to take a dependency on Microsoft.Extensions. Even in libraries where there isn't a strict rule against such dependencies, library authors are still likely to prefer fewer dependencies.
- [Pro]: Assuming we also have IMeterFactory in the System.Diagnostics.Metrics namespace then consumers only need one namespace in their code rather than two.
- [Con?]: There is the idea that any type used to make the DI scenario easier (IMeterFactory) belongs in M.E.something.dll. However aside from the idea feeling natural, I haven't seen any concrete benefit to organizing it that way in this case.
When I look at the balance scales I see it as small upside and effectively no downside.
} | ||
if (request.HasTags) | ||
{ | ||
foreach (var customTag in request.MetricsTags) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similar to HttpRequest properties not being stable, these tags might also not be stable across the duration of the request. Your EnrichmentHandler test is an example that deliberately changes them.
I think one scenario we need to sort out is the case where the caller to HttpClient is some 3rd party library that isn't specifying tags but the app developer does want to get all their traffic classified. Currently R9 is handling this case by having the app developer create a mapping from outbound URIs -> other metadata, then they register a handler in the HttpClientFactory to execute that mapping and generate the tagged telemetry. If R9 is going to switch over to use these built-in metrics rather than continuing to add their own then they will need some way to dynamically inject those tags. If the injection mechanism will continue to be a handler (similar to your EnrichmentHandler), then perhaps we need to move the metric collection point back to the DiagnosticsHandler where it is now so that other handlers have a chance to update those tags before they are captured. (I know I suggested doing it at the front earlier, I'm not sure my idea was a good one now that I consider this scenario)
@dpk83 - you probably want to take a look at this if you haven't already
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another question, will the same metrics support be added for net framework or is it limited to net core only?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand the concern here. R9 has a DelegatingHandler that is responsible for recording to the metric. The handler also executes enrichers to build up a collection of custom tags.
The difference in this prototype is recording to the metric and gathering custom tags happens in different places. The handler writes them onto the HttpRequestMessage
, and they are read off the message later when needed.
My main concern with this approach is making sure there is the opportunity for an enrichment handler to add tags after SendAsync
has returned. Right now the metrics are recorded in the HttpMessageInvoker
. That means the code after SendAsync
in each delegating handler runs before the counter is recorded. Metrics at this layer works, but it doesn't record separate HTTP requests if a Polly handler is added for retries. The solution is to move metrics to DiaganosticsHandler
. That handler is always inside a retry handler, so each HTTP request is recorded, but we lose the ability to do enrichment after SendAsync
returns. DiaganosticsHandler
is added after any user-injected handlers (such as enrichment), so metrics are recorded before code after SendAsync
runs.
A solution that supports both goals could be to wait to record the HTTP duration counter value when an HTTP request has completed both of these actions:
- The HttpRequestMessage exits the HTTP pipeline
- The HttpResponseMessage finishes reading the response
It would be slightly odd that DiaganosticsHandler
"starts" metrics for an HTTP request (e.g. starts the request duration timer, increments current request count), and exiting HttpMessageInvoker
"finishes" it (e.g. records request duration with enriched tags and decreases current request count).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for confusion, upon re-reading the code I see that I misunderstood it earlier and my comment was based on that misunderstanding. I was incorrectly worried that custom tags would cause the current-requests increment operation to be recorded with different tags than the decrement which would unbalance it. Now I see that custom tags are only used for the duration measurement and my concern was unfounded.
While I do still care about the scenario I described above with 3rd party libraries, I think this design would be able to handle it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Few points:
-
Quite often caller doesn't have access to the HttpRequestMessage, third party library is a very good example of this scenario. In complex organizations where there are hundreds and thousands of services a common libraries are used quite a lot and this option just doesn't work. We devised IDownstreamDependencyMetadata for that very purpose.
-
We are now supporting both DelegatingHandler based auto collection of metrics as well as DiagnosticsSource based approach because DelegatingHandler wasn't sufficient for a lot of cases.
-
In order to consume this metrics directly emitted from .NET and provide the transformation and the dependency metadata looks like we will still continue to use DelegatingHandler and DiagnosticsSource so I don't see how we can take consume and build onto the metrics this component is going to emit without incurring additional perf overhead.
-
Is this support is going to be ported back for .NET Framework?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The solution is to move metrics to
DiaganosticsHandler
Note that DiagnosticsHandler
is now strongly coupled with distributed tracing, and disabling distributed tracing either by by setting setting DOTNET_SYSTEM_NET_HTTP_ENABLEACTIVITYPROPAGATION=0
or by setting socketsHttpHandler.ActivityHeadersPropagator = null
would completely disable it.
Wouldn't it be better to implement metrics in a separate MetricsHandler
that would live either before or after DiagnosticsHandler
in the chain?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't know about that switch. Good to know.
If a separate handler is better, then do it. The difference we're considering is the same: counters at the HttpMessageInvoker layer vs counters at an inner handler.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see how we can take consume and build onto the metrics this component is going to emit without incurring additional perf overhead.
What perf overhead are you worried about @dpk83?
While I don't think R9 strictly has to use this counter for correct behavior, from the consumer's perspective I'd really hope to describe the R9 stuff to 3P devs as an enhancement and not another orthogonal telemetry mechanism. I am leaving out here 1P back-compat where we probably want a different schema. For example I don't think it would be good if 3P devs see duplicate measurements. That would mean that R9 either needs to add tags to this instrument or suppress the instrument and replicate it, including the new enrichment tags. Making sure an alternate implementation stays in-sync feels possible, but error-prone and more work intensive.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another question, will the same metrics support be added for net framework or is it limited to net core only?
I think this got answered during our meeting earlier, but reproducing it here - the new metrics would only be present in .NET 8 onwards. This code doesn't flow downlevel either to framework or to older versions of .NET Core.
private Meter? _meter; | ||
private HttpMetrics? _metrics; | ||
|
||
#pragma warning disable CS3003 // Type is not CLS-compliant |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should be able to use [CLSCompliant(false)]
instead of these pragmas
public Meter Meter | ||
{ | ||
// TODO: Should the Meter and HttpMetrics be static and shared by default? | ||
get => _meter ??= new Meter("System.Net.Http"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should make sure only one instance is ever created in a race.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree the property isn't thread-safe, but how much do we care about thread safety? It should only be set when the message invoker is created.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The way I read it this field may be null
when you start sending requests, and the SendAsync
will lazy-init it, no?
If you try sending multiple requests at the same time, you will end up creating more than one instance with the ??=
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd go farther, we should use a static so there is only one instance (in the non-DI case). Creating a new instance for every client adds a bunch of unnecessary overhead.
[MemberNotNull(nameof(_metrics))] | ||
private void EnsureMetrics() | ||
{ | ||
_metrics ??= new HttpMetrics(Meter); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similar to Meter
, this should guard against using different instances in a race
|
||
public void RequestStop(HttpRequestMessage request, HttpResponseMessage? response, Exception? unhandledException, long startTimestamp, long currentTimestamp) | ||
{ | ||
if (_currentRequests.Enabled || _requestsDuration.Enabled) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should likely remember and match the Enabled
decision made in RequestStart
, otherwise the numbers may be off if you a) start some requests b) enable metrics c) those requests finish.
The existing EventSource counters guard against that (e.g., HttpClient passes a bool telemetryStarted
flag to FinishSend).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point. Yesterday I went through the new counters added to ASP.NET Core and added protection for that.
A unit test for this scenario is starting to listen to a meter after an HTTP request has started, then assert that current-requests
wasn't decreased by 1.
#pragma warning disable SA1129 // Do not use default value type constructor | ||
var tags = new TagList(); | ||
#pragma warning restore SA1129 // Do not use default value type constructor |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
#pragma warning disable SA1129 // Do not use default value type constructor | |
var tags = new TagList(); | |
#pragma warning restore SA1129 // Do not use default value type constructor | |
TagList tags = default; |
Does this work to avoid the warning?
if (requestUri.Scheme is not null) | ||
{ | ||
tags.Add("scheme", requestUri.Scheme); | ||
} | ||
if (requestUri.Host is not null) | ||
{ | ||
tags.Add("host", requestUri.Host); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if (requestUri.Scheme is not null) | |
{ | |
tags.Add("scheme", requestUri.Scheme); | |
} | |
if (requestUri.Host is not null) | |
{ | |
tags.Add("host", requestUri.Host); | |
} | |
tags.Add("scheme", requestUri.Scheme); | |
tags.Add("host", requestUri.HostNameType == UriHostNameType.IPv6 ? $"[{requestUri.IdnHost}]" : requestUri.IdnHost); | |
Do you want this to be the IP even if we know the host (the header is present)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know. OpenTelemetry has the host tag - called net.peer.name
- in its semantic conventions. We should check with them for recommended behavior.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is your thought around using IdnHost
over Host
? I'm not well versed on the topic but from what I read in the docs the difference is that IdnHost escapes some characters with Punycode. I would expect handlers of Metric data to respect the full unicode character set and the strings are likely to get exposed to humans via dashboards and queries where escape sequences are more awkward so Host
seemed more appropriate. Looking at R9 they've been using Host
as the tag value so we've got some real world validation that Host
satisfies the goal.
Do you want this to be the IP?
I think want the name whenever it has been provided (which I assume is what Host does).
@dpk83 - You may have considered this in the past, any thoughts on this one?
HttpTelemetry.Log.RequestStart(request); | ||
metrics.RequestStart(request); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that this logic does not apply to HttpClient
.
If you want to make that work, you need to also instrument StartSend
and FinishSend
in HttpClient.
HttpTelemetry.Log.RequestStart(request); | ||
metrics.RequestStart(request); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we make sure these checks exist in HttpTelemetry
as well?
That is, change places like RequestStart/Stop
in HttpTelemetry
to
Interlocked.Increment(ref _startedRequests);
if (IsEnabled(...))
{
RequestStart(
request.RequestUri.Scheme,
request.RequestUri.IdnHost...);
}
The counter fields should continue to be updated, but we don't have to read the Uri properties/serialize the event if the EventSource isn't enabled.
#pragma warning disable SA1129 // Do not use default value type constructor | ||
var tags = new TagList(); | ||
#pragma warning restore SA1129 // Do not use default value type constructor | ||
InitializeCommonTags(ref tags, request); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with Anton that relying on the request message's version can be quite misleading.
YARP for example uses a default version of 2.0 with a RequestVersionOrLower
policy.
If we only use the request version, the user would only see 2.0 counters even if they're actually only talking 1.1 to the server.
The unfortunate thing is that we'll only know what version is actually used when we try sending the request to the wire internally, or from a higher-level perspective, once you get back the response message object.
It may make sense to consider moving this logic further down into SocketsHttpHandler
, especially if you will also want to track when the response content has been read completely.
|
||
_requestsDuration = _meter.CreateHistogram<double>( | ||
"request-duration", | ||
unit: "s", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If unit is specified here then changing the unit will mean we will have no option but to create a new instrument as the requirement for us is to use ms
. I know currently the unit here is just an indication but we never know how the backend or some exporter might end up using it for.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the issue specifically that the unit was specified? For example if the code didn't specify a unit but still captured measurements in seconds would that solve the problem? My understanding is that it would not.
I think emitting the measurement in seconds (and marking the unit that way) is the right thing for .NET to do because it appears to be the overall approach industry at large is standardizing on. However I also get that not all consumers are aligned behind a single standard and in particular we have MS 1P customers who want ms measurements for compat with their existing practice. I believe we need to invest in techniques that let us present different views of the data to different customer segments. A couple ideas how that might happen:
- Creating a new instrument with different naming that proxies a scaled version of the measurements and only exporting data from the new instrument.
- Ignoring this instrument and continuing to collect a separate measurement via a different handler or DiagnosticSource
- Doing work in OTel support changing the units on an existing instrument before it is reported, perhaps as part of a View
- Doing a similar feature as (3) but in the runtime further building on the config and pipeline work
(1) or (2) seem like they might be our shortest-term options but (3) and (4) are probably where I'd hope we end up eventually.
if (response is not null) | ||
{ | ||
tags.Add("status-code", (int)response.StatusCode); // Boxing? | ||
tags.Add("protocol", $"HTTP/{response.Version}"); // Hacky |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should avoid allocation. We wouldn't want our services to pay the allocation cost for something which our library will anyways drop and not emit. Protocol is one of them
} | ||
if (request.HasTags) | ||
{ | ||
foreach (var customTag in request.MetricsTags) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another question, will the same metrics support be added for net framework or is it limited to net core only?
} | ||
if (request.HasTags) | ||
{ | ||
foreach (var customTag in request.MetricsTags) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Few points:
-
Quite often caller doesn't have access to the HttpRequestMessage, third party library is a very good example of this scenario. In complex organizations where there are hundreds and thousands of services a common libraries are used quite a lot and this option just doesn't work. We devised IDownstreamDependencyMetadata for that very purpose.
-
We are now supporting both DelegatingHandler based auto collection of metrics as well as DiagnosticsSource based approach because DelegatingHandler wasn't sufficient for a lot of cases.
-
In order to consume this metrics directly emitted from .NET and provide the transformation and the dependency metadata looks like we will still continue to use DelegatingHandler and DiagnosticsSource so I don't see how we can take consume and build onto the metrics this component is going to emit without incurring additional perf overhead.
-
Is this support is going to be ported back for .NET Framework?
} | ||
if (unhandledException is not null) | ||
{ | ||
tags.Add("exception-name", unhandledException.GetType().FullName); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@dpk83 @JamesNK @noahfalk I've been doing further analysis and prototyping on this, primarily focused on the We have 3 options to implement metrics for HTTP:
I'm providing a comparison of these 3 options amongst the following requirements: (A) Handler/platform independence The 3 potential solutions would also differ in API shape. 1. Implementing metrics in
|
I'm also in favor of option 2 - See https://gist.github.com/JamesNK/9cd342b3a547bc2f7963d8de5c18385c as an example of customizing the response It might be possible to make this more lightweight with access to internal HttpClient APIs. This scenario - caring about the response end - has recently come up with gRPC - grpc/grpc-dotnet#2133 cc @MihaZupan. Maybe an API could be added to For example: public class HttpResponseMessage
{
public CancellationToken ResponseEnded { get; }
}
var httpClient = new HttpMessageInvoker(new SocketsHttpHandler());
var response = await httpClient.SendAsync(new HttpRequestMessage(HttpMethod.Get, "https://www.google.com"), CancellationToken.None);
response.ResponseEnded.Register(() => Console.WriteLine("Response ended"));
var content = response.Content.ReadAsStringAsync();
Console.WriteLine(content); API options:
These would be called when the |
What about the requirement (C) - using response headers for enrichment? Even if we manage to hook response end logging into
Can you open an API proposal for this? Even if it's related implementation-wise, it's a separate feature. |
@antonfirsov - You did a really nice job of investigating and presenting the tradeoffs there 👍. Thanks for the thorough treatment!
My impression is that (C) was a "maybe someone would find it useful if they could do this" but I am not aware we identified any specific case where it was certain to be useful. I give it low weight in our choice. For (E) I don't believe it was ever a requirement in the past and I am fine to keep it that way. Even if it was trivial to implement such that the duration did capture the entire response for HttpCompletionOption.ResponseHeadersRead, I suspect it would not be what most developers preferred to measure. I have no objection to James' proposal that other scenarios might benefit from observing the end of the response stream, I just wouldn't propose using it for this metrics duration measurement scenario. I'm happy with option (2) as well. I am assuming MetricsHandler remains adjacent to DiagnosticsHandler in the handler pipeline, as you currently prototyped it. If it matters I'd be equally happy if the Meter was inside the DiagnosticsHandler from a scenario perspective, but I assume you specifically chose a new handler because it simplified the implementation. |
I assume requirement C is achievable. Why is it unsolvable if there is no response content? I agree that it's more complicated to require both things to be true, and will require changes outside of a
When both this things have happened a callback that's registered by the I'll create an issue for an API that allows you to detect when a Edit: HttpResponseMessage complete event proposal - #86502 |
@cakescience I noticed this as well. Prometheus.net exports metrics by combining the meter and counter names. For example, I created an issue a month ago about it - open-telemetry/opentelemetry-dotnet#4424 - but I don't expect anything to change since it's an agreed-upon OTel standard. If we leave .NET counters with duplicate names (e.g. the server has For example:
|
Regarding #85447 (comment):
Regardless if we decide to define an API or to implement EOF detection internally, my main concern is the same as I described in #86502 (comment): the only way to detect an empty stream in order to signal an EOF immediately is to do it from within the underlying platform handler. This means that an important part of the core logic cannot live in a separate |
Or maybe we can leverage |
Addresses #84978. Work in progress.
Adds:
request-duration
countercurrent-requests
counterHttpMessageInvoker.Meter
propertyHttpRequestMessage.MetricsTags
property