Add timestamp to metrics, implement Azure Monitor Metrics Exporter and Console Metrics Exporter #186
Conversation
# Metric tuples is a sequence of metric to label values pairs
envelopes = tuple(map(self.metric_tuple_to_envelope, metric_tuples))

try:
This piece of code is >95% duplicated with …
I moved the transport logic to a separate class TransportMixin and validation of the ikey into utils.py. More sophisticated validation of the ikey can be added there once we implement that.
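For context, a rough sketch of what that split might look like. TransportMixin and the utils.py destination are named in this thread; everything else below (the method name, the options attributes, the envelope serialization) is my assumption rather than the code in this PR:

```python
# Sketch only: the host exporter class is assumed to provide self.options
# (endpoint, timeout) and self.export_result_type.
import json
import logging

import requests

logger = logging.getLogger(__name__)


class TransportMixin:
    """Transmission logic shared by the trace and metrics exporters."""

    def _transmit(self, envelopes):
        try:
            response = requests.post(
                url=self.options.endpoint,
                data=json.dumps(envelopes),
                headers={"Content-Type": "application/json; charset=utf-8"},
                timeout=self.options.timeout,
            )
        except requests.RequestException:
            logger.exception("Transient client side error.")
            return self.export_result_type.FAILED_RETRYABLE
        if response.status_code == 200:
            logger.debug("Transmission succeeded: %s.", response.text)
            return self.export_result_type.SUCCESS
        return self.export_result_type.FAILED_NOT_RETRYABLE


def validate_instrumentation_key(ikey):
    """utils.py-style check; the name and behavior here are assumptions."""
    if not ikey:
        raise ValueError("Instrumentation key cannot be none or empty.")
```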
I'll approve if this is something we want delivered by alpha, but I think there's some valuable feedback to act on.
pass

if response.status_code == 200:
    logger.info("Transmission succeeded: %s.", text)
I think a lot of people use basicConfig, which would print logs for things like this.
Thoughts about either turning off explicit propagation of this logger, or moving this to debug? Might get noisy by default.
This is a good point. I like basicConfig, but once we start implementing logging using log handlers this might disrupt that process (basicConfig does nothing if there are already handlers configured). What do you mean by "explicit propagation"?
It means doing logging.getLogger('opentelemetry.ext.azure_monitor.transport').propagate = False so the opentelemetry (or opentelemetry...) logger doesn't handle messages from this logger.
Muting the whole logger sounds like overkill; this seems like a clear case for a debug log.
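To make the two options concrete (the logger name comes from the comment above; which one the exporter should use is exactly what is being decided here):

```python
import logging

logger = logging.getLogger("opentelemetry.ext.azure_monitor.transport")

# Option 1: demote the success message to DEBUG so that basicConfig's
# default WARNING threshold (or an INFO threshold) does not surface it.
logger.debug("Transmission succeeded: %s.", "<response body>")

# Option 2: stop records from this logger propagating to ancestor handlers
# (e.g. the root handler installed by logging.basicConfig).
logger.propagate = False
```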
FAILED_NOT_RETRYABLE = 2


class MetricsExporter:
I'm generally not clear on the strategy to put exporters in the SDK. This effectively means that probably everyone who uses opentelemetry-python will consume both the API and SDK, as I doubt vendors will go write their own.
But if there's a use case I'm not considering, I'd love an example.
Are you referring to the case where vendors use their own SDK but want to use one of our exporters? In that case they would have to have a dependency on our SDK (which is wrong). Maybe we need the common exporter behavior in the API.
When vendors implement their own SDK chances are they hard-code a vendor-specific "exporter" equivalent too (though the vendor SDK might not have the concept of an exporter, as there may be only one vendor-specific data format emitted by it).
There's some conversation in open-telemetry/opentelemetry-specification#186, but no real justification for keeping exporters out of the API altogether.
As I understand it, the reasoning is that applications should take a dependency on some SDK, and only applications need to deal with exporting spans/metrics. Libraries should only depend on the API. They can generate spans/metrics, but should leave it up to the application to export them.
If the purpose of the API is to give libraries a lightweight way to generate spans/metrics then there's no reason to include an export API. If it's to give implementers (read: vendors) a set of common interfaces and data types then there's a good reason to include it.
) -> "MetricsExportResult":
    for metric_tuple in metric_tuples:
        handle = metric_tuple[0].get_handle(metric_tuple[1])
        print(
Could we have this be emitted to a logger vs. printed to stdout? I'm thinking it would be nice to redirect this stream to something else (e.g. some streaming service), and log handlers are a great way to redirect that stream.
Conversely, I could imagine passing a file object and then writing to it?
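A minimal sketch of the file-object variant (the out parameter, its default, and the simplified signature are my assumptions; only the (metric, label_values) tuple shape and the get_handle call are taken from this diff):

```python
import sys
from typing import IO, Sequence, Tuple


class ConsoleMetricsExporter:
    """Sketch: write exported metrics to a stream, stdout by default."""

    def __init__(self, out: IO[str] = sys.stdout):
        self.out = out

    def export(self, metric_tuples: Sequence[Tuple[object, Tuple[str, ...]]]) -> None:
        for metric, label_values in metric_tuples:
            handle = metric.get_handle(label_values)
            # Passing a file object (log file, StringIO, a wrapper around a
            # streaming service, ...) redirects the output without touching
            # the logging configuration.
            print(metric, label_values, handle, file=self.out)
```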
I believe the main idea of the console exporter is simple debugging, so just printing to the console could be enough. There may be cases where the output gets overloaded, in which case redirecting to a file would make debugging easier. I think we can revisit this in a separate issue.
IIRC we talked about this and decided it made sense to have separate console and logging exporters so users would have an impossible-to-screw-up way to test that telemetry is working. We considered implementing the console logger as a special case of the logging exporter but decided against it because it would be possible to modify the logger (or handler directing logs to stdout) via config.
Would it be valuable to add some unit tests? Specifically around the ConsoleMetricsExporter at least.
@lzchen sorry to put an unrelated question in the PR, but: are there any updates on metrics processors? Really curious how we're gonna hook this all up in practice.
Yes, most definitely. I'll open an issue to track this: #187
@toumorokoshi |
Requesting changes here as IMHO the PR does too many things that do not belong under the title. It should be split up.
Feel free to dismiss my review though if the PR would otherwise make it into the release today.
def update(self, value: metrics_api.ValueT) -> None:
    if self._validate_update(value):
        if self.monotonic and value < 0:
            logger.warning("Monotonic counter cannot descend.")
            return
        self.data += value
        self.timestamp = time_ns()
If this is really required, I'd name it "last_update_timestamp" and also take the timestamp as early as possible, as it should probably represent the timestamp the value is from, not the timestamp it was recorded in the metric (although maybe it should, I don't know 😄)
This is a good idea. I'll rename the field.
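Roughly what the suggestion amounts to, with validation and logging trimmed down (the handle class here is a simplified stand-in, and time_ns is taken from the standard library rather than the SDK's helper):

```python
from time import time_ns


class CounterHandle:  # simplified stand-in for the handle in this diff
    def __init__(self, monotonic: bool = True) -> None:
        self.monotonic = monotonic
        self.data = 0
        self.last_update_timestamp = time_ns()

    def update(self, value: int) -> None:
        # Capture the timestamp first so it reflects when the value was
        # observed, not when it was recorded after validation.
        timestamp = time_ns()
        if self.monotonic and value < 0:
            return  # monotonic counters cannot descend
        self.data += value
        self.last_update_timestamp = timestamp
```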
""" | ||
|
||
|
||
class ConsoleMetricsExporter(MetricsExporter): |
This class also does not seem to belong under this PR title.
Normally I would split this up into multiple PRs. However, for the sake of the Alpha, I will update the PR title and add more of a description to include both changes.
Title updated, OK for alpha (didn't review most of the code though)
Looks good overall, but I agree that that log line should be debug-level.
LGTM, and it looks like you've covered all of @Oberon00's comments. Thanks!
console_exporter.export([(counter, label_values), (gauge, label_values2)])
exporter.export([(counter, label_values), (gauge, label_values2)])
Not sure if this is what we want. I guess this is not what we want the user to use in order to export metrics?
I do not think we will use exporter.export explicitly to export metrics. That being said, this is in the examples and exposed to the users, so it might be confusing in the future. However, this is currently the only way to export metrics. If we do not include it in the examples, then creating a metrics exporter would have been pointless, as users would not know how to use it (even if improperly).
Then we probably don't want to release this piece?
Makes sense. In that case, I will add a TODO for the rest of the export pipeline and remove the example for now until we decide on implementation details. These should be decided very soon. As for this PR, since it is not crucial for the Alpha release anymore, I will split it into separate PRs: Azure Metrics Exporter and Console Metrics Exporter.
try:
    text = response.text
except Exception:  # noqa pylint: disable=broad-except
    logger.exception("Error while reading response body %s.")
The format string is wrong. Why do we want logger.exception here?
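For reference, logger.exception already logs at ERROR level with the traceback attached, so with no argument supplied the %s placeholder should simply be dropped (sketch; the surrounding function is invented for context):

```python
import logging

logger = logging.getLogger(__name__)


def read_body(response) -> str:
    try:
        return response.text
    except Exception:  # noqa pylint: disable=broad-except
        # logger.exception == logger.error(..., exc_info=True); the message
        # needs no %s placeholder unless a matching argument is passed.
        logger.exception("Error while reading response body.")
        return ""
```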
Does it not make sense to log at an exception level when an exception is raised?
I don't think so. We choose log level based on what we expect the users to see.
To clarify, is it because we do not expect users to configure their logging levels to expose exception-level logs?
@@ -31,6 +32,7 @@ def __init__(
    self.value_type = value_type
    self.enabled = enabled
    self.monotonic = monotonic
    self.last_update_timestamp = ns_to_iso_str(time_ns())
I thought we want to use nanoseconds since epoch as the time format across the SDK, and do the conversion in the exporter?
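i.e. keep the raw nanosecond value on the handle and only format it when building the exporter's payload, roughly (the ns_to_iso_str import path is assumed from the SDK utilities referenced in this diff):

```python
from time import time_ns

from opentelemetry.sdk.util import ns_to_iso_str  # import path assumed

# In the SDK: store nanoseconds since the epoch on the handle.
last_update_timestamp = time_ns()

# In the exporter: convert to the wire/display format only at export time.
print(ns_to_iso_str(last_update_timestamp))
```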
)
except requests.RequestException:
    logger.exception("Transient client side error.")
    return self.export_result_type.FAILED_RETRYABLE
This seems like a dangerous approach given the result_type is coming from two different places.
I agree. SUCCESS, FAILED_RETRYABLE and FAILED_NOT_RETRYABLE seem to be generic enough results to be shared across traces and metrics. I will consolidate them into one result type in the SDK.
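A consolidated result type would be small; the member names and values below come from this diff, while the class name and its placement in the SDK are assumptions:

```python
from enum import Enum


class ExportResult(Enum):
    """Sketch of a result type shared by span and metrics exporters."""

    SUCCESS = 0
    FAILED_RETRYABLE = 1
    FAILED_NOT_RETRYABLE = 2
```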
[WIP] Wait for export pipeline to be spec'd out.
Follows SDK spec
* chore: add Pull request template
* fix: review comment
Implementation of Metrics Exporter for Azure Monitor. Currently there is no concept of metrics processor or a polling mechanism, so export is called directly with the exporter.
Most of the code was copied from OpenCensus, with some style adjustments.
Also included is an implementation of the Console Metrics Exporter, which takes the metrics and prints them to the console.
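Since there is no processor or polling mechanism yet, export is invoked directly on a sequence of (metric, label values) tuples. A rough, runnable illustration (FakeMetric is a stand-in for a real metric created through the metrics API, and the real exporter may expect more metric attributes than shown here):

```python
from opentelemetry.sdk.metrics.export import ConsoleMetricsExporter


class FakeMetric:
    """Stand-in for a metric created through the metrics API."""

    def get_handle(self, label_values):
        return "handle for {}".format(label_values)


exporter = ConsoleMetricsExporter()
exporter.export([(FakeMetric(), ("staging",))])
```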