
Add timestamp to metrics, implement Azure Monitor Metrics Exporter and Console Metrics Exporter #186

Closed
wants to merge 1 commit

Conversation

@lzchen (Contributor) commented Sep 28, 2019

Implementation of a Metrics Exporter for Azure Monitor. Currently there is no concept of a metrics processor or a polling mechanism, so export is called directly on the exporter.
Most of the code was copied from OpenCensus, with some style adjustments.

This also implements a Console Metrics Exporter, which takes the metrics and prints them to the console.

# metric_tuples is a sequence of (metric, label values) pairs
envelopes = tuple(map(self.metric_tuple_to_envelope, metric_tuples))

try:
@lzchen (Contributor Author), Sep 28, 2019:

I moved the transport logic to a separate TransportMixin class and the ikey validation into utils.py. More sophisticated validation of the ikey can be added there once we implement that.
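For illustration, a minimal sketch of what the ikey validation in utils.py could look like; the function name and the UUID-format check are assumptions, not necessarily what this PR contains:

import re

# Hypothetical helper; the actual utils.py may validate differently.
_IKEY_PATTERN = re.compile(
    r"^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$"
)


def validate_instrumentation_key(ikey):
    """Raise ValueError if the instrumentation key is missing or malformed."""
    if not ikey:
        raise ValueError("Instrumentation key cannot be none or empty.")
    if not _IKEY_PATTERN.match(ikey):
        raise ValueError("Invalid instrumentation key format.")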

@toumorokoshi (Member) left a comment

I'll approve if this is something we want delivered by alpha, but I think there's some valuable feedback to act on.

pass

if response.status_code == 200:
    logger.info("Transmission succeeded: %s.", text)
Member:

I think a lot of people use basicConfig, which would print logs for things like this.

Thoughts about either turning off explicit propagation of this logger, or moving this to debug? Might get noisy by default.

@lzchen (Contributor Author):

This is a good point. I like basicConfig, but once we start implementing logging using log handlers this might disrupt that process (basicConfig does nothing if there are already handlers configured).

What does it mean by "explicit propagation"?

Member:

It means doing

logging.getLogger('opentelemetry.ext.azure_monitor.transport').propagate = False

so the opentelemetry (or opentelemetry...) logger doesn't handle messages from this logger.

Muting the whole logger sounds like overkill; this seems like a clear case for a debug log.
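To make the two options concrete, a small sketch (the logger name is the one quoted above; whether the exporter actually uses it this way is an assumption):

import logging

logger = logging.getLogger("opentelemetry.ext.azure_monitor.transport")


def log_success(text):
    # Option 1: log at debug level so the message stays below the default
    # WARNING threshold that logging.basicConfig() configures on the root logger.
    logger.debug("Transmission succeeded: %s.", text)


# Option 2: keep the info-level call but stop records from propagating to
# ancestor loggers (and thus to the root handler installed by basicConfig).
logger.propagate = False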

    FAILED_NOT_RETRYABLE = 2


class MetricsExporter:
Member:

I'm generally not clear on the strategy to put exporters in the SDK. This effectively means that probably everyone who uses opentelemetry-python will consume both the API and SDK, as I doubt vendors will go write their own.

But if there's a use case I'm not considering, I'd love an example.

@lzchen (Contributor Author):

Are you referring to the case where vendors use their own SDK but want to use one of our exporters? In that case they would have to have a dependency on our SDK (which is wrong). Maybe we need the common exporter behavior in the API.

Member:

When vendors implement their own SDK chances are they hard-code a vendor-specific "exporter" equivalent too (though the vendor SDK might not have the concept of an exporter, as there may be only one vendor-specific data format emitted by it).

Member:

There's some conversation in open-telemetry/opentelemetry-specification#186, but no real justification for keeping exporters out of the API altogether.

As I understand it, the reasoning is that applications should take a dependency on some SDK, and only applications need to deal with exporting spans/metrics. Libraries should only depend on the API. They can generate spans/metrics, but should leave it up to the application to export them.

If the purpose of the API is to give libraries a lightweight way to generate spans/metrics then there's no reason to include an export API. If it's to give implementers (read: vendors) a set of common interfaces and data types then there's a good reason to include it.

) -> "MetricsExportResult":
    for metric_tuple in metric_tuples:
        handle = metric_tuple[0].get_handle(metric_tuple[1])
        print(
Member:

Could we have this be emitted to a logger vs. printed to stdout? I'm thinking it would be nice to redirect this stream to something else (e.g. some streaming service), and log handlers are a great way to redirect that stream.

Conversely, I could imagine maybe passing a file object and then writing to it?

@lzchen (Contributor Author):

I believe the main idea of the console exporter is for simple debugging purposes, so simply printing to the console should be enough. Although there may be cases where the output gets overloaded, in which case redirecting to a file would make debugging easier. I think we can revisit this in a separate issue.
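For reference, a rough sketch of the shape being discussed, with an optional file-like target along the lines of the suggestion above; the out parameter and the output format are hypothetical and not part of this PR:

import sys


class ConsoleMetricsExporter:
    """Debugging exporter that writes each metric handle to a stream."""

    def __init__(self, out=sys.stdout):
        # Hypothetical parameter: lets callers redirect output to a file
        # or any other file-like object instead of stdout.
        self._out = out

    def export(self, metric_tuples):
        for metric, label_values in metric_tuples:
            handle = metric.get_handle(label_values)
            # The exact output format in the PR may differ; this just prints the handle.
            self._out.write("{}(data={})\n".format(type(self).__name__, handle))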

Member:

IIRC we talked about this and decided it made sense to have separate console and logging exporters so users would have an impossible-to-screw-up way to test that telemetry is working. We considered implementing the console logger as a special case of the logging exporter but decided against it because it would be possible to modify the logger (or handler directing logs to stdout) via config.

@toumorokoshi (Member):
Would it be valuable to add some unit tests? Specifically around the console exporter, at least.

@toumorokoshi (Member):
@lzchen sorry to put an unrelated question in the PR, but: are there any updates on metrics processors? Really curious how we're going to hook this all up in practice.

@lzchen (Contributor Author) commented Sep 28, 2019

Would it be valuable to add some unit tests? Specifically around the console exporter, at least.

Yes most definitely. I'll open an issue to track this. [#187]

@lzchen (Contributor Author) commented Sep 28, 2019

@toumorokoshi
Thanks for reviewing. I'll be opening another issue to describe the plans for how we will be hooking up metrics in the pipeline.

@Oberon00 (Member) left a comment

Requesting changes here as IMHO the PR does too many things that do not belong under the title. It should be split up.

Feel free to dismiss my review though if the PR would otherwise make it into the release today.

def update(self, value: metrics_api.ValueT) -> None:
    if self._validate_update(value):
        if self.monotonic and value < 0:
            logger.warning("Monotonic counter cannot descend.")
            return
        self.data += value
        self.timestamp = time_ns()
Member:

If this is really required, I'd name it "last_update_timestamp" and also take the timestamp as early as possible, as it should probably represent the timestamp the value is from, not the timestamp it was recorded in the metric (although maybe it should, I don't know 😄).

@lzchen (Contributor Author):

This is a good idea. I'll rename the field.
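A sketch of what the rename plus the earlier timestamp capture might look like. The class here is a minimal stand-in for the SDK handle, _validate_update is a placeholder, and time_ns comes from the standard library, though the SDK may provide its own helper:

import logging
from time import time_ns  # Python 3.7+; the SDK may use its own time_ns helper

logger = logging.getLogger(__name__)


class CounterHandle:
    """Minimal stand-in for the SDK handle; only update() is sketched."""

    def __init__(self, monotonic=True):
        self.monotonic = monotonic
        self.data = 0
        self.last_update_timestamp = time_ns()

    def _validate_update(self, value):
        return True  # placeholder for the real validation logic

    def update(self, value):
        # Take the timestamp up front so it represents when the value was
        # observed rather than when it was recorded in the handle.
        now = time_ns()
        if not self._validate_update(value):
            return
        if self.monotonic and value < 0:
            logger.warning("Monotonic counter cannot descend.")
            return
        self.data += value
        self.last_update_timestamp = now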


"""


class ConsoleMetricsExporter(MetricsExporter):
Member:

This class also does not seem to belong under this PR title.

@lzchen (Contributor Author):

Normally I would split this up into multiple PRs. However, for the sake of the Alpha, I will update the PR title and add more of a description to include both changes.

@lzchen changed the title from "Implement Azure Monitor Metrics Exporter" to "Implement Azure Monitor Metrics Exporter and Console Metrics Exporter" on Sep 30, 2019
@Oberon00 changed the title from "Implement Azure Monitor Metrics Exporter and Console Metrics Exporter" to "Add timestamp to metrics, implement Azure Monitor Metrics Exporter and Console Metrics Exporter" on Sep 30, 2019
@Oberon00 dismissed their stale review on September 30, 2019, 18:10

Title updated, OK for alpha (didn't review most of the code though)

@c24t (Member) left a comment

Looks good overall, but I agree that the log line should be debug-level.

@c24t (Member) left a comment

LGTM, and it looks like you've covered all of @Oberon00's comments. Thanks!




console_exporter.export([(counter, label_values), (gauge, label_values2)])
exporter.export([(counter, label_values), (gauge, label_values2)])
Member:

Not sure if this is what we want. I guess this is not what we want the user to use in order to export metrics?

@lzchen (Contributor Author):

I do not think we will use exporter.export explicitly to export metrics. That being said, this is in the examples and exposed to users, so it might be confusing in the future. However, this is currently the only way to export metrics. If we do not include it in the examples, then creating a metrics exporter would be pointless, as users would not know how to use it (even if improperly).

Member:

Then we probably don't want to release this piece?

@lzchen (Contributor Author), Sep 30, 2019:

Makes sense. In that case, I will add a TODO for the rest of the export pipeline and remove the example for now until we decide on implementation details. These should be decided very soon. As for this PR, since it is not crucial for the Alpha release anymore, I will split it into separate PRs: Azure Metrics Exporter and Console Metrics Exporter.

try:
    text = response.text
except Exception:  # noqa pylint: disable=broad-except
    logger.exception("Error while reading response body %s.")
Member:

The format string is wrong.
Why do we want logger.exception here?

@lzchen (Contributor Author):

Does it not make sense to log at an exception level when an exception is raised?

Member:

I don't think so. We choose log level based on what we expect the users to see.

@lzchen (Contributor Author):

To clarify, is it because we do not expect users to configure their logging levels to expose exception level logs?
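One possible way to address both points above, sketched as a suggestion rather than what the PR ended up doing: drop the stray %s placeholder and log at warning level, keeping the traceback via exc_info:

import logging

logger = logging.getLogger(__name__)


def read_body(response):
    """Return the response text, or an empty string if it cannot be read."""
    try:
        return response.text
    except Exception:  # noqa pylint: disable=broad-except
        # Warning level with the traceback attached; no unfilled %s placeholder.
        logger.warning("Error while reading response body.", exc_info=True)
        return ""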

@@ -31,6 +32,7 @@ def __init__(
        self.value_type = value_type
        self.enabled = enabled
        self.monotonic = monotonic
        self.last_update_timestamp = ns_to_iso_str(time_ns())
Member:

I thought we want to use nanoseconds since epoch as the time format across the SDK, and do the conversion in the exporter?
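A sketch of the split being suggested: the SDK keeps the raw nanoseconds-since-epoch integer, and only the exporter formats it. The ns_to_iso_str body shown here is an assumption about what that helper does:

from datetime import datetime, timezone
from time import time_ns


def ns_to_iso_str(nanoseconds):
    """Convert nanoseconds since the epoch to an ISO 8601 string."""
    return datetime.fromtimestamp(nanoseconds / 1e9, tz=timezone.utc).isoformat()


# In the SDK handle: store the raw integer.
last_update_timestamp = time_ns()

# In the exporter: convert only when building the outgoing payload.
payload_time = ns_to_iso_str(last_update_timestamp)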

    )
except requests.RequestException:
    logger.exception("Transient client side error.")
    return self.export_result_type.FAILED_RETRYABLE
Member:

This seems like a dangerous approach given the result_type is coming from two different places.

@lzchen (Contributor Author):

I agree. SUCCESS, FAILED_RETRYABLE and FAILED_NOT_RETRYABLE seem to be generic enough results to be shared across traces and metrics. I will consolidate them into one result type in the SDK.
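A minimal sketch of the consolidated result type described here; the name and its placement in the SDK are assumptions:

import enum


class ExportResult(enum.Enum):
    """Outcome shared by span and metrics exporters in the SDK."""

    SUCCESS = 0
    FAILED_RETRYABLE = 1
    FAILED_NOT_RETRYABLE = 2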

@lzchen (Contributor Author) commented Oct 1, 2019

[WIP] Wait for export pipeline to be spec'd out.

@lzchen (Contributor Author) commented Oct 3, 2019

Follows SDK spec

@lzchen lzchen closed this Oct 28, 2019
Labels: WIP (work in progress)