-
Notifications
You must be signed in to change notification settings - Fork 1.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Proposal: Metrics Label Processing Solutions #1892
Comments
https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/master/processor/metricstransformprocessor has the ability to add, remove, update metric labels. Would that work? |
As mentioned in the post, metricstransformprocessor was considered however it does not have the ability to add labels to a batch of metrics, which is our desired functionality. It requires individual metrics to be specified. |
One other option: Maybe we could add metrics support to attributesprocessor under the assumption that attributes and labels will eventually be unified in OT spec. In the meantime maybe the attributesprocessor can handle both with the same or similar configuration semantics. I don't strongly feel that it must be done this way. I think the more conservative approach is probably to do the dedicated processor and when the spec is settled then labelsprocessor can be deprecated and attributes can be the "grand unified" one. |
Hi @tigrannajaryan can you please assign this issue to me @alolita and @amanbrar1999 ty! We plan to add PRs for this code. |
@amanbrar1999 @alolita looking at
I don't think so. It says it is not intended for aggregation. What you need is not aggregation and I would not call it batching either. It is bulk addition labels to all metrics regardless of the name. This looks like a subset of what Would this work? |
@tigrannajaryan This was indeed one of the proposed solutions, however it looks like I misunderstood the desired scope of |
@amanbrar1999 I think it is fine to add bulk updates to |
Yes what I would actually like to do is update the metrics transform processor to support filter sets for bulk updating so that we can do transforms like the following: Also blocked by related proposal: That would then support this use case as well. I was intending to start working on this soon, but if @alolita and @amanbrar1999 want to take that over I'd be more than happy to review instead :) |
My understanding from reading these linked issues is that we want to (1) update the filtering configuration so that it is less verbose, (2) make those configurations external, and then (3) embed filtering into I am trying to understand the scope of the first change, if that requires changes in the internal filtering libraries (https://github.com/open-telemetry/opentelemetry-collector/tree/master/internal/processor) then it would be a breaking change for everything that imports those files right? |
Upon further inspection I think (1) is another issue on its own, and I can instead proceed to try to make |
The context for #1081 is that no-one is particularly happy with the current structure of the I think the proposal in #1081 (comment) would be relatively straightforward to implement, but it would be good to get sign off from @bogdandrutu and/or @tigrannajaryan first, as it requires breaking changes to the config as well as the code. |
@james-bebbington #1081 (comment) proposes breaking changes to the configuration. I would not want to do that and the barrier to such changes is quite high. I would like to look carefully into that and understand if we really want it. In the meantime I think there is a much simpler solution to this particular issue which does not require any breaking changes and simply requires a small change to the metricstransformprocessor to allow bulk updates. I would prefer that we do not make this change dependent on the more complicated decision on #1081 |
Update from the most recent Collector SIG meeting: Solution 3 is likely the best option to proceed forward in this issue due to how labels are handled in the Prometheus Remote Write Exporter. The Having talked with @jmacd we believe resource attributes can be combined with the idea of "external labels" in Prometheus, which is what Cortex expects us to use in order to add these labels. Furthermore according to @tigrannajaryan the conversion of resource attributes to labels is an expected functionality of any exporter and is currently missing in the Prometheus Remote Write Exporter, hence information is being lost when metrics are exported. https://github.com/open-telemetry/opentelemetry-collector/blob/master/exporter/prometheusremotewriteexporter/exporter.go#L96 This line in the code stands as even further proof that this is intended functionality. I have made a post in the gitter channel about adding functionality in Prometheus Remote Write Exporter to convert resource attributes to labels to get feedback from users of this exporter and ensure this does not break anything for them, and barring any objections I will be proceeding with adding this functionality to the Prometheus Remote Write Exporter. |
@jamcd and @bogdandrutu what do you think? We also have the exact same issue when exporting from Otel metric SDKs to Prometheus, so the solution preferably needs to be the same in SDKs and in the Collector. |
Another update after more discussions from Metrics SIG: A generic component is being designed for conversion of resource attributes to labels in exporters: open-telemetry/opentelemetry-collector-contrib#1359 This still does not solve my use case, the core issue here is that Prometheus Remote Write Exporter takes labels starting with "__" and appends "key" infront in order to "preserve" the label. This is not compatible with Cortex because Cortex expects users to be able to add "__replica__" label, which is doable in a regular Prometheus setup using "external labels". With this component resource attributes will become labels and get sanitized, which will modify "__replica__" I was working on a custom resource attribute to label conversion within Prometheus Remote Write Exporter however it appears this is no longer a desired solution because of the component mentioned above. Discussion with @jmacd has resulted in 2 possible solutions: Solution 1 here would be blocked by the generic component mentioned above, as this issue still requires the ability to add labels. That being said, I am now proceeding with Solution 2, which is to allow label addition within the Prometheus Remote Write Exporter that does not have the restriction of "__" that other labels have. These labels would be functionally equivalent to Prometheus "external labels" |
Solution 2 above sounds good to me, as a dead-simple solution to the problem at hand. All the other processor discussions above look like more-complicated ways of accomplishing this very simple thing needed for Prometheus remote-write. |
@jmacd we're implementing -- Solution 2. have these double-underscore labels injected through exporter configuration, after all the other pipeline stages that may drop double-underscores are evaluated @amanbrar1999 will file a PR shortly. |
External labels config has been added to prometheus remote write exporter and hence this issue is now resolved |
* Update deb/rpm signing * Exclude release scripts from github workflow
Problem
We are looking to add the ability to attach data point labels that are specified in the Collector config file to all metrics that pass through the Collectors. This functionality does not exist at present. We request feedback from maintainers regarding the proposed solutions described below.
Use Case
Our need for this functionality stems from Cortex. We wish to have multiple collectors running concurrently and feeding data to Cortex for reliability, so that if any collector instance fails for any reason, there are others still collecting metrics and resulting in no data loss. Quite clearly this also means that many duplicate metrics will be exported to Cortex, however Cortex has a method for deduplication using “HA labels”. What this requires is for each metric to have 2 labels: “cluster” and “__replica__”. The most important fact is that each individual metric source (in this case, each collector instance) should have a different “__replica__” value. This is doable because the Collector config file supports use of environment variables, hence a unique value such as a pod name can be used. Cortex will choose which “__replica__” to collect metrics from within each cluster, hence the issue of duplicate metrics is solved.
More information can be found here: https://cortexmetrics.io/docs/production/ha-pair-handling/
To be more specific, in OTLP format if a metric looks like this:
We want it to look like this after modifications:
Note that the difference between the two metrics is highlighted.
Potential Solutions
(Recommended) Solution 1: New ‘labelprocessor’
A possible solution is a new processor which reads labels specified in the Collector config file and attaches them to the data point labels of all metrics
Pros:
Cons:
Solution 2: Modify ‘resourceprocessor’ to allow changes to data point labels
The ‘resourceprocessor’ performs a similar transformation, however the key idea to note here is that within OTLP metrics, resource attributes are not the same as data point labels. ‘resourceprocessor’ can add resource attributes, however the desired functionality is the ability to add data point labels.
Here is an example metric from logging exporter that shows the difference:
Pros:
Cons:
Solution 3: Add functionality to Prometheus Remote Write Exporter
What this solution entails is (1) using the ‘resourceprocessor’ to add our labels as resource attributes, and (2) ‘Prometheus remote write exporter’ taking these resource attributes and converting them to valid Prometheus labels.
The functionality for step (2) of this solution does not exist, so a possible solution is to add that.
Pros:
Cons:
Solution 4: Add batch label editing to metricstransformprocessor
metricstransformprocessor is a processor in the opentelemetry-collector-contrib repo that almost does what we want, however it can only add labels to metrics that are specified by name, whereas we wish to add labels to all metrics. Hence a possible solution is to add the ability to make transformations to all metrics in this processor.
Pros:
Cons:
Additional Context
Prometheus Receiver
The Prometheus receiver has the functionality to add labels, however this does not suit our use case because in the scraping process Prometheus removes all labels starting with “__”. This means that the “__replica__” label cannot be set here.
Conclusion
The recommended solution for this issue is to make a new ‘labelprocessor’ that can be used to add labels to any metrics that are passing through the collector. Although our scope is currently limited to Prometheus and Cortex, this processor can be used for a more general purpose. We request approval from the Collector maintainers to proceed with our recommended solution, and we are open to discussion on other potential solutions.
cc - @bogdandrutu , @tigrannajaryan , @pjanotti , @alolita
The text was updated successfully, but these errors were encountered: