Document collector's internal telemetry #10695

Merged
66 changes: 66 additions & 0 deletions docs/rfcs/internal-telemetry.md
@@ -0,0 +1,66 @@
# Defining guidelines for internal telemetry

## Overview

The Collector supports generating internal telemetry
that assists end users when operating the Collector. So
far, much of the telemetry has been added to components as
needed, without guidelines for component authors to provide
a consistent experience to end users. The goal of this document
is to:

- describe the naming and attributes currently in use
- define consistent units and naming for telemetry emitted by the Collector
- define the process that should be used to configure new metrics

## Out of scope

This document does not intend to dictate when telemetry should be
emitted by various Collector components. Given the variety of component
types, that topic is better discussed in a future document.

This document does not intend to provide a comprehensive plan for how
the Collector or the health of telemetry pipelines should be monitored. An
[OpenTelemetry Enhancement Proposal](https://github.com/open-telemetry/oteps/pull/259) has already started the process
of providing this information. Once the OTEP lands, this document will be
updated to follow its recommendations.

## Internal telemetry properties
Review comment (Member): Could we get away with adding everything from this point on to the observability.md and call it a day? This accurately describes our existing telemetry expectations.

Reply (Contributor Author): Moved, please take a look

Telemetry produced by the Collector has the following properties:

- metrics produced by Collector components use the prefix `otelcol_`. Until v0.106.0, this
prefix was applied via the Prometheus namespace, which made metric names inconsistent for users
wishing to emit metrics via another exporter. See https://github.com/open-telemetry/opentelemetry-collector/issues/9315
for more details
- code is instrumented using the OpenTelemetry API and telemetry is produced via the OpenTelemetry Go SDK
- instrumentation scope is defined via configuration in `metadata.yaml`
- metrics are defined via `metadata.yaml` except in components that have specific cases where
it is not possible to do so. See the [issue](https://github.com/open-telemetry/opentelemetry-collector-contrib/issues/33523)
which lists such components
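
As a rough illustration of the naming property above, the following Go sketch applies the `otelcol_` prefix to a metric name only when it is missing. The helper `withPrefix` is hypothetical and is not part of the Collector; the two metric names are used purely as examples.

```go
package main

import (
	"fmt"
	"strings"
)

// prefix is the internal-metric prefix described in the list above.
const prefix = "otelcol_"

// withPrefix returns name carrying the otelcol_ prefix, adding it only
// when it is missing. Hypothetical helper, for illustration only.
func withPrefix(name string) string {
	if strings.HasPrefix(name, prefix) {
		return name
	}
	return prefix + name
}

func main() {
	// Example names only; one lacks the prefix, one already has it.
	for _, n := range []string{"exporter_sent_spans", "otelcol_process_uptime"} {
		fmt.Println(withPrefix(n))
	}
}
```

The point of the sketch is that the prefix belongs to the metric name itself, so the name is the same regardless of which exporter emits it.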

## Units

The following units should be used for metrics emitted by the Collector
for the purpose of its internal telemetry:

| Field type                                                       | Unit           |
| ---------------------------------------------------------------- | -------------- |
| Metric about receiving, processing, exporting log records        | `{records}`    |
| Metric about receiving, processing, exporting spans              | `{spans}`      |
| Metric about receiving, processing, exporting metric data points | `{datapoints}` |
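
The table can be read as a lookup from signal to unit. The Go sketch below encodes it that way; the function `unitFor` and the signal keys `logs`, `traces`, and `metrics` are assumptions made for illustration and do not come from the Collector's code.

```go
package main

import "fmt"

// unitFor returns the unit from the table above for a given signal.
// The signal keys are hypothetical labels chosen for this sketch.
func unitFor(signal string) (string, error) {
	switch signal {
	case "logs":
		return "{records}", nil
	case "traces":
		return "{spans}", nil
	case "metrics":
		return "{datapoints}", nil
	default:
		return "", fmt.Errorf("no unit defined for signal %q", signal)
	}
}

func main() {
	for _, s := range []string{"logs", "traces", "metrics"} {
		unit, err := unitFor(s)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%s -> %s\n", s, unit)
	}
}
```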

## Process for defining new metrics

Metrics in the Collector are defined via `metadata.yaml`, which is used by [mdatagen](https://github.com/open-telemetry/opentelemetry-collector/tree/main/cmd/mdatagen) to
produce:

- code to create metric instruments that can be used by components
- documentation for internal metrics
- a consistent prefix for all internal metrics
- convenience accessors for meter and tracer
- a consistent instrumentation scope for components
- test methods for validating the telemetry

The process to generate new metrics is to configure them via
`metadata.yaml`, and run `go generate` on the component.
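
For `go generate` to do anything, the component needs a `//go:generate` directive that invokes mdatagen. A minimal sketch of such a file follows, assuming the `mdatagen` binary is on `PATH` and `metadata.yaml` sits next to the file. The `package main` wrapper and the `generateCommand` constant exist only to make the sketch runnable; a real component would use its own package name.

```go
// Hypothetical doc.go-style file for a component.

//go:generate mdatagen metadata.yaml

package main

import "fmt"

// generateCommand mirrors the directive above so the sketch has
// something to print; it is not used by the go tool.
const generateCommand = "mdatagen metadata.yaml"

func main() {
	fmt.Println("go generate would run:", generateCommand)
}
```

With a directive like this in place, running `go generate ./...` from the component directory would regenerate the code and documentation derived from `metadata.yaml`.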