Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

APM Processing OTEL Collector Attributes to Labels Rather than Attributes #8074

Closed
knechtionscoding opened this issue May 9, 2022 · 10 comments
Labels

Comments

@knechtionscoding
Copy link

knechtionscoding commented May 9, 2022

APM Server version (apm-server version): 7.17.2

Description of the problem including expected versus actual behavior:

We use open telemetry to collect traces from istio. As part of that, and in lieu of #4468 we'd like to be able upsert the deployment.environment variable in the open telemetry collector. We've attempted both service.environment and deployment.environment. Per the APM Documentation it should be deployment.environment. We've configured an attribute processor as such:

  # Mutate the Data
  processors:
    # If set to null, will be overridden with values based on k8s resource limits
    memory_limiter: null
    attributes:
      actions:
        - key: service.environment
          action: upsert
          value: "dev"
        - key: deployment.environment
          action: upsert
          value: "dev"

However, when the APM server receives it it parses it into labels.service_environment and labels.deployment_environment respectively. This is true even when using a non-string value, which according to otel's maintainers is not allowed

I do see it being set properly on the Open Telemetry collector:

Span #63
    Trace ID       : 32a7bd42c119e7883eac24a9c4527561
    Parent ID      : 461aabc5ff251158
    ID             : d8ffa6453eb48b15
    Name           : foo.bar.svc.cluster.local:80/*
    Kind           : SPAN_KIND_UNSPECIFIED
    Start time     : 2022-05-09 22:00:31.286890085 +0000 UTC
    End time       : 2022-05-09 22:00:31.296932743 +0000 UTC
    Status code    : STATUS_CODE_UNSET
    Status message :
Attributes:
     -> http.url: STRING(http://foo.bar.svc.cluster.local/v1)
     -> istio.canonical_service: STRING(sonic-demo-app-publisher)
     -> zone: STRING(us-east4-b)
     -> downstream_cluster: STRING(-)
     -> istio.mesh_id: STRING(cluster.local)
     -> http.protocol: STRING(HTTP/1.1)
     -> OperationName: STRING(Egress)
     -> response_size: STRING(71)
     -> node_id: STRING(sidecar~10.17.134.74~foo.bar.svc.cluster.local)
     -> guid:x-request-id: STRING(c2df7068-2939-44f8-adb8-4182b296486a)
     -> request_size: STRING(1460)
     -> http.status_code: STRING(200)
     -> istio.canonical_revision: STRING(0.0.1)
     -> peer.address: STRING(10.17.134.74)
     -> response_flags: STRING(-)
     -> http.method: STRING(POST)
     -> user_agent: STRING(Apache-HttpClient/4.5.12 (Java/11.0.13))
     -> istio.namespace: STRING(bar)
     -> upstream_cluster.name: STRING(outbound|80||foo.bar.svc.cluster.local)
     -> upstream_cluster: STRING(outbound|80||foo.bar.svc.cluster.local)
     -> component: STRING(proxy)
     -> opencensus.same_process_as_parent_span: BOOL(false)
     -> service.environment: STRING(dev)
     -> deployment.environment: STRING(dev)
ResourceSpans #1
Resource SchemaURL:
Resource labels:
     -> service.name: STRING(foo.bar
     -> opencensus.starttime: STRING(2022-05-09T14:18:30.616783293Z)
     -> host.name: STRING(foo.bar)
     -> process.pid: INT(32)
     -> telemetry.sdk.version: STRING(0.5.0-dev)
     -> opencensus.exporterversion: STRING(0.5.0-dev)
     -> telemetry.sdk.language: STRING(cpp)

APM should be placing an upserted attribute from the collector properly as though it was coming from the agent.

Screenshot of json document in APM after upsert
Screen Shot 2022-05-09 at 6 13 37 PM

Steps to reproduce:

Please include a minimal but complete recreation of the problem,
including server configuration, agent(s) used, etc. The easier you make it
for us to reproduce it, the more likely that somebody will take the time to
look at it.

  1. Default APM configuration of 7.17.2
  2. Default Otel Collector
  3. Add processor that upserts.

Provide logs (if relevant):

@marclop
Copy link
Contributor

marclop commented May 11, 2022

@knechtionscoding I've been unable to reproduce this using the OTel Go client and sendotlp:

# Checked out the 7.17 branch first, ran "docker-compose up -d; make; ./apm-server -E output.elasticsearch.username=admin -E output.elasticsearch.password=changeme -E apm-server.expvar.enabled=true -e"
$ cd systemtest/cmd/sendotlp
$ export OTEL_RESOURCE_ATTRIBUTES=service.name=test,deployment.environment=dev,service.environment=dev
$ export OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:8200
$ go run main.go
2022-05-11T13:34:47+08:00	INFO	sendotlp/main.go:123	sending OTLP data to http://localhost:8200
2022-05-11T13:34:47+08:00	INFO	sendotlp/main.go:302	exporting metric	{"metric": {"name": "float64_counter", "kind": "CounterInstrumentKind", "start_time": "2022-05-11T13:34:47+08:00", "end_time": "2022-05-11T13:34:47+08:00", "sum": 1}}
2022-05-11T13:34:47+08:00	INFO	sendotlp/main.go:302	exporting metric	{"metric": {"name": "int64_histogram", "kind": "HistogramInstrumentKind", "start_time": "2022-05-11T13:34:47+08:00", "end_time": "2022-05-11T13:34:47+08:00", "boundaries": [1, 10, 100, 1000, 10000], "counts": [0, 1, 2, 3, 0, 0], "sum": 321, "count": 6}}
2022-05-11T13:34:47+08:00	INFO	sendotlp/main.go:259	exporting span	{"span": {"Name":"child1","SpanContext":{"TraceID":"6b31db508906533a573e66e05066ba7a","SpanID":"5ac163346c57b47a","TraceFlags":"01","TraceState":"","Remote":false},"Parent":{"TraceID":"6b31db508906533a573e66e05066ba7a","SpanID":"d0dd4ad39e7fc9c9","TraceFlags":"01","TraceState":"","Remote":false},"SpanKind":1,"StartTime":"2022-05-11T13:34:47.82593+08:00","EndTime":"2022-05-11T13:34:47.836946625+08:00","Attributes":null,"Events":[{"Name":"an arbitrary event","Attributes":null,"DroppedAttributeCount":0,"Time":"2022-05-11T13:34:47.836944+08:00"}],"Links":null,"Status":{"Code":0,"Description":""},"DroppedAttributes":0,"DroppedEvents":0,"DroppedLinks":0,"ChildSpanCount":0,"Resource":[{"Key":"deployment.environment","Value":{"Type":"STRING","Value":"dev"}},{"Key":"service.environment","Value":{"Type":"STRING","Value":"dev"}},{"Key":"service.name","Value":{"Type":"STRING","Value":"test"}},{"Key":"telemetry.sdk.language","Value":{"Type":"STRING","Value":"go"}},{"Key":"telemetry.sdk.name","Value":{"Type":"STRING","Value":"opentelemetry"}},{"Key":"telemetry.sdk.version","Value":{"Type":"STRING","Value":"1.5.0"}}],"InstrumentationLibrary":{"Name":"sendotlp","Version":"","SchemaURL":""}}}
2022-05-11T13:34:47+08:00	INFO	sendotlp/main.go:259	exporting span	{"span": {"Name":"child2","SpanContext":{"TraceID":"6b31db508906533a573e66e05066ba7a","SpanID":"5580e212d9a17e76","TraceFlags":"01","TraceState":"","Remote":false},"Parent":{"TraceID":"6b31db508906533a573e66e05066ba7a","SpanID":"d0dd4ad39e7fc9c9","TraceFlags":"01","TraceState":"","Remote":false},"SpanKind":1,"StartTime":"2022-05-11T13:34:47.847563+08:00","EndTime":"2022-05-11T13:34:47.858493166+08:00","Attributes":null,"Events":[{"Name":"exception","Attributes":[{"Key":"exception.type","Value":{"Type":"STRING","Value":"*errors.errorString"}},{"Key":"exception.message","Value":{"Type":"STRING","Value":"an exception occurred"}}],"DroppedAttributeCount":0,"Time":"2022-05-11T13:34:47.858492+08:00"}],"Links":null,"Status":{"Code":0,"Description":""},"DroppedAttributes":0,"DroppedEvents":0,"DroppedLinks":0,"ChildSpanCount":0,"Resource":[{"Key":"deployment.environment","Value":{"Type":"STRING","Value":"dev"}},{"Key":"service.environment","Value":{"Type":"STRING","Value":"dev"}},{"Key":"service.name","Value":{"Type":"STRING","Value":"test"}},{"Key":"telemetry.sdk.language","Value":{"Type":"STRING","Value":"go"}},{"Key":"telemetry.sdk.name","Value":{"Type":"STRING","Value":"opentelemetry"}},{"Key":"telemetry.sdk.version","Value":{"Type":"STRING","Value":"1.5.0"}}],"InstrumentationLibrary":{"Name":"sendotlp","Version":"","SchemaURL":""}}}
2022-05-11T13:34:47+08:00	INFO	sendotlp/main.go:259	exporting span	{"span": {"Name":"parent","SpanContext":{"TraceID":"6b31db508906533a573e66e05066ba7a","SpanID":"d0dd4ad39e7fc9c9","TraceFlags":"01","TraceState":"","Remote":false},"Parent":{"TraceID":"00000000000000000000000000000000","SpanID":"0000000000000000","TraceFlags":"00","TraceState":"","Remote":false},"SpanKind":1,"StartTime":"2022-05-11T13:34:47.825919+08:00","EndTime":"2022-05-11T13:34:47.859388667+08:00","Attributes":null,"Events":null,"Links":null,"Status":{"Code":0,"Description":""},"DroppedAttributes":0,"DroppedEvents":0,"DroppedLinks":0,"ChildSpanCount":2,"Resource":[{"Key":"deployment.environment","Value":{"Type":"STRING","Value":"dev"}},{"Key":"service.environment","Value":{"Type":"STRING","Value":"dev"}},{"Key":"service.name","Value":{"Type":"STRING","Value":"test"}},{"Key":"telemetry.sdk.language","Value":{"Type":"STRING","Value":"go"}},{"Key":"telemetry.sdk.name","Value":{"Type":"STRING","Value":"opentelemetry"}},{"Key":"telemetry.sdk.version","Value":{"Type":"STRING","Value":"1.5.0"}}],"InstrumentationLibrary":{"Name":"sendotlp","Version":"","SchemaURL":""}}}

It may be specific to the OTEL Collector Attributes, I haven't tested that yet, but will test that afterwards.

image

image

@marclop
Copy link
Contributor

marclop commented May 11, 2022

I was able to reproduce it using the Open Telemetry Collector with the following config:

collector.yml

receivers:
  otlp:
    protocols:
      grpc:

exporters:
  otlp:
    endpoint: http://localhost:8200
    tls:
      insecure: true

  logging:
    loglevel: debug

  file:
    path: ./8074.json

# Mutate the Data
processors:
  attributes:
    actions:
      - key: service.environment
        action: upsert
        value: dev
      - key: deployment.environment
        action: upsert
        value: dev
  batch:
  

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [attributes, batch]
      exporters: [otlp, logging, file]
    metrics:
      receivers: [otlp]
      processors: [attributes, batch]
      exporters: [otlp, logging, file]
$ curl -sLO https://github.com/open-telemetry/opentelemetry-collector-releases/releases/download/v0.50.0/otelcol_0.50.0_darwin_arm64.tar.gz
$ tar -xvzf otelcol_0.50.0_darwin_arm64.tar.gz
$ ./otelcol --config collector.ym

8074.json

{"resourceMetrics":[{"resource":{"attributes":[{"key":"service.name","value":{"stringValue":"unknown_service:main"}},{"key":"telemetry.sdk.language","value":{"stringValue":"go"}},{"key":"telemetry.sdk.name","value":{"stringValue":"opentelemetry"}},{"key":"telemetry.sdk.version","value":{"stringValue":"1.5.0"}}]},"scopeMetrics":[{"scope":{"name":"sendotlp"},"metrics":[{"name":"float64_counter","sum":{"dataPoints":[{"attributes":[{"key":"service.environment","value":{"stringValue":"dev"}},{"key":"deployment.environment","value":{"stringValue":"dev"}}],"startTimeUnixNano":"1652255991901355000","timeUnixNano":"1652255991901379000","asDouble":1}],"aggregationTemporality":"AGGREGATION_TEMPORALITY_CUMULATIVE","isMonotonic":true}},{"name":"int64_histogram","histogram":{"dataPoints":[{"attributes":[{"key":"service.environment","value":{"stringValue":"dev"}},{"key":"deployment.environment","value":{"stringValue":"dev"}}],"startTimeUnixNano":"1652255991901355000","timeUnixNano":"1652255991901379000","count":"6","sum":321,"bucketCounts":["0","1","2","3","0","0"],"explicitBounds":[1,10,100,1000,10000]}],"aggregationTemporality":"AGGREGATION_TEMPORALITY_CUMULATIVE"}}]}],"schemaUrl":"https://opentelemetry.io/schemas/v1.7.0"}]}
{"resourceSpans":[{"resource":{"attributes":[{"key":"service.name","value":{"stringValue":"unknown_service:main"}},{"key":"telemetry.sdk.language","value":{"stringValue":"go"}},{"key":"telemetry.sdk.name","value":{"stringValue":"opentelemetry"}},{"key":"telemetry.sdk.version","value":{"stringValue":"1.5.0"}}]},"scopeSpans":[{"scope":{"name":"sendotlp"},"spans":[{"traceId":"4b1ed5fc81f12713e977692d17efe4e3","spanId":"52ee3a06dc3b4dc5","parentSpanId":"ea781daeafc5aa9b","name":"child1","kind":"SPAN_KIND_INTERNAL","startTimeUnixNano":"1652255991908504000","endTimeUnixNano":"1652255991919529084","attributes":[{"key":"service.environment","value":{"stringValue":"dev"}},{"key":"deployment.environment","value":{"stringValue":"dev"}}],"events":[{"timeUnixNano":"1652255991919527000","name":"an arbitrary event"}],"status":{}}]}],"schemaUrl":"https://opentelemetry.io/schemas/v1.7.0"},{"resource":{"attributes":[{"key":"service.name","value":{"stringValue":"unknown_service:main"}},{"key":"telemetry.sdk.language","value":{"stringValue":"go"}},{"key":"telemetry.sdk.name","value":{"stringValue":"opentelemetry"}},{"key":"telemetry.sdk.version","value":{"stringValue":"1.5.0"}}]},"scopeSpans":[{"scope":{"name":"sendotlp"},"spans":[{"traceId":"4b1ed5fc81f12713e977692d17efe4e3","spanId":"919bbbc4390594ae","parentSpanId":"ea781daeafc5aa9b","name":"child2","kind":"SPAN_KIND_INTERNAL","startTimeUnixNano":"1652255991921207000","endTimeUnixNano":"1652255991931282750","attributes":[{"key":"service.environment","value":{"stringValue":"dev"}},{"key":"deployment.environment","value":{"stringValue":"dev"}}],"events":[{"timeUnixNano":"1652255991931281000","name":"exception","attributes":[{"key":"exception.type","value":{"stringValue":"*errors.errorString"}},{"key":"exception.message","value":{"stringValue":"an exception occurred"}}]}],"status":{}}]}],"schemaUrl":"https://opentelemetry.io/schemas/v1.7.0"},{"resource":{"attributes":[{"key":"service.name","value":{"stringValue":"unknown_service:main"}},{"key":"telemetry.sdk.language","value":{"stringValue":"go"}},{"key":"telemetry.sdk.name","value":{"stringValue":"opentelemetry"}},{"key":"telemetry.sdk.version","value":{"stringValue":"1.5.0"}}]},"scopeSpans":[{"scope":{"name":"sendotlp"},"spans":[{"traceId":"4b1ed5fc81f12713e977692d17efe4e3","spanId":"ea781daeafc5aa9b","parentSpanId":"","name":"parent","kind":"SPAN_KIND_INTERNAL","startTimeUnixNano":"1652255991908496000","endTimeUnixNano":"1652255991931783708","attributes":[{"key":"service.environment","value":{"stringValue":"dev"}},{"key":"deployment.environment","value":{"stringValue":"dev"}}],"status":{}}]}],"schemaUrl":"https://opentelemetry.io/schemas/v1.7.0"}]}

The problem seems to be that we only decode the known attributes from the metadata, but if they're present in the span, we're not adding them as "known" fields:

resource := resourceSpans.Resource()
translateResourceMetadata(resource, &baseEvent)
if exportTimestamp, ok := exportTimestamp(resource); ok {
timeDelta = receiveTimestamp.Sub(exportTimestamp)
}
instrumentationLibrarySpans := resourceSpans.InstrumentationLibrarySpans()
for i := 0; i < instrumentationLibrarySpans.Len(); i++ {
c.convertInstrumentationLibrarySpans(
instrumentationLibrarySpans.At(i), baseEvent, timeDelta, logger, out,
)
}
.

@axw is this the expected behavior? or should we take the well defined span attributes and map them accordingly as well?

@knechtionscoding
Copy link
Author

@marclop does that mean if we use something like: https://istio.io/latest/docs/tasks/observability/distributed-tracing/mesh-and-proxy-config/ to set a custom_tag that maps to deployment.environment apm would properly map it?

@arnestorksen
Copy link

arnestorksen commented May 11, 2022

@knechtionscoding I did the same thing as you, but I discovered that you may set resource-level attributes using the resource processor.

Using this processor configuration, I got my environment property to show up in Elastic APM.

    processors:
      resource:
        attributes:
          - key: deployment.environment
            action: insert
            value: test

https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/processor/resourceprocessor

@knechtionscoding
Copy link
Author

@arnestorksen That does work, and I appreciate the help! Tackar! I'd still be curious to know why the resource.attributes processor is different than the attributes processor.

@marclop
Copy link
Contributor

marclop commented May 12, 2022

@knechtionscoding I don't know what the resulting OTLP trace format will look like when using the Istio custom_tags. If they're set at the resource level, then they should be parsed, if not, then they will be treated as labels.

@arnestorksen Thanks for providing the example processor, I did test it locally and the resource attributes were mapped correctly.

@axw
Copy link
Member

axw commented May 16, 2022

@axw is this the expected behavior? or should we take the well defined span attributes and map them accordingly as well?

It is expected. These are attributes that describe the thing (service) that produced the events (traces, metrics, ...); they're not describing the events themselves. See also https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/overview.md#resources

We should probably document this.

@marclop
Copy link
Contributor

marclop commented May 20, 2022

should we repurpose this issue as a doc issue and document the expected behavior?

@axw
Copy link
Member

axw commented May 20, 2022

@marclop I'd rather preserve the description in this issue, so I've created #8173 to follow up on docs. As things are working as expected here, I'll close this issue.

@axw axw closed this as completed May 20, 2022
@nazarmubeen
Copy link

Hi @marclop , i need to send sample custom test metric data to APM server directly. what is the better approach to do it ?

after starting collector i get below :-
image

i am sending data from postman to this host

image

however still not able to visualize the data in elastic.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

5 participants