Collector cannot export metrics telemetry in an ipv6-only environment #10011

Closed
lpetrazickisupgrade opened this issue Apr 22, 2024 · 3 comments · Fixed by #10343
Labels
bug (Something isn't working), collector-telemetry (healthchecker and other telemetry collection issues)


lpetrazickisupgrade commented Apr 22, 2024

Describe the bug

  1. Expose the OpenTelemetry Collector's own telemetry metrics on an IPv6 address that is correctly wrapped in square brackets
  2. The Collector strips the brackets from the IP address and naively concatenates it with the port number
  3. Startup fails with a "too many colons in address" error
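The failure mode described in the steps above can be sketched with Go's standard `net` package. This is a minimal illustration under assumptions, not the Collector's actual code: `naiveJoin` and `parses` are hypothetical helpers introduced here, and the address is the one from the error message in this report.

```go
package main

import (
	"fmt"
	"net"
)

// naiveJoin is a hypothetical helper that mimics the suspected bug:
// the host is glued to the port with a bare colon, so an IPv6 literal
// ends up without its square brackets.
func naiveJoin(host, port string) string {
	return host + ":" + port
}

// parses reports whether addr is a syntactically valid host:port pair.
func parses(addr string) bool {
	_, _, err := net.SplitHostPort(addr)
	return err == nil
}

func main() {
	// The configured value, correctly bracketed.
	host, port, err := net.SplitHostPort("[dead:beef:dead:beef:dead::beef]:8888")
	if err != nil {
		panic(err)
	}
	// SplitHostPort returns the host without brackets.
	fmt.Println("host:", host, "port:", port)

	bad := naiveJoin(host, port)
	fmt.Println(bad, "parses:", parses(bad))

	// net.JoinHostPort re-adds the brackets when the host contains a colon.
	good := net.JoinHostPort(host, port)
	fmt.Println(good, "parses:", parses(good))
}
```

Rebuilding the address with `net.JoinHostPort` rather than string concatenation keeps the brackets for IPv6 hosts and leaves IPv4 addresses and hostnames unchanged.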

Steps to reproduce

  1. Delimit the ipv6 address with square brackets:
service:
  telemetry:
    logs:
      encoding: json
    metrics:
      address: '[${env:MY_POD_IP}]:8888'
  2. Deploy the config to an ipv6-only environment
  3. The Collector fails with: listen tcp: address dead:beef:dead:beef:dead::beef:8888: too many colons in address

What did you expect to see?
Metrics on port 8888

What did you see instead?

{
  "level": "error",
  "ts": 1713554862.2179377,
  "caller": "otelcol@v0.98.0/collector.go:275",
  "msg": "Asynchronous error received, terminating process",
  "error": "listen tcp: address dead:beef:dead:beef:dead::beef:8888: too many colons in address",
  "stacktrace": "
go.opentelemetry.io/collector/otelcol.(*Collector).Run
    go.opentelemetry.io/collector/otelcol@v0.98.0/collector.go:275
go.opentelemetry.io/collector/otelcol.NewCommand.func1
    go.opentelemetry.io/collector/otelcol@v0.98.0/command.go:35
github.com/spf13/cobra.(*Command).execute
    github.com/spf13/[email protected]/command.go:983
github.com/spf13/cobra.(*Command).ExecuteC
    github.com/spf13/[email protected]/command.go:1115
github.com/spf13/cobra.(*Command).Execute
    github.com/spf13/[email protected]/command.go:1039
main.runInteractive
    github.com/open-telemetry/opentelemetry-collector-releases/contrib/main.go:27
main.run
    github.com/open-telemetry/opentelemetry-collector-releases/contrib/main_others.go:10
main.main
    github.com/open-telemetry/opentelemetry-collector-releases/contrib/main.go:20
runtime.main
    runtime/proc.go:271"
}
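The `listen tcp: ... too many colons in address` text in the log above has the shape of an error returned by Go's `net.Listen` when it is given an unbracketed IPv6 address with a port appended. The sketch below reproduces that class of error outside the Collector. `failsWithTooManyColons` is a hypothetical helper, and the assumption that the Collector ends up passing exactly this string to a listener is inferred from the log rather than confirmed from its source.

```go
package main

import (
	"fmt"
	"net"
	"strings"
)

// failsWithTooManyColons tries to open a TCP listener on addr and reports
// whether it fails with the "too many colons" address error.
// (Hypothetical helper for this illustration.)
func failsWithTooManyColons(addr string) bool {
	ln, err := net.Listen("tcp", addr)
	if err != nil {
		fmt.Println(err)
		return strings.Contains(err.Error(), "too many colons in address")
	}
	// Listening unexpectedly succeeded; release the socket.
	ln.Close()
	return false
}

func main() {
	// The address is rejected while it is being parsed, before any socket
	// is opened, so this does not require an IPv6-capable host.
	fmt.Println(failsWithTooManyColons("dead:beef:dead:beef:dead::beef:8888"))
}
```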

What version did you use?
v0.98.0

What config did you use?

service:
  telemetry:
    logs:
      encoding: json
    metrics:
      address: '[${env:MY_POD_IP}]:8888'

Environment
helm.sh/chart: opentelemetry-collector-0.87.2
Image: opentelemetry-collector-contrib:0.98.0
Kubernetes: v1.29.1-eks-b9c9ed7

Additional context
This is a regression. v0.79.0 did not have this issue.

lpetrazickisupgrade added the bug label Apr 22, 2024
@TylerHelmuth
Member

@lpetrazickisupgrade I am curious whether the issue is with the collector serving the metrics or with the prometheus receiver scraping them. Can you reproduce the issue without a prometheus receiver trying to scrape?

TylerHelmuth added the collector-telemetry label Apr 22, 2024
@TylerHelmuth
Member

Most likely though this is a bug from switching to using the OTel Go SDK instead of opencensus.

/cc @codeboten

Author

lpetrazickisupgrade commented Apr 22, 2024

@TylerHelmuth Thanks for taking a look! I think the OpenTelemetry Collector process is crashing at startup while parsing the config. The pod is in a CrashLoopBackOff. It doesn't get far enough in the startup sequence to respond to network requests. The log message I've included is the only one it emits.

I think the regression may have been introduced by this PR: https://github.com/open-telemetry/opentelemetry-collector/pull/9632/files

Because the otlp exporter reuses the grpc client config: https://github.com/open-telemetry/opentelemetry-collector/blame/v0.98.0/exporter/otlpexporter/config.go#L25

dmitryax pushed a commit that referenced this issue Jun 28, 2024
…10343)

#### Description
Fixes a bug where the latest version of otel-collector fails to start
when the service telemetry metrics endpoint is an ipv6 address.

This problem began to occur after
#9037 with
the feature gate flag enabled was merged. It is probably an
implementation omission, because the enabled code path, which was
originally added by
#7871, is
marked as WIP.

You can reproduce the issue with the following config and the environment
variable `MY_POD_IP=::1`.
```yaml
service:
  telemetry:
    logs:
      encoding: json
    metrics:
      address: '[${env:MY_POD_IP}]:8888'
```
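To make the reproduction concrete, the sketch below expands the `${env:MY_POD_IP}` placeholder and then normalizes the result. It is an illustration under assumptions, not the change made in this commit: `expandWith` is a simplified, hypothetical stand-in for the Collector's environment-variable expansion, and `normalize` is a hypothetical helper showing one way to keep IPv6 brackets intact with the standard library.

```go
package main

import (
	"fmt"
	"net"
	"os"
	"strings"
)

// expandWith replaces ${env:NAME} placeholders in s using lookup.
// It is a simplified approximation built on os.Expand, NOT the
// Collector's real config resolver.
func expandWith(s string, lookup func(string) string) string {
	return os.Expand(s, func(name string) string {
		return lookup(strings.TrimPrefix(name, "env:"))
	})
}

// normalize splits addr into host and port and rebuilds it with
// net.JoinHostPort, which brackets hosts that contain a colon.
func normalize(addr string) (string, error) {
	host, port, err := net.SplitHostPort(addr)
	if err != nil {
		return "", err
	}
	return net.JoinHostPort(host, port), nil
}

func main() {
	os.Setenv("MY_POD_IP", "::1")

	addr := expandWith("[${env:MY_POD_IP}]:8888", os.Getenv)
	fmt.Println("expanded:", addr)

	normalized, err := normalize(addr)
	fmt.Println("normalized:", normalized, "err:", err)

	// Without the brackets the same host and port cannot be parsed.
	_, err = normalize("::1:8888")
	fmt.Println("unbracketed err:", err)
}
```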

#### Link to tracking issue
Fixes
#10011

---------

Co-authored-by: Tyler Helmuth <[email protected]>