
[BUG] gohai logs error "[Debug] Error fetching info for pid 1" even when log level is >= info, in otel collector datadog exporter #21487

Open
ringerc opened this issue Dec 11, 2023 · 5 comments
Labels
team/opentelemetry OpenTelemetry team

Comments

ringerc commented Dec 11, 2023

https://github.com/DataDog/datadog-agent/tree/main/pkg/gohai emits an error like

1702327020251532308 [Debug] Error fetching info for pid 1: user: unknown userid 10001

... when invoked by the OpenTelemetry Collector Datadog Exporter's hostmetrics collector on startup. This is because there is no /etc/passwd, NSS service, etc. in the container; it's a bare-bones, OS-less container.

The Datadog exporter doesn't seem to use a logging adapter to send logs to the collector's log sink, so this message is emitted irrespective of log level. It also ignores the collector's configured log format, emitting non-JSON logs when the collector is configured for JSON logging. And it's unnecessary, meaningless noise.

The message comes from

log.Debugf("Error fetching info for pid %d: %w", pid, err)

I'm not immediately sure where the returned err is transformed into a log message by the wrong logging adapter; I didn't dig that far.

If this message is necessary at all, it should:

  • not use the word "error"
  • use a logging adapter passed in to gohai from the calling framework, so it goes through the proper logging path

See open-telemetry/opentelemetry-collector-contrib#14186 for details.

Agent Environment

N/A; this is about the OpenTelemetry Collector Datadog Exporter (which is managed by Datadog) running the gohai packages.

Describe what happened:

Annoying log message on every startup, at all log levels. This message should NOT be emitted, given that my collector's configured log level is:

service:
  telemetry:
    logs:
      encoding: "json"
      level: "info"

Describe what you expected:

The message should not be emitted at all at levels above debug.

When debug-level logs are enabled, the message should be emitted with proper JSON wrapping and a caller context to identify where it came from.

Steps to reproduce the issue:

Run the example opentelemetry collector config provided by Datadog using the otel/opentelemetry-collector-contrib:0.90.1 image. Check the logs. Observe the error.

Additional environment details (Operating System, Cloud provider, etc):

N/A, you'll see this in docker or k8s or anywhere really.

@ruben-chainalysis

Getting a different log line on otel/opentelemetry-collector-contrib:0.100.1:

[Debug] Error fetching info for pid 1: %!w(*fs.PathError=&{open /etc/passwd 2})

This comes up as the last log line after pods start up, nothing logged after it. Can be quite confusing.


r0fls commented Oct 9, 2024

Knowing the cause/fix here would be useful. I'm seeing this as well with version 0.94.0 (quite old, I know... will look to update)


r0fls commented Oct 9, 2024

It sounds like this is a red herring, though, and not an issue with the exporter.


iress-ac commented Nov 27, 2024

Still seeing this on version 0.109.0 of the otel/opentelemetry-collector-contrib chart. Like the original issue says, it's a confusing log line because you don't know where it's coming from, and it's in the wrong format.

Example log lines from the otel collector pod:

2024-11-27T10:32:50.709Z    info    clientutil/api.go:45    API key validation successful.    {"kind": "exporter", "data_type": "metrics", "name": "datadog"}
1732703578849944173 [Debug] Error fetching info for pid 1: %!w(*fs.PathError=&{open /etc/passwd 2})
2024-11-27T11:02:49.352Z    info    [email protected]/reporter.go:204    Sending host metadata    {"kind": "exporter", "data_type": "traces", "name": "datadog", "host": "ip-xx-xx-x-xxx.region.compute.internal"}

@jackgopack4
Contributor

FWIW, I'm seeing this as well in version 0.114.0, running the opentelemetry-collector-contrib Docker image on Darwin/arm64. The issue occurs if the datadog exporter is enabled in the metrics pipeline and either the prometheus receiver, the hostmetrics receiver, or both are enabled.
If both the prometheus and hostmetrics receivers are disabled, the problem does not occur with this setup. I will take a look at this further when I get a chance. Thanks for reporting.
