Otel Contrib Error: Error: failed to get config: cannot unmarshal the configuration: 1 error(s) decoding: #31290

Closed
bhupenbisht opened this issue Feb 16, 2024 · 10 comments

@bhupenbisht

Component(s)

cmd/otelcontribcol

Describe the issue you're reporting

I am getting the error below while running the OTel contrib collector on Windows and Linux servers. I made some changes to collect process details along with CPU and memory. The error and config details are below.

Error: failed to get config: cannot unmarshal the configuration: 1 error(s) decoding:

  • error decoding 'receivers': error reading configuration for "hostmetrics": error reading settings for scraper type "process": 1 error(s) decoding:

  • 'metrics' expected a map, got 'slice'
2024/02/16 10:37:46 collector server run finished with error: failed to get config: cannot unmarshal the configuration: 1 error(s) decoding:

  • error decoding 'receivers': error reading configuration for "hostmetrics": error reading settings for scraper type "process": 1 error(s) decoding:

  • 'metrics' expected a map, got 'slice'
============================config yml===========================

exporters:
otlphttp:
endpoint: "https://xxxxxxx"

logging:
loglevel: info

receivers:
hostmetrics:
collection_interval: 10s
scrapers:
memory: {}
cpu: {}
load: {}
disk: {}
filesystem: {}
network: {}
paging: {}
processes:
metrics:
system.processes.created:
enabled: true
system.processes.count:
enabled: true
process:
metrics:
- name: process.cpu.time
resource_attributes:
process.executable.name: "{{ .name }}"
- name: process.memory.usage
resource_attributes:
process.executable.name: "{{ .name }}"

processors:
resourcedetection:
detectors:
- env
- system

attributes:
actions:
- key: host_id
value: "{COMPUTERNAME}"
action: insert

batch:
send_batch_size: 8192
timeout: 200ms

service:
pipelines:
metrics:
receivers:
- hostmetrics
processors:
- attributes
- resourcedetection
- batch
exporters:
- logging
- otlphttp

bhupenbisht added the "needs triage" label Feb 16, 2024
@andrzej-stencel
Member

Hey @bhupenbisht, you've pasted in your configuration without formatting, so I can only guess that the configuration you wanted was something like this:

exporters:
  otlphttp:
    endpoint: "https://xxxxxxx"

  logging:
    loglevel: info

receivers:
  hostmetrics:
    collection_interval: 10s
    scrapers:
      memory: {}
      cpu: {}
      load: {}
      disk: {}
      filesystem: {}
      network: {}
      paging: {}
      processes:
        metrics:
          system.processes.created:
            enabled: true
          system.processes.count:
            enabled: true
      process:
        metrics:
        - name: process.cpu.time
          resource_attributes:
            process.executable.name: "{{ .name }}"
        - name: process.memory.usage
          resource_attributes:
            process.executable.name: "{{ .name }}"

processors:
  resourcedetection:
    detectors:
    - env
    - system

  attributes:
    actions:
    - key: host_id
      value: "{COMPUTERNAME}"
      action: insert

  batch:
    send_batch_size: 8192
    timeout: 200ms

service:
  pipelines:
    metrics:
      receivers:
      - hostmetrics
      processors:
      - attributes
      - resourcedetection
      - batch
      exporters:
      - logging
      - otlphttp

Let me know if this is what you were trying to run.

The configuration under receivers.hostmetrics.scrapers.process.metrics is incorrect. As you've correctly specified in the receivers.hostmetrics.scrapers.processes config, the metrics property is a map keyed by metric name, not a slice (a list of "- name: ..." items).
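
To illustrate the map form, here is a minimal sketch of the process scraper's metrics section, assuming the goal is simply to make sure the two metrics named in the original config are emitted. Each metric name is a key, and its settings (here only enabled) sit underneath it:

```yaml
receivers:
  hostmetrics:
    collection_interval: 10s
    scrapers:
      process:
        # "metrics" is a map: metric name -> settings, not a list of items.
        metrics:
          process.cpu.time:
            enabled: true
          process.memory.usage:
            enabled: true
```

This mirrors the shape already used under the processes scraper in the same config.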

The other thing is the contents of the resource_attributes entry - process.executable.name: {{ .name }}. I'm not sure what you are trying to achieve. Do you want to only get metrics about a process that has a specific executable name? You could try to use Process scraper's include property for this.

Here's a proposed working configuration, feel free to adjust it to your needs or ask further questions.

exporters:
  otlphttp:
    endpoint: "https://xxxxxxx"

  logging:
    loglevel: info

receivers:
  hostmetrics:
    collection_interval: 10s
    scrapers:
      memory: {}
      cpu: {}
      load: {}
      disk: {}
      filesystem: {}
      network: {}
      paging: {}
      processes:
        metrics:
          system.processes.created:
            enabled: true
          system.processes.count:
            enabled: true
      process:
        include:
          names:
          - {{ .name }}
          match_type: regexp # or "strict"

processors:
  resourcedetection:
    detectors:
    - env
    - system

  attributes:
    actions:
    - key: host_id
      value: "{COMPUTERNAME}"
      action: insert

  batch:
    send_batch_size: 8192
    timeout: 200ms

service:
  pipelines:
    metrics:
      receivers:
      - hostmetrics
      processors:
      - attributes
      - resourcedetection
      - batch
      exporters:
      - logging
      - otlphttp

andrzej-stencel removed the "needs triage" label Feb 16, 2024
@bhupenbisht
Author

Hi @astencel-sumo, thanks a lot for your response.

My requirement: I am trying to get process name details along with CPU and memory usage.

I made the changes as you suggested, but I am getting the error below.

D:\otel>otelcol-contrib.exe --config config.yaml
Error: failed to resolve config: cannot resolve the configuration: cannot retrieve the configuration: yaml: invalid map key: map[string]interface {}{".name":interface {}(nil)}
2024/02/16 18:47:56 collector server run finished with error: failed to resolve config: cannot resolve the configuration: cannot retrieve the configuration: yaml: invalid map key: map[string]interface {}{".name":interface {}(nil)}

If I modify it again as below, then it does not scrape process metrics for the server.

process:
  include:
    names:
    - process.executable.name
    match_type: regexp # or "strict"

@andrzej-stencel
Member

andrzej-stencel commented Feb 16, 2024

I'm not sure what the problem is. Can you try this:

exporters:
  debug:
    verbosity: detailed
receivers:
  hostmetrics:
    scrapers:
      process:
        include:
          match_type: regexp
          names:
          - otel
        mute_process_exe_error: true
        mute_process_name_error: true
service:
  pipelines:
    metrics:
      exporters:
      - debug
      receivers:
      - hostmetrics

You should get process metrics for processes that match the "otel" regular expression. You should get the process name in the resource attribute process.executable.name. Here's the result on my machine:

C:\>otelcol-contrib_0.94.0_windows_amd64.exe --config .\0216-otc-hostmetrics.yaml
2024-02-16T20:40:27.091+0100    info    [email protected]/telemetry.go:59 Setting up own telemetry...
2024-02-16T20:40:27.091+0100    info    [email protected]/telemetry.go:104        Serving metrics {"address": ":8888", "level": "Basic"}
2024-02-16T20:40:27.092+0100    info    [email protected]/exporter.go:275        Development component. May change in the future.        {"kind": "exporter", "data_type": "metrics", "name": "debug"}
2024-02-16T20:40:27.094+0100    info    [email protected]/service.go:140  Starting otelcol-contrib...     {"Version": "0.94.0", "NumCPU": 8}
2024-02-16T20:40:27.094+0100    info    extensions/extensions.go:34     Starting extensions...
2024-02-16T20:40:27.095+0100    info    [email protected]/service.go:166  Everything is ready. Begin running and processing data.
2024-02-16T20:40:27.095+0100    warn    localhostgate/featuregate.go:63 The default endpoints for all servers in components will change to use localhost instead of 0.0.0.0 in a future version. Use the feature gate to preview the new default.   {"feature gate ID": "component.UseLocalHostAsDefaultHost"}
2024-02-16T20:40:28.169+0100    info    MetricsExporter {"kind": "exporter", "data_type": "metrics", "name": "debug", "resource metrics": 1, "metrics": 4, "data points": 6}
2024-02-16T20:40:28.170+0100    info    ResourceMetrics #0
Resource SchemaURL: https://opentelemetry.io/schemas/1.9.0
Resource attributes:
     -> process.pid: Int(26284)
     -> process.parent_pid: Int(18128)
     -> process.executable.name: Str(otelcol-contrib_0.94.0_windows_amd64.exe)
     -> process.executable.path: Str(C:\otelcol-contrib_0.94.0_windows_amd64.exe)
     -> process.command: Str("C:\otelcol-contrib_0.94.0_windows_amd64.exe")
     -> process.command_line: Str("C:\otelcol-contrib_0.94.0_windows_amd64.exe" --config .\0216-otc-hostmetrics.yaml)
     -> process.owner: Str(ANSTMATE\andrz)
ScopeMetrics #0
ScopeMetrics SchemaURL:
InstrumentationScope otelcol/hostmetricsreceiver/process 0.94.0
Metric #0
Descriptor:
     -> Name: process.cpu.time
     -> Description: Total CPU seconds broken down by different states.
     -> Unit: s
     -> DataType: Sum
     -> IsMonotonic: true
     -> AggregationTemporality: Cumulative
NumberDataPoints #0
Data point attributes:
     -> state: Str(user)
StartTimestamp: 2024-02-16 19:40:24.842 +0000 UTC
Timestamp: 2024-02-16 19:40:28.169173 +0000 UTC
Value: 0.515625
NumberDataPoints #1
Data point attributes:
     -> state: Str(system)
StartTimestamp: 2024-02-16 19:40:24.842 +0000 UTC
Timestamp: 2024-02-16 19:40:28.169173 +0000 UTC
Value: 0.687500
Metric #1
Descriptor:
     -> Name: process.disk.io
     -> Description: Disk bytes transferred.
     -> Unit: By
     -> DataType: Sum
     -> IsMonotonic: true
     -> AggregationTemporality: Cumulative
NumberDataPoints #0
Data point attributes:
     -> direction: Str(read)
StartTimestamp: 2024-02-16 19:40:24.842 +0000 UTC
Timestamp: 2024-02-16 19:40:28.169173 +0000 UTC
Value: 1189724
NumberDataPoints #1
Data point attributes:
     -> direction: Str(write)
StartTimestamp: 2024-02-16 19:40:24.842 +0000 UTC
Timestamp: 2024-02-16 19:40:28.169173 +0000 UTC
Value: 160
Metric #2
Descriptor:
     -> Name: process.memory.usage
     -> Description: The amount of physical memory in use.
     -> Unit: By
     -> DataType: Sum
     -> IsMonotonic: false
     -> AggregationTemporality: Cumulative
NumberDataPoints #0
StartTimestamp: 2024-02-16 19:40:24.842 +0000 UTC
Timestamp: 2024-02-16 19:40:28.169173 +0000 UTC
Value: 116998144
Metric #3
Descriptor:
     -> Name: process.memory.virtual
     -> Description: Virtual memory size.
     -> Unit: By
     -> DataType: Sum
     -> IsMonotonic: false
     -> AggregationTemporality: Cumulative
NumberDataPoints #0
StartTimestamp: 2024-02-16 19:40:24.842 +0000 UTC
Timestamp: 2024-02-16 19:40:28.169173 +0000 UTC
Value: 82403328
        {"kind": "exporter", "data_type": "metrics", "name": "debug"}

Note the lines:

     -> process.executable.name: Str(otelcol-contrib_0.94.0_windows_amd64.exe)
     -> process.executable.path: Str(C:\otelcol-contrib_0.94.0_windows_amd64.exe)
     -> process.command: Str("C:\otelcol-contrib_0.94.0_windows_amd64.exe")
     -> process.command_line: Str("C:\otelcol-contrib_0.94.0_windows_amd64.exe" --config .\0216-otc-hostmetrics.yaml)

These resource attributes have the executable name in various forms.
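
As a side note on the earlier "invalid map key" error: an unquoted {{ .name }} is read by the YAML parser as a flow mapping nested inside another flow mapping, so the list item becomes a map whose key is itself a map, which matches the message map[string]interface {}{".name":interface {}(nil)}. The names list expects plain strings. A minimal sketch of an include block that parses, assuming the intent is to match processes by a regular expression on the executable name (the two patterns are hypothetical placeholders, not taken from the thread):

```yaml
receivers:
  hostmetrics:
    scrapers:
      process:
        include:
          match_type: regexp # or "strict" for exact names
          names:
          # Quoted strings, so YAML does not treat braces or other
          # special characters as the start of a mapping.
          - "otelcol.*"
          - "sqlservr.*"
        mute_process_exe_error: true
        mute_process_name_error: true
```

The executable name itself does not need to be written into the config as a resource attribute; as the output above shows, the scraper attaches process.executable.name on its own.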

@bhupenbisht
Author

Hi @astencel-sumo, I am getting the process name now, but the only problem is that these processes are not showing in our Prometheus endpoint. We get overall process usage, but not per-process details.

@andrzej-stencel
Member

but the only problem is that these processes are not showing in our Prometheus endpoint

How do you send data from the Otel Collector to Prometheus?

If you use the Prometheus exporter, make sure to set resource_to_telemetry_conversion.enabled to true in the exporter's configuration. Without it, resource attributes such as process.executable.name are not turned into metric labels.
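
For the Prometheus exporter case, a minimal sketch of that setting could look like the following. The listen address 0.0.0.0:8889 is an assumption chosen for illustration; use whatever address your Prometheus server scrapes:

```yaml
exporters:
  prometheus:
    # Assumed scrape address; adjust to your environment.
    endpoint: "0.0.0.0:8889"
    # Copy resource attributes (process.executable.name, process.pid, ...)
    # onto each metric as labels.
    resource_to_telemetry_conversion:
      enabled: true

service:
  pipelines:
    metrics:
      receivers:
      - hostmetrics
      exporters:
      - prometheus
```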

@bhupenbisht
Author

bhupenbisht commented Feb 19, 2024

[Attached screenshots: config1, config2, config3]

I am using otel-contrib to send data to a Prometheus endpoint.

@andrzej-stencel
Member

This looks like you use the OTLP/HTTP exporter, meaning that on the Prometheus side, you are using the experimental OTLP Receiver feature of Prometheus. I myself haven't used this feature yet.

The endpoint in your configuration, https://myikp/Otel, does not look like Prometheus's OTLP receiver endpoint, which according to the docs is exposed at the path /api/v1/otlp/v1/metrics. Is this myikp endpoint an intermediary between the Otelcol and Prometheus? I'm trying to understand the situation a bit more.
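
For comparison, a sketch of what sending directly to Prometheus's OTLP receiver might look like, assuming no intermediary. The host prometheus.example and port 9090 are hypothetical, and the OTLP receiver has to be enabled on the Prometheus side:

```yaml
exporters:
  otlphttp:
    # Full URL for metrics, including the path from the Prometheus docs.
    # Host and port are placeholders.
    metrics_endpoint: "http://prometheus.example:9090/api/v1/otlp/v1/metrics"

service:
  pipelines:
    metrics:
      receivers:
      - hostmetrics
      exporters:
      - otlphttp
```

Setting metrics_endpoint gives the complete URL explicitly; with the plain endpoint setting, the exporter appends /v1/metrics to the base URL itself.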

@bhupenbisht
Author

@astencel-sumo Yes, you are correct: we are using the OTLP receiver feature, and the myikp endpoint is an intermediary between the otelcol and Prometheus.

@andrzej-stencel
Member

In this case, my guess is that the problem might be on Prometheus side. I'm pretty sure the OTLP/HTTP exporter exports the data correctly. If the data looks the way you want in the debug exporter, I'd assume the Otelcol side is OK.

I'd advise raising an issue in Prometheus.
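
One way to run that check without changing what is sent downstream is to list the debug exporter next to the OTLP/HTTP exporter in the same pipeline. A minimal sketch, reusing the placeholder endpoint from the original config and the "otel" name filter from the earlier example:

```yaml
exporters:
  debug:
    verbosity: detailed
  otlphttp:
    endpoint: "https://xxxxxxx"

receivers:
  hostmetrics:
    scrapers:
      process:
        include:
          match_type: regexp
          names:
          - otel

service:
  pipelines:
    metrics:
      receivers:
      - hostmetrics
      exporters:
      # Both exporters receive the same data, so the console output
      # shows what is handed to the OTLP/HTTP exporter.
      - debug
      - otlphttp
```

If the per-process resource attributes show up in the debug output but not in Prometheus, that points at the intermediary or the Prometheus side rather than the collector.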

@bhupenbisht
Author

Thanks a lot, astencel, for your support and help. I will check with the support team.
