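Host operating system: output of uname -a
Linux lga-kubnode470 4.4.205-1.el7.elrepo.x86_64 #1 SMP Fri Nov 29 10:10:01 EST 2019 x86_64 x86_64 x86_64 GNU/Linux
node_exporter version: output of node_exporter --version
node_exporter, version 0.18.0 (branch: HEAD, revision: f97f01c)
build user: root@77cb1854c0b0
build date: 20190509-23:12:18
go version: go1.12.5
node_exporter command line flags
Are you running node_exporter in Docker?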
Node exporter is running in Docker in a Kubernetes cluster:
image: "prom/node-exporter:v0.18.0"
The node exporter DaemonSet manifest was created from the stable Helm chart.
There are 3 volumes:
hostPath: /proc -> container (read-only): /host/proc
hostPath: /sys -> container (read-only): /host/sys
hostPath: /var/log/pulsepoint/prometheus -> container: /var/log/pulsepoint/prometheus (used for textfile collector metrics)
No Kubernetes security context is set.
hostNetwork: true
hostPID: true
What did you do that produced an error?
We are running a Kubernetes cluster with kube-router-based networking, so we make heavy use of IPVS.
At some point we started to observe a significant number of errors from all node_exporter instances about duplicate IPVS metrics (starting and peaking at 7:00 AM and then fading within an hour):
time="2019-12-19T12:25:10Z" level=error msg="
error gathering metrics: 324 error(s) occurred:
* [from Gatherer #2] collected metric "node_ipvs_backend_connections_active" { label:<name:"local_address" value:"10.203.128.184" > label:<name:"local_port" value:"9001" > label:<name:"proto" value:"TCP" > label:<name:"remote_address" value:"10.204.57.184" > label:<name:"remote_port" value:"9001" > gauge:<value:0 > } was collected before with the same name and label values
...
" source="log.go:172"
We see many errors of this type for node_ipvs_backend_weight, node_ipvs_backend_connections_active, and node_ipvs_backend_connections_inactive.
The scrape interval is 15 seconds and the cluster has 500 nodes.
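This error message comes from the Prometheus Go client: Registry.Gather rejects a scrape in which a collector sends two metrics with the same name and label values. A minimal sketch that reproduces the same error outside node_exporter, using label values copied from the log above (this is not node_exporter's actual collector code):

```go
package main

import (
	"fmt"

	"github.com/prometheus/client_golang/prometheus"
)

// dupCollector emits the same gauge twice with identical label values,
// mimicking what happens when the kernel's IPVS table contains duplicate
// backend records.
type dupCollector struct {
	desc *prometheus.Desc
}

func (c dupCollector) Describe(ch chan<- *prometheus.Desc) { ch <- c.desc }

func (c dupCollector) Collect(ch chan<- prometheus.Metric) {
	for i := 0; i < 2; i++ { // two metrics with the same name and label values
		ch <- prometheus.MustNewConstMetric(c.desc, prometheus.GaugeValue, 0,
			"10.203.128.184", "9001", "TCP", "10.204.57.184", "9001")
	}
}

func main() {
	desc := prometheus.NewDesc(
		"node_ipvs_backend_connections_active",
		"Example help text for the duplicated series.",
		[]string{"local_address", "local_port", "proto", "remote_address", "remote_port"},
		nil,
	)
	reg := prometheus.NewRegistry()
	reg.MustRegister(dupCollector{desc: desc})

	// Gather fails with "... was collected before with the same name and
	// label values", which is the error node_exporter logs on each scrape.
	if _, err := reg.Gather(); err != nil {
		fmt.Println(err)
	}
}
```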
What did you expect to see?
No errors
What did you see instead?
A burst of errors starting at a specific time, always with the same pattern: starting at 7:00 AM and fading within an hour.
The issue is reproduced under some circumstances. I'll try to download /metrics. But yes, the lines are not unique, and the reason is that the /proc/net/ip_vs_stats records are not unique either. That is caused by the fact that IPVS for a specific service is reconfigured very often (because there are many pods and they change their ready state at the same moment).
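One way to check for such duplicates is to scan the kernel's IPVS table directly. A rough diagnostic sketch, assuming the backend records in question follow the usual /proc/net/ip_vs layout (inside the node_exporter container the path would be /host/proc/net/ip_vs); this is not part of node_exporter:

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"strings"
)

// Reports (virtual service, backend) pairs that appear more than once in
// /proc/net/ip_vs. Duplicate pairs are what would make an IPVS collector
// emit duplicate node_ipvs_backend_* series.
func main() {
	path := "/proc/net/ip_vs" // /host/proc/net/ip_vs inside the container
	f, err := os.Open(path)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	defer f.Close()

	seen := map[string]int{}
	var service string

	scanner := bufio.NewScanner(f)
	for scanner.Scan() {
		fields := strings.Fields(scanner.Text())
		switch {
		case len(fields) >= 2 && (fields[0] == "TCP" || fields[0] == "UDP" || fields[0] == "FWM" || fields[0] == "SCTP"):
			// Start of a new virtual service block, e.g. "TCP 0A80B8C8:2328 rr".
			service = fields[0] + " " + fields[1]
		case len(fields) >= 2 && fields[0] == "->" && fields[1] != "RemoteAddress:Port" && service != "":
			// Backend line, e.g. "-> 0ACC39B8:2328 Masq 1 0 0".
			seen[service+" -> "+fields[1]]++
		}
	}
	for pair, n := range seen {
		if n > 1 {
			fmt.Printf("duplicate backend record (%dx): %s\n", n, pair)
		}
	}
}
```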
Constraining to one the number of CPU threads that the various concurrent operations can be scheduled on is very likely to have already resolved this.
Two cores serving (non-atomic) update requests at the same time may lead to the same timestamp being generated for two metrics; however, with GOMAXPROCS now defaulting to one, I believe this shouldn't happen anymore.
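A minimal illustration of the constraint referred to above, assuming it is applied via the Go runtime (this is not the actual node_exporter code):

```go
package main

import (
	"fmt"
	"runtime"
)

func main() {
	// Limit Go code to one CPU executing at a time, so concurrent metric
	// updates cannot race across cores.
	runtime.GOMAXPROCS(1)
	fmt.Println("GOMAXPROCS:", runtime.GOMAXPROCS(0)) // passing 0 queries the current value
}
```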