v32.0 - Failed to get system UUID: open /etc/machine-id: no such file or directory (Centos 7.6) #2157
Comments
Can you be more specific when you say "nothing works"? Does the kubelet simply log that and then exit, or does it hang somewhere, or does it just never report any container metrics? |
@dashpole I get that single line of logging and it's DOA until I stop the container. I cannot access the webui at all. |
I have looked through the changes between those versions, but can't see anything immediately that would affect this. This must be hanging in the The only thing that happens between then is reading the machine id (/etc/machine-id) and the boot id (/proc/sys/kernel/random/boot_id), and getting all of the cloud info from the cloud-provider. |
Same problem here with Centos 7.6: |
Same problem using the compiled binary directly on an Amazon Linux AMI 2018.03: 4.14.123-86.109.amzn1.x86_64 #1 SMP x86_64 x86_64 x86_64 GNU/Linux
|
v0.33.0 Same problem. |
same on ubuntu 18.04 |
Same problem with v0.34.0 and v0.36.0. I tried both under Alpine 3.10 in a Docker container on Windows 10, with the same error. |
I'm getting this too. It just started out of the blue...
|
Same issue with v0.37.0:
16693 info.go:53] Couldn't collect info from any of the files in "/etc/machine-id,/var/lib/dbus/machine-id"
Adding 2 more volumes removes the messages:
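A minimal sketch of what such a fix might look like, assuming the two extra volumes are read-only bind mounts of the two files named in the log message. The service name `cadvisor`, the untagged image reference, and the port mapping are illustrative assumptions; the remaining mounts follow the cAdvisor README quick-start.

```yaml
# Sketch only: service name, image reference, and port mapping are assumptions.
services:
  cadvisor:
    image: gcr.io/cadvisor/cadvisor   # assumption: pin the tag you actually run
    ports:
      - "8080:8080"
    volumes:
      # Mounts taken from the cAdvisor README quick-start
      - /:/rootfs:ro
      - /var/run:/var/run:ro
      - /sys:/sys:ro
      - /var/lib/docker/:/var/lib/docker:ro
      # The two files cAdvisor reports it could not read
      - /etc/machine-id:/etc/machine-id:ro
      - /var/lib/dbus/machine-id:/var/lib/dbus/machine-id:ro
```

Both paths need to exist as files on the host. With the short volume syntax, Docker creates a missing host path as an empty directory, which would not give cAdvisor a readable machine ID.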
|
Same issue too |
Got this error: Fixed by adding the following to my docker-compose file:
As suggested by @Constantin07 😃 |
To help others: QNAP's QTS does not store its UUID in /etc/machine-id, so of course it can't be read (producing the same error described in this issue). You can decode the machine system info, which includes the UUID, using:
Then manually write the extracted UUID to the expected file, /etc/machine-id.
Now on to my next issue, which seems to be a failure for lack of an Anti-Virus group on the host. My QTS 5.x based QNAP only has the following directories (please excuse the slight forking of the original issue; this may be related to my attempted workaround of the UUID issue, though that is highly doubtful):
Any ideas, other than reverting cAdvisor to a version from before the issues started? |
Apologies for the bad form of replying to my own post, but I thought this might be helpful for others. After much trial and error, combined with a good deal of research, I stumbled upon a workaround for my final remaining issue above. It seems that if you pass all the /sys data it is looking for as individual volumes, you can work around the issue I had, per issue 574760369. Ugly, as stated, but it did work (and I made it a bit less ugly by passing the whole /sys/fs/cgroup/ rather than the individual directories that I and the author of the fix had used earlier). I suppose I should cross-post in that thread, but honestly I don't understand the reported error well enough to be sure where to post, so perhaps I should just leave it here as is. The issue I mentioned is also still open, and the cAdvisor documentation seems to be lacking in a number of areas regarding Docker implementation and the configuration of monitoring. Hope this helps someone out there save some time (and hair) ;) |
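The workaround described in the comment above, passing the whole /sys/fs/cgroup tree rather than individual directories, could be sketched as a compose volume entry. This is an illustration, not the poster's actual configuration: the service name and the read-only flag are assumptions.

```yaml
# Sketch only: merge this entry into your existing cAdvisor service definition.
services:
  cadvisor:                                # hypothetical service name
    volumes:
      # Pass the whole cgroup hierarchy explicitly (read-only is an assumption)
      - /sys/fs/cgroup:/sys/fs/cgroup:ro
```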
nub question. how do u go about doing that? "Seem if you pass all the /sys data it is looking for as individual volumes you can work around the issue"
|
As was mentioned in the URL I quoted (and you included in your reply).... One caveat though: given that I am now on ZFS, I can say that my cAdvisor config, while it works, throws up a ton of out-of-memory messages in the console and will actually block snapshot management/deletion. I just don't have the time to dig into this right now, so I have had to write off cAdvisor (resource overhead combined with file system blockage). |
thx for the reply. well i use glances now, seems low resource. i also see a friend use netdata which also is low on resource consumption. so that is the direction i probably will head for. |
I solved this by adding a bind mount and device to my docker-compose file:
I also added the /etc/machine-id:/etc/machine-id:ro volume, which solved the other error message about the missing /etc/machine-id. |
- Used the following Dockerfile as the main example
  - https://github.com/oijkn/Docker-Raspberry-PI-Monitoring/blob/760528af93b2d5ce3a2025a6c7beb90f3dd3c27c/docker-compose.yml#L22-L44
  - Note that it differs from the docker command mentioned in the cAdvisor README quick-start due to the fixes mentioned further below
    - https://github.com/google/cadvisor?tab=readme-ov-file#quick-start-running-cadvisor-in-a-docker-container
- Fixed "Failed to get system UUID" error by adding `/etc/machine-id` volume mount
  - google/cadvisor#2157 (comment)
- Fixed container names not included on mac by adding `/var/run/docker.sock` volume mount
  - google/cadvisor#1565 (comment)
  - Note that `ro` was sufficient and `rw` wasn't needed as mentioned in google/cadvisor#1565 (comment)
- Fix cAdvisor high memory usage with oijkn/Docker-Raspberry-PI-Monitoring#34
  - Note that every 5 minutes, two error messages are logged. This is a known issue with cAdvisor:v0.49.1 that will hopefully be fixed in a future release
    - google/cadvisor#3493
- Add cAdvisor metrics to the prometheus.yml file
  - I initially tried to use port 8081 in this file, but it didn't work. The following answer led me to realize that it needed to be port 8080, since 8081 is only used on the host machine's network, not within the docker routing network itself
    - https://stackoverflow.com/q/54397463
- Explanation of `privileged: true`
  - https://thelinuxcode.com/privileged-in-docker-compose-with-code-examples/
- cAdvisor docs on prometheus
  - https://github.com/google/cadvisor/blob/master/docs/storage/prometheus.md
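The port note in the list above can be illustrated with a scrape config. This is a minimal sketch, assuming the compose service is named `cadvisor` (a hypothetical name) and is published with a `8081:8080` mapping; Prometheus running on the same Docker network reaches the container port, not the host port.

```yaml
# prometheus.yml (fragment). Job name and service name are assumptions.
scrape_configs:
  - job_name: cadvisor
    static_configs:
      # Container port 8080, reachable by service name on the compose network.
      # Host port 8081 (from a "8081:8080" mapping) only applies on the host.
      - targets: ["cadvisor:8080"]
```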
OS: CentOS Linux release 7.6.1810
SELinux is active
docker-compose.yml:
If I use v32.0, I get this output and nothing works
If I use v31.0, I get this output and it appears things work fine (I can access the webui)
The error still exists in v31, it just doesn't seem to be a show stopper. Any suggestions?