OTBR Addon Crashing #132124
Comments
Have the exact same problem, also on Supervised |
The only problem is Supervised |
Agreed with this - although I'm not sure that it's on the Supervised side. I spun up a stock standard openthread/otbr Docker image with the hassio network_driver and it also worked flawlessly. I lied - I restored version 2.9.0 of homeassistant/otbr from a backup and it's producing the same error, so I'm now convinced it's related to new permissions / ingress requirements at the system level. |
I had actually been running it on my Supervised install, but it suddenly stopped working because of this permission error. Whether it was an update to HA or something else, I don't know. |
For permission error on supervised, try adding your user to the |
I just tried that from the underlying Debian host and there seems to be no change. |
I have added myself to netdev in the Home Assistant terminal: Which user starts the add-on? |
It doesn't work |
I'm not sure what could be causing this then. I do have a Supervised HA box, however it does not suffer from this permission error. |
@darkxst |
Debian 12/Bookworm (on x86_64) |
The netdev group does not exist in the Addon container. You can verify this by running I am not sure if anything has changed in the AppArmor profiles or between dist updates (Debian pushed 12.8 on Nov. 9), which roughly coincides with when the addon stopped working. For reference, I am running Debian 12.8 (6.1.0-22-amd64); Supervised. I have tried: reinstalling Supervised, re-running the Docker install script, and re-pulling both the HA Core and OTBR containers, to no avail. I can confirm that the stock OpenThread/OTBR image spins up just fine and speaks freely between HA and my SkyConnect. I would use this setup except for the Thread Network persistence issue between container restarts. |
49d.17:36:05.950 [C] Platform------: platformConfigureTunDevice() at netif.cpp:2022: Operation not permitted |
Ahhh, that was another error, solved by: |
This happened to me when I upgraded As a temporary solution, I downgraded to
apt list -a containerd.io # list all available versions, pick the one with 1.7.23
sudo apt install containerd.io=1.7.23-1
sudo apt-mark hold containerd.io # hold it so it is not upgraded; can be skipped |
Can confirm this as a solution. Just downgraded containerd.io to 1.7.23-1 and it now loads just fine. There are a number of subtle changes on this version however I'm going to bet it's related to the cgroup changes (#10814). |
Can also confirm it, this is simply great |
Thanks, this solution works |
I restored previous HA versions and it wasn't working. Downgraded OTBR to 2.11.0 and things were back. Updating HA to 2024.12 leaves OTBR running. So something changed in 2.12. |
Hello, I have only just installed HA and the latest version of OTBR. I would like to install version 2.11.0 instead, because I am running into the same problem. I really don't know much about this, but after searching everywhere I can't work out how to force version 2.11.0, given that I have never been on that version. How did you do it? Is there a downloadable file somewhere (I can only find the latest versions...)? Thanks for your help |
Hello, I am replying in English so that the discussion stays understandable for everyone. First of all, apply the latest HA update 2024.12.2, then also update OTBR and you should be OK. Note that OTBR 2.11 wasn't working reliably with either HA 2024.12 or 2024.12.2, so something was non-functional between HA and OTBR. I am confident that updating to the latest HA version will solve your issues. If you can't restore a previous working setup, you would need to download old versions of HA and OTBR, install them and start from scratch. |
Sorry, I forgot I was reading a translation. Thank you for your help. My version is already HA 2024.12.2 and OTBR 2.12.2. So I guess my problem is somewhere else... Thank you again |
Hmm, what exactly is the issue? Does OTBR start and then stop again? You may want to post the OTBR log for the experts to take a look. |
Yes, it starts and then it stops. Maybe my hardware isn't compatible... My OTBR log is: [12:37:25] INFO: The otbr-web is disabled.
|
Oh, you know what? It has stopped here too! 🤔 It was working yesterday, after the HA update, as I was able to unlock one of my intelligent window handles. The only thing I did was to update "Terminal & SSH". So the trouble starts with the HA restart. Manually relaunching OTBR doesn't help. Same errors as yours. |
It's the first time I've tried to install all of this, so I thought I was doing something wrong... I'm a bit reassured to know that it's not necessarily my fault 😅 |
I just followed these instructions and it seems to be working? |
I saw these instructions earlier but was not sure about them... but it seems to be OK! It's the first time I have seen it start, so now I will test it. Thank you both!! |
It does indeed. I didn't block it from updating though. Let's see how things develop. |
This is the key, I had a similar issue in my other docker-compose service that uses OpenVPN inside, and it can be fixed there with one of the following:
devices:
  - /dev/net/tun
This worked for my docker-compose, but unfortunately I couldn't find an easy way to do the same for the OTBR container, so I downgraded to |
The blunt-tool workaround is to disable "Protection Mode" for the container. This effectively allows unrestrained access from the container to the host system. This confirms it's a container permissions issue, likely relating to access to /dev/net/tun device and the device tree itself. Make sure you have the OTBR addon installed and configured (ie device is selected etc.)
Scroll to the OTBR container entry and go to the Save the file and exit. DO NOT DO ANYTHING MORE IN HA. Run a |
Will this survive future HA updates? |
No, it will not. There is a slightly better version of doing this; and that involves editing the config.yaml file for the addon itself. You must completely uninstall the addon first. The supervised addon installer uses a config file to define the addons.json file.
To this file, add: Restart your system. Reinstall the addon. This will now add the Privileged Mode access to the tab, and should survive an uninstall and reinstall of the container. However, it will not survive an update! The issue is broader than simply this addon. As I suspected, the runc version changes have impacted containerd.io, which effectively removes /dev/net/tun from CAP_NET_ADMIN. Please see these related posts: here and here Unfortunately adding:
to the addon config.yaml does NOT resolve the issue. So at this point, the container has to be run in privileged mode to allow it to access the tun device tree. sighs |
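A minimal sketch for checking the symptom discussed above from inside whichever environment is under suspicion (host shell or container). It only tests whether the device node can be opened read-write; it does not issue the TUNSETIFF ioctl that a real border router needs afterwards, so an "ok" result is necessary but not sufficient. The function name `check_tun` is hypothetical and not part of the add-on.

```shell
#!/bin/sh
# Hypothetical helper: classify access to a TUN control node.
# The default path is the one named in this thread.
check_tun() {
    dev="${1:-/dev/net/tun}"
    if [ ! -e "$dev" ]; then
        echo "missing"
    elif [ ! -c "$dev" ]; then
        echo "not-a-char-device"
    # Open read-write in a subshell so a failed redirection cannot
    # terminate the calling shell; nothing is written to the device.
    elif ( : <>"$dev" ) 2>/dev/null; then
        echo "ok"
    else
        echo "open-denied"
    fi
}
```

Calling `check_tun` with no argument checks `/dev/net/tun`. An "open-denied" result would be consistent with the "Operation not permitted" line in the OTBR log, although the log alone does not say which system call failed.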
Could you please double check? Because as for me, just adding the What I did:
--- a/openthread_border_router/config.yaml
+++ b/openthread_border_router/config.yaml
@@ -20,6 +20,8 @@ host_uts: true
privileged:
- IPC_LOCK
- NET_ADMIN
+devices:
+ - /dev/net/tun
image: homeassistant/{arch}-addon-otbr
init: false
options:
After that, it just worked (on Debian 12, HA Supervised, SLZB-06m as border router). |
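To see whether a `devices:` entry like the one in the diff above actually reached the running container, one option is to look at what Docker recorded for it. This is a sketch under two assumptions that the thread does not confirm: that Supervisor passes the entry through to Docker's `HostConfig.Devices`, and that the container is named `addon_core_openthread_border_router` (inferred from the log file name in the issue). The helper name `has_device_mapping` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the JSON on stdin contains a device
# mapping whose host path is $1. The expected input is the output of
#   docker inspect --format '{{json .HostConfig.Devices}}' <container>
has_device_mapping() {
    grep -qF "\"PathOnHost\":\"$1\""
}

# Example (container name is an assumption, see above):
#   docker inspect --format '{{json .HostConfig.Devices}}' \
#       addon_core_openthread_border_router \
#     | has_device_mapping /dev/net/tun && echo "tun is mapped"
```

A container running with Protection Mode disabled may show no explicit mapping and still have access, so a negative result here only says the device was not listed explicitly.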
You need a certain tech level to apply such changes, e.g. knowing where the OTBR config info is stored. Not exactly what the average HA user is able to do. Keeping containerd.io at 1.7.23 and blocking it from upgrading seems both easier and safer. |
Don't modify the installed core addon, instead copy edit config.yaml
Now go to the https://developers.home-assistant.io/docs/add-ons/testing/ (only the path differs on Supervised installs) |
Yes - I thought I was losing my mind here for a second - as I tried it and the container loaded (I did this last evening too, although I restarted before trialling). Adding /dev/net/tun in the config.yaml file only seems to work transiently; I think it's because Supervisor is loading an old image (i.e. with privileged access) or permission set rather than something inherent to /dev/net/tun. This doesn't seem to survive a system restart. Can you confirm this? I can reliably have the addon load when privileged mode is enabled for the container instead. |
editing the core addon, it could still be downloading the docker image instead of rebuilding the addon. Follow my steps above to install as local addon (with the |
This does not make a difference as both a local and addon version are based on the same docker image (an Alpine image with the same dockerbuild file); it's the permissions set that the container is loaded with that dictates how it will behave - and this is called from both the config.yaml or build.yaml. Spinning off a local copy will simply limit interference from any version pushes or system upgrades (which may be a solution for some) - for others, simply unticking the Auto update option is probably sufficient. |
I cannot confirm this, it has survived several reboots and HA core upgrade on my machine. I guess it's only overwritten by an OTBR addon update.
In my case it doesn't matter, the downloaded docker image is fine, there is no need to rebuild it, so I didn't bother with local copy of the addon (as any changes are easily recoverable from github sources). |
I have tried modifying After installing and configuring, both return the same error:
49d.21:00:42.056 [C] Platform------: platformConfigureTunDevice() at netif.cpp:2022: Operation not permitted
I have restarted Home Assistant. Should I restart the machine as well in order for this to work? |
When will the problem be actually repaired so we do not need to hack things to have a working system? |
Sorry for noticing this late. Since this is a Home Assistant Add-on issue, this should actually be an issue reported at https://github.com/home-assistant/addons/issues. This actually got resolved with the containerd.io Debian package 1.7.25-1. Just in case containerd reverts back to not adding support I've also merged the add-on change to explicitly add permissions for the tun device (see home-assistant/addons#3864). Thanks for digging out the details of why this happened on Supervisor and the fix! |
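As a rough aid for deciding whether a host needs attention, the sketch below compares a containerd.io package version against the two data points reported in this thread: 1.7.23-1 (reported working after a downgrade) and 1.7.25-1 (reported as containing the fix). Nothing else is known from the thread, so anything in between is only flagged, not judged. The function names are hypothetical, and the comparison relies on GNU `sort -V`, which is assumed to be available.

```shell
#!/bin/sh
# True if version $1 sorts at or after version $2 (GNU `sort -V` ordering).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | tail -n 1)" = "$1" ]
}

# Classify a containerd.io version using only the versions named in the thread.
classify_containerd() {
    if version_ge "$1" "1.7.25-1"; then
        echo "fixed-per-thread"
    elif version_ge "1.7.23-1" "$1"; then
        echo "not-newer-than-reported-working"
    else
        echo "between-working-and-fixed"
    fi
}

# Example on a Debian host:
#   classify_containerd "$(dpkg-query -W -f='${Version}' containerd.io)"
```

For a version in the "between" range, the thread suggests either upgrading containerd.io or updating the add-on so that it requests the tun device explicitly.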
I just updated to the latest version and there is no change. It still crashes with the same error:
50d.22:11:56.846 [C] Platform------: platformConfigureTunDevice() at netif.cpp:2022: Operation not permitted
EDIT: Apologies, I just updated the system to pull the latest containerd and it's working. Thanks!! |
Uh, both the OTBR update and the latest containerd should solve the problem. I am a bit surprised that you still saw the issue after updating the OTBR. In my tests, containerd.io 1.7.24-1 and the OTBR 2.12.4 did work here 🤔 |
Not sure how I can help, but either way the issue does seem resolved. I don't think people can expect things to continue working unless they install the updates. |
I've looked a bit closer again with my setup, and I realized that when Supervisor gets started without the
So essentially, loading the This seems to be limited to Supervised, since Home Assistant OS does not configure it as a module ( |
Thanks so much for your commitment to the community, and your foresight. |
The problem
OTBR Addon fails to load with platformConfigureTunDevice() at netif.cpp:2022: Operation not permitted
What version of Home Assistant Core has the issue?
2024.11.3
What was the last working version of Home Assistant Core?
No response
What type of installation are you running?
Home Assistant Supervised
Integration causing the issue
OpenThread Border Router Addon
Link to integration documentation on our website
No response
Diagnostics information
OTBR Addon Crashes during bootup. Looks like a permissions error. This is new behaviour that I have only noticed in the past week or so.
core_openthread_border_router_2024-12-03T00-45-25.316Z.log
Example YAML snippet
No response
Anything in the logs that might be useful for us?
Snapshot from the logs of relevance.
Additional information
No response