Arion tries to start a service before podman is up #238

Open
pedorich-n opened this issue Apr 21, 2024 · 2 comments

Comments

@pedorich-n
Contributor

pedorich-n commented Apr 21, 2024

Hi!

I've just encountered an error where, after a reboot, arion on NixOS tried to start the containers before podman was ready:

2024-04-21T22:24:15+0900 arion-home-automation-start[2418281]: docker compose file: /nix/store/4d2i95x5rhb55dyncssr4xfz87sl8lyh-docker-compose.yaml
2024-04-21T22:24:15+0900 arion-home-automation-start[2418291]: Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?
2024-04-21T22:24:15+0900 arion-home-automation-start[2418284]: arion: readCreateProcess: docker "images" "--filter" "dangling=false" "--format" "{{.Repository}}:{{.Tag}}" (exit 1): failed

It looks like it auto-restarted eventually.

But this makes me wonder: would it make sense to add something like After = [ "podman.service" ] (or docker, depending on the backend) or ConditionPathExists=/var/run/docker.sock to the systemd service definition, so that it starts correctly on the first try?
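For illustration, the two options mentioned above could be written in a NixOS configuration roughly as follows. This is a minimal sketch applied from the user's own configuration, not arion's actual module; the unit name `arion-home-automation` is an assumption inferred from the log prefix.

```nix
{ ... }:
{
  # Assumption: the unit is named "arion-<project>";
  # "arion-home-automation" is inferred from the log prefix above.
  systemd.services."arion-home-automation" = {
    # Option 1: order the unit after the backend's service.
    # After= only affects ordering; it does not start podman.service.
    after = [ "podman.service" ];

    # Option 2: only start when the Docker-compatible socket path exists.
    # A failed condition makes systemd skip the unit rather than retry it.
    unitConfig.ConditionPathExists = "/var/run/docker.sock";
  };
}
```

Given the caveats noted in the comments, neither setting is likely to be a complete fix on its own.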

Edit: More info
NixOS stable 23.05, nixpkgs commit bc194f70731cc5d2b046a6c1b3b15f170f05999c

$ podman --version
podman version 4.7.2

$ arion version
docker-compose version 1.29.2, build unknown
docker-py version: <module 'docker.version' from '/nix/store/7420rvz9fw7cjqkjf5i62zarv8s4p21c-python3.11-docker-6.1.3/lib/python3.11/site-packages/docker/version.py'>
CPython version: 3.11.8
OpenSSL version: OpenSSL 3.0.13 30 Jan 2024
@roberth
Member

roberth commented Apr 21, 2024

It does have

after = [ "sockets.target" ];

So that suggests that the podman socket is not registered with systemd.

Maybe it needs to run after tmpfiles too?

https://github.com/NixOS/nixpkgs/blob/ff03bc83894ca42d93f80ec6ea82b9e4eaff02b9/nixos/modules/virtualisation/podman/default.nix#L244
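If ordering after tmpfiles were the route taken, it might look like the sketch below. `systemd-tmpfiles-setup.service` is the stock systemd unit that applies tmpfiles rules at boot; the unit name `arion-home-automation` is an assumption inferred from the log in the report.

```nix
{ ... }:
{
  # Assumption: the arion project unit is called "arion-home-automation".
  # NixOS merges this list with the ordering the module already declares.
  systemd.services."arion-home-automation".after = [
    # Wait until boot-time tmpfiles rules have been applied.
    "systemd-tmpfiles-setup.service"
  ];
}
```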

Ideally systemd would know that the docker socket is an alias for the podman socket. I think this could be achieved with multiple ListenStreams in podman: one for each location. That makes the tmpfiles solution seem like a hack.
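The multiple-ListenStream idea can be sketched as below. This is untested and rests on several assumptions: that podman accepts more than one activation socket, that `/run/docker.sock` is the desired alias path, and that `virtualisation.podman.dockerSocket.enable` is left off so the tmpfiles symlink does not occupy the same path.

```nix
{ ... }:
{
  # Assumption: podman.socket is the upstream unit shipped with podman;
  # this adds a second listening path alongside its own socket.
  systemd.sockets.podman = {
    wantedBy = [ "sockets.target" ];
    listenStreams = [ "/run/docker.sock" ];
  };
}
```

With both paths declared in the socket unit, systemd itself would own them, so a service ordered after sockets.target would find both in place.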

@pedorich-n
Contributor Author

pedorich-n commented Apr 22, 2024

Ah, I didn't realize after = [ "sockets.target" ]; is there to ensure it starts after the socket is present. This makes sense now.

Thanks for the quick fix!
Like you said, race conditions are hard to test, but with your change applied, I haven't seen the issue come up again after a couple of reboots.

roberth added a commit to hercules-ci/nixpkgs that referenced this issue Apr 26, 2024
This ensures that both "sockets" are available after sockets.target.
See hercules-ci/arion#238
nbraud pushed a commit to NixOS/nixpkgs that referenced this issue May 2, 2024
This ensures that both "sockets" are available after sockets.target.
See hercules-ci/arion#238
pull bot pushed a commit to auxolotl/nixpkgs that referenced this issue May 2, 2024
This ensures that both "sockets" are available after sockets.target.
See hercules-ci/arion#238