Bound ports are not released immediately when container is stopped/deleted #2536

Open
rfay opened this issue Aug 7, 2024 · 13 comments

Comments

@rfay
Contributor

rfay commented Aug 7, 2024

Description

This is a replacement for

since the problem is in Lima.

When a container is stopped, its bound ports are not immediately released. It can take 2 seconds or more for that to happen. DDEV has had to add a sleep after deleting a container to wait for the port to be released.

@abiosoft
Contributor

abiosoft commented Aug 7, 2024

Considering that Lima scans the VM periodically for open ports, there is no easy way out.

An option could be for Lima to support dynamic port forwarding.
Colima can monitor docker events and leverage that to forward (and stop forwarding) ports as needed.

@rfay
Contributor Author

rfay commented Aug 7, 2024

Thanks. But it seems every other Docker provider does this successfully and immediately.

@Nino-K
Contributor

Nino-K commented Aug 7, 2024

We have dynamic port forwarding in Rancher Desktop that listens for events from both Docker and containerd, which could potentially be leveraged in Lima.

@jandubois
Member

We have dynamic port forwarding in Rancher Desktop

Note that it is part of the WSL2 guest agent, so needs to be ported over to Lima. But the structure should be very similar.

@balajiv113
Member

balajiv113 commented Aug 8, 2024

@jandubois
How about we use LD_PRELOAD with custom bind logic to get a callback within Lima?

Just an initial thought; I'm not sure whether this works for all cases (Docker, Kubernetes, and so on). But if it works, we should be able to get an instant callback before the actual bind happens.

Edit:
This approach doesn't work for Docker, containerd, etc., as they run in different network namespaces.

@jandubois
Member

How about we use LD_PRELOAD with custom bind logic to get a callback within Lima?

I feel a bit nervous of this introducing hard-to-find bugs, and wonder how this will work for different versions of glibc and musl.

And I guess your edit says that it won't work anyway as soon as network namespaces come into play.

So I would say we should integrate the code from the Rancher Desktop guest agent into Lima; it does seem to work quite well. IIRC the Rancher Desktop guest agent is an early fork of the Lima version, so should have the same basic structure for scraping port usage. Let's wait to see when @Nino-K has some time to port it over.

@balajiv113
Member

I feel a bit nervous of this introducing hard-to-find bugs, and wonder how this will work for different versions of glibc and musl.

True. I had similar concerns as well. Since it doesn't work across multiple namespaces, there is no benefit to it.

So I would say we should integrate the code from the Rancher Desktop guest agent into Lima; it does seem to work quite well.

I am completely fine with it. My major concern is that we need to write separate logic for each of those custom namespaces. It's not a generic solution that we can build once and be assured will always work.

Another concern is the increasing number of dependencies in the guest. We already use the Kubernetes client. Now we will need to add containerd and Docker clients as well. Later something else might come up too.

I was exploring some other generic solutions.
Option - Using go-libaudit to find bind events
We can use go-libaudit to find the bind events with a custom rule. It is based on the audit log: we receive the bind events and can use them as callbacks for our event forwarding.
This should work for all cases (tested with host binding and Docker-based binding; both worked). Also, we are already using this library in our guest agent.

What do you think about this approach?

@balajiv113
Member

balajiv113 commented Aug 9, 2024

@rfay - Can you try this with PR #2411?

This revamps the port forwarding using gRPC tunnels. This issue is only about releasing the port immediately, which should already be supported there.

We will remove the listener from the host as soon as the stream is closed on either side of the communication.

Note: It still uses the polling model to identify ports. But as soon as the stream is closed it closes the listener, so by the time the ports_removed event arrives it will be a no-op.

Edit: Ignore this, it doesn't work. Since a TCP listener can serve multiple connections, we don't know when to close it.

@jandubois
Member

We will remove the listener from the host as soon as the stream is closed on either side of the communication.

@balajiv113 I don't understand how this would work. For example the Kubernetes apiserver is listening on 6443 inside the guest, so we forward the port to the host.

Now e.g. kubectl runs a command and talks to the apiserver. Once kubectl exits, you close the stream, but don't you have to continue to listen for new incoming connections? Because the user might run another kubectl command after the first.

I would only understand this if you said you remove the listener on the host as soon as the listener in the guest stops listening. But how will you know this unless/until you try to connect?

What am I missing?

@jandubois
Member

We can use go-libaudit to find the bind events with a custom rule.

This sounds like a good possibility. I vaguely remember that there were issues with auditd on Fedora, but can't recall any specifics. Maybe they have all been resolved by now.

@rfay
Contributor Author

rfay commented Aug 28, 2024

I don't really understand why a bound port can't be removed when the container that bound it is stopped, as all non-Lima-based Docker providers do successfully. (Note that Rancher Desktop has this same problem, as I finally discovered yesterday.)

I seem to have "fixed" this problem, which badly affected DDEV's automated tests, with a waitForPortsToBeReleased(), https://github.com/ddev/ddev/blob/27b1fa0520cee01d4d7afbd55c29a7054163a3d3/pkg/ddevapp/router.go#L99-L130

@jandubois
Member

I am completely fine with it. My major concern is that we need to write separate logic for each of those custom namespaces. It's not a generic solution that we can build once and be assured will always work.

That is true, but how many container engines do you expect we'll need to support?

I would still say it is worth the effort to port over the changes from the Rancher Desktop guest agent on Windows, and solve this issue for virtually all current users. This will also work for users on distros whose kernels are missing the modules for eBPF or libaudit.

We could still add an eBPF/libaudit-based alternative later, as a generic fallback (or better, as a configurable option).

Another concern is the increasing number of dependencies in the guest. We already use the Kubernetes client. Now we will need to add containerd and Docker clients as well. Later something else might come up too.

I'm concerned about this as well, but looking at the current Rancher Desktop guest agent, it seems to be the same size as the Lima guest agent:

/ # ls -lh /usr/local/bin/rancher-desktop-guestagent
-rwxr-xr-x    1 root     0          47.4M Dec 31  1969 /usr/local/bin/rancher-desktop-guestagent

and

$ ls -lh _output/share/lima/lima-guestagent.Linux-x86_64
-rw-r--r--  1 jan  staff    46M 19 Sep 17:42 _output/share/lima/lima-guestagent.Linux-x86_64

I haven't checked if maybe we stripped the symbols from the RD agent, but anyways, we end up with a binary of the same size as what we already have in Lima.

@balajiv113
Member

I'm completely fine with doing what Rancher Desktop does for now (as long as the size doesn't become too large and we don't add yet another new container engine).

On the other hand, the eBPF + libaudit changes are almost ready, but I need some more time to check different aspects. Once they are ready, we can follow the configuration pattern to enable or disable them.

5 participants