Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

keep local dns in resolv.conf when host network enabled #4524

Merged
merged 1 commit into from
Feb 9, 2024

Conversation

crazy-max
Copy link
Member

replaces and closes #3244
fixes #3210
related to docker/buildx#175 (comment)

@crazy-max crazy-max force-pushed the host-net-local-dns branch 3 times, most recently from feb6ead to a0571fc Compare January 5, 2024 14:06
@crazy-max crazy-max marked this pull request as ready for review January 5, 2024 14:36
executor/containerdexecutor/executor.go Outdated Show resolved Hide resolved
executor/oci/resolvconf.go Outdated Show resolved Hide resolved
executor/oci/resolvconf.go Show resolved Hide resolved
Copy link
Member

@tonistiigi tonistiigi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

^ updates needed per comments above

@crazy-max crazy-max force-pushed the host-net-local-dns branch 3 times, most recently from ab18978 to 032813c Compare January 22, 2024 13:57
Copy link
Member

@tonistiigi tonistiigi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

^

@tonistiigi tonistiigi added this to the v0.13.0 milestone Feb 9, 2024
@tonistiigi tonistiigi merged commit 2873353 into moby:master Feb 9, 2024
63 checks passed
@crazy-max crazy-max deleted the host-net-local-dns branch February 10, 2024 09:00
@TBBle
Copy link
Collaborator

TBBle commented Feb 11, 2024

So late to the party here, but after some testing with buildx, I'm wondering if meta.NetMode == pb.NetMode_HOST is the correct condition here, since if the worker is in host-network mode (as seen in the buildx docker-container driver), then its pb.NetMode_UNSET network provider will also be host-mode, but that (a) already didn't require --allow-insecure-entitlement network.host, and (b) now also won't get the host-copied resolv.conf.

In this case, the custom network is reachable by IP, so I am assume that we are truely using host-mode for those RUN commands, but fail to recognise that at this point in the code. Up until this PR, the difference was only cosmetic (absence of a "enabling HostNetworking" log message when meta.NetMode == pb.NetMode_UNSET, but still using the host provider in that case). (Although it's a bit odd that if you don't pass --allow-insecure-entitlement network.host, then simply by going with the defaults you get to use host-networking, but not if you're explicit).

Basically, this fix only works for the buildx docker-container driver case if you also pass --network host to the build, and that's surprising to me.

Although I see now that was explicitly dismissed in #4524 (comment) so maybe the issue needs to be addressed on the buildx side, e.g., when a custom network is specified for a docker-container builder, it automatically allows the entitlement and inserts --network host into each build call for that builder. (Maybe it should always do that, since in that code the user doesn't configure the buildkit networking behaviour, they configure the container's network behaviour, and rely on buildkit always running in host mode, when it matters. Although custom buildkitd.toml messes with that, as seen in the CNI example in the docs.)

Or would a worker's explicit network mode carry through into the RUN commands, and the problem is that the worker is being left to default to auto (currently host) in the first place. (Although I didn't notice anywhere that stores the worker's specified network mode, we only seem to keep the resulting provider map...)

@tonistiigi
Copy link
Member

I'm wondering if meta.NetMode == pb.NetMode_HOST is the correct condition here, since if the worker is in host-network mode (as seen in the buildx docker-container driver), then its pb.NetMode_UNSET network provider will also be host-mode,

This is intentional. See #4524 (review) When network mode is unset then user should not make expectations about host DNS or interfaces being accessible by the container. Only thing they can expect is access to internet. Eg. we are changing the default to bridge #4352 in the future release for better security and to avoid collisions between containers.

it automatically allows the entitlement and inserts --network host into each build call for that builder.

Indeed, in a container context, I think we could allow daemon-side entitlement automatically because "host" there already means container-host, not machine-host. But no --network host automatically as that would change the default execution context. If users used docker/buildx#175 for custom libnetwork networks in legacy builder then they already passed --network so setting --network host in the build should not add any additional complexity.

@TBBle
Copy link
Collaborator

TBBle commented Feb 12, 2024

I'm wondering if meta.NetMode == pb.NetMode_HOST is the correct condition here, since if the worker is in host-network mode (as seen in the buildx docker-container driver), then its pb.NetMode_UNSET network provider will also be host-mode,

This is intentional. See #4524 (review) When network mode is unset then user should not make expectations about host DNS or interfaces being accessible by the container. Only thing they can expect is access to internet. Eg. we are changing the default to bridge #4352 in the future release for better security and to avoid collisions between containers.

Yeah, I can see the logic of this on the BuildKit side. For buildx, the user is making an explicit request that the custom network be available to the build, so I guess that makes it a buildx responsibility.

Perhaps buildx could actually hook the custom network up to a bridge, and just pass the bridge name in, taking advantage of #4352, and avoiding the "host-networking" in RUN and the discrepancy between RUN and COPY.

it automatically allows the entitlement and inserts --network host into each build call for that builder.

Indeed, in a container context, I think we could allow daemon-side entitlement automatically because "host" there already means container-host, not machine-host. But no --network host automatically as that would change the default execution context. If users used docker/buildx#175 for custom libnetwork networks in legacy builder then they already passed --network so setting --network host in the build should not add any additional complexity.

I think this adds mild additional complexity simply because for a user building against a custom network, --network host is a semantic mismatch, it only happens to be "host" because the builder was set up inside a container. It just feels fiddly to me that if you've already created the builder on a custom network, then even supporting (and validating/enforcing) --network custom-network-name feels slightly repetitive. But I guess there are use-cases where you want ADD/COPY to see the custom network, but RUN to not see it...

Anyway, this sounds like a buildx thing too, so I guess that answers my questions as far as this PR is concerned. Thank you.

@penenkel
Copy link

@crazy-max I have a question regarding this change: Where does the file resolv-host.conf come from? Is that a file that is provided automatically (if so which component is responsible for that?) or do I have to ensure its existence myself?

@TBBle
Copy link
Collaborator

TBBle commented Feb 28, 2024

As far as I can see, it should be generated the first time through GetResolvConf, and any other time it needs to be updated, when running in (explicit) host-network mode.

@crazy-max
Copy link
Member Author

@crazy-max I have a question regarding this change: Where does the file resolv-host.conf come from? Is that a file that is provided automatically (if so which component is responsible for that?) or do I have to ensure its existence myself?

It's not user-facing, just internal to handle cache for resolve conf in state dir.

@penenkel
Copy link

Ah, right, I misread the code. Thanks

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add config options that allow container DNS to point to localhost
6 participants