Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

(maint) Add network aliases of .internal for containers #61

Merged
merged 3 commits into from
May 6, 2019

Conversation

Iristyle
Copy link
Contributor

@Iristyle Iristyle commented Apr 24, 2019

  • Remove the domain introspection / setting of AZURE_DOMAIN env var
    as this does not work as originally thought.

    Instead, hardcode the DNS suffix .internal to each service in the
    compose stack, and make sure that dns_search for internal will
    use the Docker DNS resolver when dealing with these hosts. Note that
    these compose file settings only affect the configuration of the
    DNS resolver, not resolv.conf. This is different from the
    docker run behavior, which does modify resolv.conf. Also note,
    config file locations vary depending on whether or not systemd is
    running in the container.

    It's not "safe" to refer to services in the cluster by only their
    short service names like puppet, puppetdb or postgres as they
    can conflict with hosts on the external network with these names
    when resolv.conf appends DNS search suffixes.

    When docker compose creates the user defined network, it copies the
    DNS settings from the host to the resolv.conf in each of the
    containers. This often takes search domains from the outside network
    and applies them to containers.

    When network resolutions happen, any default search suffix will be
    applied to short names when the dns option for ndots is not set to 0.
    So for instance, given a resolv.conf that contains:

    search delivery.puppetlabs.net

    A DNS request for puppet becomes puppet.delivery.puppetlabs.net
    which will fail to resolve in the Docker DNS resolver, then be sent
    to the next DNS server in the nameserver list, which may resolve it
    to a different host in the external network. This behaves this way
    because resolv.conf also sets secondary DNS servers from the host.

    While it is possible to try and service requests for an external
    domain like delivery.puppetlabs.net with the embedded Docker DNS
    resolver, it's better to instead choose a domain suffix to use inside
    the cluster.

    There are some good details on how various network types configure:
    Version 18.09: local DNS settings are not inherited anymore, local /etc/hosts have no effect inside containers docker/for-linux#488 (comment)

  • Note that the .internal domain is typically not recommended for
    production given the only IANA reserved domains are .example, .test,
    .invalid or .localhost. However, given the DNS resolver is set to
    own the resolution of .internal, this is a compromise.

    In production its recommended to use a subdomain of a domain that
    you own, but that's not yet configurable in this compose file. A
    future commit will make this configurable.

  • Another workaround for this problem would be to set the ndots option
    in resolv.conf to 0 per the documentation at
    http://man7.org/linux/man-pages/man5/resolv.conf.5.html

    However that can't be done for two reasons:

  • A further, but implausible workaround would be to modify the host DNS
    settings to remove any search suffixes.

  • The original FQDN change being reverted in this commit was introduced
    in 2549f19

    "
    Lastly, the Windows specific docker-compose.windows.yml sets up a
    custom alias in the "default" network so that an extra DNS name for
    puppetserver can be set based on the FQDN that Facter determines.
    Without this additional DNS reservation, the puppetserver ca
    command will be unable to connect to the REST endpoint.

    A better long-term solution is making sure puppetserver is setup to
    point to puppet as the host instead of an FQDN.
    "

    With the PUPPETSERVER_HOSTNAME value set on the puppetserver
    container, both certname and server are set to puppet.internal,
    inside of puppet.conf, preventing a need to inject a domain name as
    was done previously.

    This is necessary because of a discrepancy in how Facter 3 behaves vs
    Facter 2, which creates a mismatch between how the host cert is
    initially generated (using Facter 3) and how puppetserver ca
    finds the files on disk (using Facter 2), that setting
    PUPPETSERVER_HOSTNAME will explicitly work around.

    Specifically, Facter 2 may return a different Facter.value('domain')
    than calling facter domain using Facter 3 at the command line.
    Such is the case inside the puppet network, where Facter 2 returns
    ops.puppetlabs.net while Facter 3 returns delivery.puppetlabs.net

    Without explicitly setting PUPPETSERVER_HOSTNAME, this makes cert
    files on disk get written as *.delivery.puppetlabs.net, yet the
    puppetserver ca application looks for the client certs on disk as
    *.ops.puppetlabs.net, which causes puppetserver ca to fail.

  • Facter 2 should not be included in the puppetserver packages, and
    changes have been made to packaging for future releases, which may
    remove the need for the above.

  • This PR is also made possible by switching over to using the Ubuntu
    based container from the Alpine container (performed in a prior
    commit), due to DNS resolution problems with Alpine inside LCOW:

    LCOW: Intermittent DNS resolution failures with Alpine containers moby/libnetwork#2371
    Intermittent DNS failures when running Alpine containers in user-defined docker-compose network microsoft/opengcs#303

  • Another avenue that was investigated to resolve the DNS problem in
    Alpine was to feed host:ip mappings in through --add-host, but it
    turns out that Windows doesn't yet support that feature per

    Adding hosts for windows containers (--add-host, extra_hosts) does not work docker/for-win#1455

  • Finally, these changes are also made in preparation of switching the
    pupperware-commercial repo over to a private builder

  • Add a spec helper to get the IP of a container (currently unused)

  • Modify compose to use either the ENV variable DOMAIN supplied
    by an end user or to fall back to internal when its left
    unspecified.

@Iristyle Iristyle force-pushed the remove-domain branch 3 times, most recently from ae682f5 to f472775 Compare April 26, 2019 17:41
@nwolfe
Copy link
Contributor

nwolfe commented Apr 26, 2019

@Iristyle Heads up we have those shell scripts in bin/ that do things like docker-compose exec puppet ...

@Iristyle Iristyle changed the title (maint) Add puppetserver alias puppet.master (maint) Add network aliases of .local for containers Apr 30, 2019
@nwolfe
Copy link
Contributor

nwolfe commented Apr 30, 2019

Do we actually need the network aliases if the alias is the same value as the hostname?

@Iristyle
Copy link
Contributor Author

Iristyle commented Apr 30, 2019

Yup, it's a bit counterintuitive, but hostname has no effect on DNS whatsoever... it only changes how the host can refer to itself (which sometimes is used by our scripts and is sometimes used by Facter). The network aliases are added to the Docker DNS resolver so that other hosts can hit the containers by those names.

@Iristyle Iristyle force-pushed the remove-domain branch 15 times, most recently from 90a75c5 to 7147e93 Compare May 1, 2019 22:55
@Iristyle
Copy link
Contributor Author

Iristyle commented May 2, 2019

Also waiting on puppetlabs/puppetdb#2928 to get merged / released so that the healthcheck will query the appropriate DNS name for Postgres

@Iristyle Iristyle force-pushed the remove-domain branch 4 times, most recently from e8c3e63 to bfb29b5 Compare May 3, 2019 00:14
@Iristyle Iristyle changed the title (maint) Add network aliases of .local for containers (maint) Add network aliases of .internal for containers May 3, 2019
@Iristyle Iristyle requested a review from a team as a code owner May 3, 2019 22:19
@Iristyle Iristyle force-pushed the remove-domain branch 5 times, most recently from 3b842e6 to 545aad4 Compare May 6, 2019 03:36
@Iristyle
Copy link
Contributor Author

Iristyle commented May 6, 2019

a67132b demonstrates this still works on the internal pool (CI run at https://dev.azure.com/PuppetInc/pupperware/_build/results?buildId=2462)... removed that commit for final PR

 - Remove the domain introspection / setting of AZURE_DOMAIN env var
   as this does not work as originally thought.

   Instead, hardcode the DNS suffix `.internal` to each service in the
   compose stack, and make sure that `dns_search` for `internal` will
   use the Docker DNS resolver when dealing with these hosts. Note that
   these compose file settings only affect the configuration of the
   DNS resolver, *not* resolv.conf. This is different from the
   docker run behavior, which *does* modify resolv.conf. Also note,
   config file locations vary depending on whether or not systemd is
   running in the container.

   It's not "safe" to refer to services in the cluster by only their
   short service names like `puppet`, `puppetdb` or `postgres` as they
   can conflict with hosts on the external network with these names
   when `resolv.conf` appends DNS search suffixes.

   When docker compose creates the user defined network, it copies the
   DNS settings from the host to the `resolv.conf` in each of the
   containers. This often takes search domains from the outside network
   and applies them to containers.

   When network resolutions happen, any default search suffix will be
   applied to short names when the dns option for ndots is not set to 0.
   So for instance, given a `resolv.conf` that contains:

   search delivery.puppetlabs.net

   A DNS request for `puppet` becomes `puppet.delivery.puppetlabs.net`
   which will fail to resolve in the Docker DNS resolver, then be sent
   to the next DNS server in the `nameserver` list, which may resolve it
   to a different host in the external network. This behaves this way
   because `resolv.conf` also sets secondary DNS servers from the host.

   While it is possible to try and service requests for an external
   domain like `delivery.puppetlabs.net` with the embedded Docker DNS
   resolver, it's better to instead choose a domain suffix to use inside
   the cluster.

   There are some good details on how various network types configure:
   docker/for-linux#488 (comment)

 - Note that the .internal domain is typically not recommended for
   production given the only IANA reserved domains are .example, .test,
   .invalid or .localhost. However, given the DNS resolver is set to
   own the resolution of .internal, this is a compromise.

   In production its recommended to use a subdomain of a domain that
   you own, but that's not yet configurable in this compose file. A
   future commit will make this configurable.

 - Another workaround for this problem would be to set the ndots option
   in resolv.conf to 0 per the documentation at
   http://man7.org/linux/man-pages/man5/resolv.conf.5.html

   However that can't be done for two reasons:

   - docker-compose schema doesn't actually support setting DNS options
     docker/cli#1557

   - k8s sets ndots to 5 by default, so we don't want to be at odds

 - A further, but implausible workaround would be to modify the host DNS
   settings to remove any search suffixes.

 - The original FQDN change being reverted in this commit was introduced
   in 2549f19

   "
   Lastly, the Windows specific docker-compose.windows.yml sets up a
   custom alias in the "default" network so that an extra DNS name for
   puppetserver can be set based on the FQDN that Facter determines.
   Without this additional DNS reservation, the `puppetserver ca`
   command will be unable to connect to the REST endpoint.

   A better long-term solution is making sure puppetserver is setup to
   point to `puppet` as the host instead of an FQDN.
   "

   With the PUPPETSERVER_HOSTNAME value set on the puppetserver
   container, both certname and server are set to puppet.internal,
   inside of puppet.conf, preventing a need to inject a domain name as
   was done previously.

   This is necessary because of a discrepancy in how Facter 3 behaves vs
   Facter 2, which creates a mismatch between how the host cert is
   initially generated (using Facter 3) and how `puppetserver ca`
   finds the files on disk (using Facter 2), that setting
   PUPPETSERVER_HOSTNAME will explicitly work around.

   Specifically, Facter 2 may return a different Facter.value('domain')
   than calling `facter domain` using Facter 3 at the command line.
   Such is the case inside the puppet network, where Facter 2 returns
   `ops.puppetlabs.net` while Facter 3 returns `delivery.puppetlabs.net`

	 Without explicitly setting PUPPETSERVER_HOSTNAME, this makes cert
   files on disk get written as *.delivery.puppetlabs.net, yet the
   `puppetserver ca` application looks for the client certs on disk as
   *.ops.puppetlabs.net, which causes `puppetserver ca` to fail.

 - Facter 2 should not be included in the puppetserver packages, and
   changes have been made to packaging for future releases, which may
   remove the need for the above.

 - This PR is also made possible by switching over to using the Ubuntu
   based container from the Alpine container (performed in a prior
   commit), due to DNS resolution problems with Alpine inside LCOW:

   moby/libnetwork#2371
   microsoft/opengcs#303

 - Another avenue that was investigated to resolve the DNS problem in
   Alpine was to feed host:ip mappings in through --add-host, but it
   turns out that Windows doesn't yet support that feature per

   docker/for-win#1455

 - Finally, these changes are also made in preparation of switching the
   pupperware-commercial repo over to a private builder

 - Additionally update k8s / Bolt specs to be consistent with updated
   naming
Iristyle added 2 commits May 6, 2019 07:23
 - While this is not used anywhere (yet?), make it a helper because
   it took a little bit of effort to figure out how to extract this
   information
 - Modify compose to use either the ENV variable DOMAIN supplied
   by an end user or to fall back to `internal` when its left
   unspecified.
@Iristyle Iristyle merged commit 13a8c0f into puppetlabs:master May 6, 2019
@Iristyle Iristyle deleted the remove-domain branch May 6, 2019 14:58
@@ -44,6 +44,7 @@ spec:
app: pupperware
svc: postgres
spec:
hostname: postgres.internal
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As @jarretlavallee discovered, k8s doesn't like hostnames with a . in them, so we're going to have to put up another PR to address the subdomain configuration we're striving for across the board.

@@ -47,11 +47,15 @@ spec:
app: pupperware
svc: puppetdb
spec:
hostname: puppetdb
hostname: puppetdb.internal
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hostname with . needs to be changed

@@ -71,15 +71,19 @@ spec:
app: pupperware
svc: puppet
spec:
hostname: puppet
hostname: puppet.internal
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hostname with . needs to be changed

Iristyle added a commit that referenced this pull request Apr 9, 2021
(maint) Add network aliases of .internal for containers
Iristyle added a commit that referenced this pull request Apr 9, 2021
(maint) Document additional container ENV vars
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants