CLI displays host_network configuration and not fingerprint #11671

Open
nvx opened this issue Dec 13, 2021 · 3 comments

nvx commented Dec 13, 2021

Nomad version

Nomad v1.2.2 (78b8c17)

Operating system and Environment details

Linux x64 (mix of RHEL and Alpine)

Issue

The documented behaviour at https://www.nomadproject.io/docs/configuration/client#host_network-stanza doesn't appear to be what happens in practice:

Specifies a cidr block of addresses to match against. If an address is found on the node that is contained by this cidr block, the host network will be registered with it.

Reproduction steps

On a host where eth0 has the IP 10.1.2.52 and there are no other interfaces (apart from the default loopback):

With the following Nomad client config:

client {
  host_network "default" {
    interface = "eth0"
  }
  host_network "test1" {
    cidr = "10.1.1.0/24"
  }
  host_network "test2" {
    cidr = "10.1.2.0/24"
  }
  host_network "test3" {
    cidr = "10.1.3.0/24"
  }
}

Running nomad node status -verbose <node-id> results in the following output:

Host Networks
Name      CIDR           Interface  ReservedPorts
default   <none>         eth0       <none>
test1     10.1.1.0/24    <none>     <none>
test2     10.1.2.0/24    <none>     <none>
test3     10.1.3.0/24    <none>     <none>

Expected Result

I would have expected only "default" and "test2" host networks to appear.

Actual Result

All networks from the config file appear.
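
For reference, this is the kind of containment check I'd expect the fingerprinting to boil down to (a rough Go sketch, not Nomad's actual code); for 10.1.2.52, only the test2 CIDR contains the address:

package main

import (
	"fmt"
	"net"
)

func main() {
	// Address fingerprinted on eth0 in the reproduction above.
	nodeAddr := net.ParseIP("10.1.2.52")

	cidrs := map[string]string{
		"test1": "10.1.1.0/24",
		"test2": "10.1.2.0/24",
		"test3": "10.1.3.0/24",
	}

	for name, cidr := range cidrs {
		_, network, err := net.ParseCIDR(cidr)
		if err != nil {
			panic(err)
		}
		if network.Contains(nodeAddr) {
			fmt.Printf("%s (%s): matches, expect it to be registered\n", name, cidr)
		} else {
			fmt.Printf("%s (%s): no match, expect it to be skipped\n", name, cidr)
		}
	}
}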

nvx added the type/bug label Dec 13, 2021

nvx commented Dec 13, 2021

Also interesting: ReservedPorts shows nothing even when there are global reservations set:

client {
  reserved {
    reserved_ports = "22,4646,4647,4648,8301,8500,8600"
  }
}

On one hand the reserved ports aren't network specific, so it makes sense that they don't show there; on the other hand, the global reserved ports don't appear to show anywhere else either. I haven't done any testing to see if the reservations are actually working or not.

I can open a separate issue for this if it's not just me being blind and missing it in the output.


tgross commented Dec 14, 2021

Hi @nvx! Thanks for opening this issue!

I was able to reproduce fairly easily using our Vagrant environment. With the following configuration on a host that has the IP 10.0.2.15/24 on eth0:

client {
  enabled    = true

  host_network "specific_ip" {
    cidr           = "10.0.2.15/24"
    reserved_ports = "22"
  }

  host_network "should_not_match" {
    cidr           = "10.199.0.0/24"
  }
  host_network "should_match" {
    cidr           = "10.0.2.0/24"
  }
}

I'm getting the following, none of which is correct:

Host Networks
Name              CIDR           Interface  ReservedPorts
should_match      10.0.2.0/24    <none>     <none>
should_not_match  10.199.0.0/24  <none>     <none>
specific_ip       10.0.2.15/24   <none>     <none>

I tried removing the less specific matches (e.g. removing the should_match block) just to see if they were interfering with each other, but that doesn't seem to be the case.

However, when I try to actually run a job that uses one of the invalid host networks, the scheduler rejects it!

jobspec:
job "example" {
  datacenters = ["dc1"]

  group "web" {

    network {
      mode = "bridge"
      port "www" {
        to = 8001
        host_network = "should_not_match"
      }
    }

    task "http" {

      driver = "docker"

      config {
        image   = "busybox:1"
        command = "httpd"
        args    = ["-v", "-f", "-p", "8001", "-h", "/local"]
        ports   = ["www"]
      }

      template {
        data        = "<html>hello, world</html>"
        destination = "local/index.html"
      }

      resources {
        cpu    = 128
        memory = 128
      }

    }
  }
}
$ nomad job run ./example.nomad
==> 2021-12-14T14:28:52Z: Monitoring evaluation "dd62b748"
    2021-12-14T14:28:52Z: Evaluation triggered by job "example"
==> 2021-12-14T14:28:53Z: Monitoring evaluation "dd62b748"
    2021-12-14T14:28:53Z: Evaluation within deployment: "cf328057"
    2021-12-14T14:28:53Z: Evaluation status changed: "pending" -> "complete"
==> 2021-12-14T14:28:53Z: Evaluation "dd62b748" finished with status "complete" but failed to place all allocations:
    2021-12-14T14:28:53Z: Task Group "web" (failed to place 1 allocation):
      * Class "foo": 1 nodes excluded by filter
      * Constraint "missing host network \"should_not_match\" for port \"www\"
...

This leads me to believe the CLI may be returning the client configuration here and not the actual fingerprint. So then I looked at the API and it looks like that's exactly it. Compare the following two blocks:

$ curl -s "localhost:4646/v1/node/30bd1930-302b-0691-62e4-fc1708be3123" | jq .HostNetworks
{
  "should_not_match": {
    "CIDR": "10.199.0.0/24",
    "Interface": "",
    "Name": "should_not_match",
    "ReservedPorts": ""
  },
  "specific_ip": {
    "CIDR": "10.0.2.15/24",
    "Interface": "",
    "Name": "specific_ip",
    "ReservedPorts": "22"
  }
}

$ curl -s "localhost:4646/v1/node/30bd1930-302b-0691-62e4-fc1708be3123" | jq '.NodeResources.NodeNetworks'
[
  {
    "Addresses": null,
    "Device": "",
    "MacAddress": "",
    "Mode": "bridge",
    "Speed": 0
  },
  {
    "Addresses": [
      {
        "Address": "10.0.2.15",
        "Alias": "specific_ip",
        "Family": "ipv4",
        "Gateway": "",
        "ReservedPorts": ""
      }
    ],
    "Device": "eth0",
    "MacAddress": "08:00:27:3d:a1:04",
    "Mode": "host",
    "Speed": 1000
  }
]

So there's a bug here in how we're presenting the results from the API in the CLI, but fortunately the data that's actually reaching the server reflects the real fingerprint.
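
I'd expect the fix to mean building the Host Networks table from the fingerprinted NodeResources.NodeNetworks addresses rather than from the HostNetworks config map. Something roughly like this, a Go sketch with hypothetical types and a hypothetical hostNetworkRows helper that only mirror the API output above, not the actual CLI code:

package main

import "fmt"

// Minimal types mirroring the fields in the second curl output above
// (illustrative only, not Nomad's actual structs).
type NodeNetworkAddress struct {
	Address       string
	Alias         string
	ReservedPorts string
}

type NodeNetworkResource struct {
	Device    string
	Mode      string
	Addresses []NodeNetworkAddress
}

// hostNetworkRows builds the Host Networks table rows from the fingerprinted
// NodeResources.NodeNetworks rather than the Node.HostNetworks config map,
// so only networks that actually matched an address on the node show up.
func hostNetworkRows(networks []NodeNetworkResource) []string {
	rows := []string{"Name|Address|Interface|ReservedPorts"}
	for _, nw := range networks {
		for _, addr := range nw.Addresses {
			rows = append(rows, fmt.Sprintf("%s|%s|%s|%s",
				addr.Alias, addr.Address, nw.Device, addr.ReservedPorts))
		}
	}
	return rows
}

func main() {
	// The fingerprint from the curl output above: the bridge entry has no
	// addresses, and only specific_ip matched on eth0.
	networks := []NodeNetworkResource{
		{Mode: "bridge"},
		{
			Device: "eth0",
			Mode:   "host",
			Addresses: []NodeNetworkAddress{
				{Address: "10.0.2.15", Alias: "specific_ip", ReservedPorts: ""},
			},
		},
	}
	for _, row := range hostNetworkRows(networks) {
		fmt.Println(row)
	}
}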

But there's also a bug in that NodeNetworks.Addresses doesn't include the value from ReservedPorts. That looks like a case of #9492 (and is possibly the root cause of #9506).

I'm going to keep this issue open to cover fixing the CLI reporting, and we'll deal with the reserved ports problem in that other issue. I've renamed this issue's title for clarity and I'll mark it for roadmapping.

tgross changed the title from "host_network stanza registers all provided networks against a host, even if the CIDRs don't match" to "CLI displays host_network configuration and not fingerprint" Dec 14, 2021
tgross removed their assignment Dec 14, 2021

nvx commented Dec 17, 2021

Ah nice catch! A CLI bug is a lot less concerning than the fingerprinting being wrong.

Having a look at my nodes, the .NodeResources.NodeNetworks JSON path in the node info shows the expected info, which confirms your findings.

Projects: Needs Roadmapping
Development: No branches or pull requests
2 participants