Make Maximum Pods ENIConfig aware #331

Closed
taylorb-syd opened this issue Feb 23, 2019 · 13 comments
Assignees
Labels
2.x CNI plugin Features and issues to address in 2.x CNI plugin enhancement

Comments

@taylorb-syd
Contributor

Summary

If you specify an ENIConfig that differs from the instance's primary ENI configuration on startup, the plugin correctly does not allocate IP addresses on the primary ENI. However, the maximum number of pods assignable to the worker node is still derived on the assumption that all secondary IP addresses can be consumed. This results in pods getting stuck in ContainerCreating status.

We should dynamically adjust the maximum number of pods when using an ENIConfig to reflect the maximum number of IPs that can actually be consumed under that ENIConfig.

Reproduction Steps

  1. Configure instances to use an ENIConfig in a different subnet from the primary ENI.
  2. Start a single t2.medium or similar instance (it must have a healthy number of secondary IP addresses; a t2.medium has 3 ENIs x 6 IPs each).
  3. Start a deployment of a basic pod (e.g. nginx) with a large number of pods (e.g. 200)

Observe that while most pods stay in Pending, one ENI's worth of pods will be stuck in ContainerCreating. In the case of a t2.medium, this is 5:

▶ kubectl get pod -o wide | grep nginx-deployment | grep ContainerCreating | wc -l
       5
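The count of 5 is consistent with simple arithmetic on the t2.medium limits quoted above (3 ENIs, 6 IPs each). The sketch below assumes the node advertises the commonly documented EKS default of `ENIs * (IPs per ENI - 1) + 2`; that default formula is an assumption here, since the issue text does not state it, and the function names are hypothetical.

```python
def advertised_max_pods(enis, ips_per_eni):
    # Assumed default: every ENI contributes its secondary IPs,
    # plus 2 for the hostNetwork pods aws-node and kube-proxy.
    return enis * (ips_per_eni - 1) + 2


def usable_max_pods(enis, ips_per_eni):
    # With an ENIConfig in a different subnet, the primary ENI's
    # secondary IPs are not handed out to pods.
    return (enis - 1) * (ips_per_eni - 1) + 2


enis, ips_per_eni = 3, 6  # t2.medium, as quoted in the reproduction steps
advertised = advertised_max_pods(enis, ips_per_eni)
usable = usable_max_pods(enis, ips_per_eni)

# Pods the scheduler places on the node but the CNI cannot give an IP to.
stuck = advertised - usable
print(advertised, usable, stuck)  # 17 12 5
```

The gap equals one ENI's worth of secondary IPs (6 - 1 = 5), which matches the number of pods stuck in ContainerCreating.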

Looking at these pods closely we can see they are stuck looping over the following state:

  Warning  FailedCreatePodSandBox  74s (x2710 over 55m)    kubelet, ip-10-100-0-228.ap-southeast-2.compute.internal  (combined from similar events): Failed create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "54f42a6a4ce9184111402ed1bbe2b97390abb1f8f70f1b566afca1f688d26727" network for pod "nginx-deployment-67594d6bf6-6zcs9": NetworkPlugin cni failed to set up pod "nginx-deployment-67594d6bf6-6zcs9_default" network: add cmd: failed to assign an IP address to container
@taylorb-syd
Contributor Author

Going to work on a PR to see if I can modify this behaviour next week. Will link here if I come up with a solution.

@mogren
Contributor

mogren commented Feb 26, 2019

Thanks @taylorb-syd, I'm trying to catch up a bit on the issues again.

@taylorb-syd
Contributor Author

Okay, digging a little deeper, the maximum pod setting is actually set when kubelet starts. While making the CNI plugin aware of the max IP addresses is desirable, it is lower priority than correctly setting the maximum pod setting. I am therefore focusing my attention on modifying the bootstrap script. I will link issues and PRs as/when I create them.

@cnelson

cnelson commented Feb 28, 2019

We should dynamically adjust the maximum number of pods when using an ENIConfig to reflect the maximum number of IPs that can be consuming that align with the ENI Config.

Do you plan to handle pods with hostNetwork: true in this effort? It's not critical, but I hate to see nodes underutilized because a bunch of hostNetwork daemonset containers are counted against max pods even though they don't use an additional IP.

@taylorb-syd
Contributor Author

@cnelson Unfortunately there is no way to dynamically set the number of max pods in kubelet by the looks of it. It can only be set on startup. Therefore it's not possible to say "don't count this pod towards the max pod limits".

However, that being said, if you know ahead of time how many hostNetwork: true daemon sets you are going to use on a given instance, we might as well add an option to adjust the max pods. I'll add this into my script modifications.

@cnelson

cnelson commented Mar 2, 2019

@taylorb-syd for newer versions of EKS on 1.11+ I think this is doable:

https://kubernetes.io/docs/tasks/administer-cluster/reconfigure-kubelet/
https://github.com/kubernetes/kubernetes/blob/release-1.11/pkg/kubelet/apis/kubeletconfig/v1beta1/types.go#L409
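As a rough illustration of what such a reconfiguration would carry, here is a minimal sketch that builds a kubelet configuration document with only the pod limit set. It assumes the relevant field is `maxPods` in the `kubelet.config.k8s.io/v1beta1` `KubeletConfiguration` kind; the helper name and the value 12 are hypothetical, and how the document is delivered to the kubelet is left out.

```python
import json


def kubelet_config_with_max_pods(max_pods):
    """Build a minimal KubeletConfiguration dict that only sets maxPods."""
    if max_pods < 1:
        raise ValueError("max_pods must be a positive integer")
    return {
        "apiVersion": "kubelet.config.k8s.io/v1beta1",
        "kind": "KubeletConfiguration",
        "maxPods": max_pods,
    }


# 12 is an illustrative value, not a recommendation.
config = kubelet_config_with_max_pods(12)
print(json.dumps(config, indent=2))
```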

@frimik

frimik commented Apr 4, 2019

In the basic solution, it's simple. If we properly have access to a true export of this [1] then we can do:

maxPods = (numInterfaces - 1) * maxIpv4PerInterface + daemonSetPrediction

... in the bootstrap. Where daemonSetPrediction is that operator-provided educated guess.
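A minimal sketch of that calculation, written exactly as the formula above states it. The function name is hypothetical, and the example values are the t2.medium limits quoted earlier in the thread (3 interfaces, 6 IPs each) with an assumed operator guess of 2 hostNetwork daemonset pods.

```python
def max_pods_from_prediction(num_interfaces, max_ipv4_per_interface, daemonset_prediction):
    """maxPods = (numInterfaces - 1) * maxIpv4PerInterface + daemonSetPrediction."""
    if num_interfaces < 1 or max_ipv4_per_interface < 1:
        raise ValueError("interface and IP counts must be positive")
    if daemonset_prediction < 0:
        raise ValueError("daemonset_prediction cannot be negative")
    return (num_interfaces - 1) * max_ipv4_per_interface + daemonset_prediction


# t2.medium limits from the issue; the prediction of 2 is an assumption.
result = max_pods_from_prediction(3, 6, 2)
print(result)  # 14
```

Note that, as written, this formula counts every IP on each remaining interface and does not reserve one address per interface.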

I'd be happy with that as an initial break-fix, but it does not seem very good in the long run.

@taylorb-syd curious if you did anything clever yet?

Footnotes

  1. https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/using-eni.html#AvailableIpPerENI

@taylorb-syd
Contributor Author

@frimik Happy for you to put attention into this; I have been swamped for the last month and couldn't put any energy/effort into it.

@kushwiz

kushwiz commented Apr 23, 2019

In case of custom CNI Networking,

maxPods = (numInterfaces - 1) * (maxIpv4PerInterface - 1) + 2 
  • Subtracting 1 from numInterfaces - One interface goes for the node's network.
  • Subtracting 1 from maxIpv4PerInterface - One IP from the custom CNI subnet goes to the worker node.
  • Adding 2 as a constant to account for aws-node and kube-proxy as both use hostNetwork.
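The formula and its three adjustments can be sketched offline, without scraping anything. The function name and the `INSTANCE_LIMITS` table are hypothetical; the t2.medium limits come from this thread, while the m5.large entry is an illustrative value that should be checked against the EC2 documentation before use.

```python
import json

# (max ENIs, IPv4 addresses per ENI).
# t2.medium is quoted in this issue; m5.large is illustrative only.
INSTANCE_LIMITS = {
    "t2.medium": (3, 6),
    "m5.large": (3, 10),
}

# aws-node and kube-proxy both use hostNetwork, so they need no pod IP.
HOST_NETWORK_PODS = 2


def custom_networking_max_pods(enis, ips_per_eni, host_network_pods=HOST_NETWORK_PODS):
    # enis - 1: one interface is reserved for the node's own network.
    # ips_per_eni - 1: one IP on each remaining ENI goes to the worker node.
    return (enis - 1) * (ips_per_eni - 1) + host_network_pods


max_pods = {
    name: custom_networking_max_pods(enis, ips)
    for name, (enis, ips) in INSTANCE_LIMITS.items()
}
print(json.dumps(max_pods))  # {"t2.medium": 12, "m5.large": 20}
```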

@kushwiz

kushwiz commented Apr 23, 2019

I use this script to generate max-pods.json, which I paste into my worker node's user-data:

import json

import requests
from bs4 import BeautifulSoup

# The table id is auto-generated by the docs site and may change between page builds.
TABLE_ID = "w299aac23c19c19b5"

response = requests.get("https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/using-eni.html#AvailableIpPerENI")
response.raise_for_status()

parsed_html = BeautifulSoup(response.text, features="html.parser")

table = parsed_html.find("table", attrs={"id": TABLE_ID})
if table is None:
    raise SystemExit(f"table {TABLE_ID} not found; the page layout may have changed")

instance_max_pods = {}

for row in table.find_all("tr"):
    cells = row.find_all("td")
    if len(cells) < 3:
        # Header rows use <th>, so they contain no <td> cells.
        continue
    enis = int(cells[1].text)
    ips_per_eni = int(cells[2].text)
    # (# of ENI - 1) * (Max IPs per ENI - 1) + 2
    # IPs per ENI - 1: one IP address is allocated to the host ENI itself.
    # + 2: aws-node and kube-proxy are hostNetwork pods.
    instance_max_pods[cells[0].text.strip()] = (enis - 1) * (ips_per_eni - 1) + 2

print(json.dumps(instance_max_pods))

@mogren
Contributor

mogren commented Mar 11, 2020

Related issues: #527

awslabs/amazon-eks-ami#375

@infa-ddeore

w299aac23c19c19b5

The table ID has changed to w295aac21c13c15b5, so the script needs to handle this as well.
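One way to avoid depending on the auto-generated id is to pick the table by its header text instead. The sketch below uses only the standard library; the class and function names are hypothetical, and the header wording in the sample HTML is an assumption about the page, not a copy of it.

```python
from html.parser import HTMLParser


class TableCollector(HTMLParser):
    """Collect every <table> in a page as a list of rows of cell strings."""

    def __init__(self):
        super().__init__()
        self.tables = []   # one list of rows per table seen so far
        self._row = None   # cells of the row being read, if any
        self._cell = None  # text fragments of the cell being read, if any

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self.tables.append([])
        elif tag == "tr" and self.tables:
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None and self._row is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.tables[-1].append(self._row)
            self._row = None
            self._cell = None


def find_table_by_header(html, required_words):
    """Return the first table whose first row mentions every required word."""
    parser = TableCollector()
    parser.feed(html)
    for table in parser.tables:
        if not table:
            continue
        header = " ".join(table[0]).lower()
        if all(word.lower() in header for word in required_words):
            return table
    return None


# Hypothetical sample page: two tables, only the second has the wanted header.
sample = """
<table id="w1"><tr><th>Name</th><th>Notes</th></tr><tr><td>x</td><td>y</td></tr></table>
<table id="w2">
<tr><th>Instance type</th><th>Maximum network interfaces</th><th>IPv4 addresses per interface</th></tr>
<tr><td>t2.medium</td><td>3</td><td>6</td></tr>
</table>
"""
table = find_table_by_header(sample, ["instance", "interfaces", "addresses"])
limits = {row[0]: (int(row[1]), int(row[2])) for row in table[1:]}
print(limits)  # {'t2.medium': (3, 6)}
```

Nested tables are not handled; for a real page, the required header words would need to be checked against the current document.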

@jayanthvn
Contributor

A new helper script has been added for CNI v1.9.0 onwards - https://github.com/awslabs/amazon-eks-ami/blob/master/files/max-pods-calculator.sh - to help compute the max pods. It takes into account the CNI version, custom networking, max ENIs, and whether prefix delegation is configured. With managed node groups (MNG), this script is also triggered on AMI startup to configure kubelet maxPods.
