
scaling managed nodegroups to 0 #3793

Closed
aclevername opened this issue Jun 1, 2021 · 12 comments

@aclevername
Contributor

aclevername commented Jun 1, 2021

Does eksctl let you create and scale managed nodegroups to 0? Let's find out, and add support if it doesn't.

@aclevername aclevername added the kind/feature New feature or request label Jun 1, 2021
@rverma-dev

I think EKS doesn't let you create a managed nodegroup with a minimum size of 0. However, I read there is a workaround: manually updating the ASG size to zero. aws/containers-roadmap#724

@Callisto13 Callisto13 added good first issue Good for newcomers needs-investigation and removed kind/feature New feature or request labels Jun 9, 2021

@nikimanoledaki
Contributor

nikimanoledaki commented Jul 6, 2021

Quick update on this:
According to the comment linked to by @aclevername, creating a managed nodegroup with the following fields should work:

managedNodeGroups:
  - name: ng-scale
    desiredCapacity: 0 # setting desiredCapacity and minSize to zero
    minSize: 0
    maxSize: 1 # EKS API requires maxSize > 0

The eksctl create command does not return any error when this config is passed to it, but it doesn't create the nodegroup either 🤔

$ eksctl create nodegroup --config-file eks.yaml
2021-07-06 13:41:13 [ℹ]  eksctl version 0.55.0
2021-07-06 13:41:13 [ℹ]  using region us-west-2
2021-07-06 13:41:14 [ℹ]  will use version 1.19 for new nodegroup(s) based on control plane version
2021-07-06 13:41:19 [ℹ]  nodegroup "ng-scale" will use "" [AmazonLinux2/1.19]
2021-07-06 13:41:21 [!]  stack's status of nodegroup named eksctl-nm-test-nodegroup-ng-1 is DELETE_FAILED
2021-07-06 13:41:21 [!]  stack's status of nodegroup named eksctl-nm-test-nodegroup-ng-2 is DELETE_FAILED
2021-07-06 13:41:21 [ℹ]  1 existing nodegroup(s) (ng-scale) will be excluded
2021-07-06 13:41:22 [ℹ]  2 sequential tasks: { fix cluster compatibility, no tasks }
2021-07-06 13:41:22 [ℹ]  checking cluster stack for missing resources
2021-07-06 13:41:23 [ℹ]  cluster stack has all required resources
2021-07-06 13:41:23 [ℹ]  no tasks
2021-07-06 13:41:23 [✔]  created 0 nodegroup(s) in cluster "nm-test"
2021-07-06 13:41:23 [✔]  created 0 managed nodegroup(s) in cluster "nm-test"
2021-07-06 13:41:24 [!]  stack's status of nodegroup named eksctl-nm-test-nodegroup-ng-1 is DELETE_FAILED
2021-07-06 13:41:24 [!]  stack's status of nodegroup named eksctl-nm-test-nodegroup-ng-2 is DELETE_FAILED
2021-07-06 13:41:24 [ℹ]  checking security group configuration for all nodegroups
2021-07-06 13:41:24 [ℹ]  all nodegroups have up-to-date configuration

Notice the created 0 nodegroup(s) in cluster "nm-test" message.

eksctl get lists zero nodegroups.

@aclevername
Contributor Author

> (quoting @nikimanoledaki's update above)

Interesting. I wonder if it's an unsupported value in CloudFormation, but valid in the EKS API?

Looking at the docs, both say maxSize must be >= 1:

https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-properties-eks-nodegroup-scalingconfig.html
https://docs.aws.amazon.com/eks/latest/APIReference/API_NodegroupScalingConfig.html

@nikimanoledaki
Contributor

nikimanoledaki commented Jul 6, 2021

Definitely, there was an error when I didn't set maxSize, so I set it to 1 in the config!

What we want to do, though, is set minSize and desiredCapacity to 0 at creation, but eksctl doesn't currently let us do that (no error; it just doesn't create the nodegroup at all).

I'll jump in the code of create nodegroup for more info :shipit:

@nikimanoledaki
Contributor

nikimanoledaki commented Jul 6, 2021

Sorry, I must have done something wrong the last time because I tried again and it worked!

The config file used was the following:

apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig

metadata:
  name: testnm
  region: us-west-2

managedNodeGroups:
  - name: mng-ng
    desiredCapacity: 0
    minSize: 0
    maxSize: 1

The nodegroup was created successfully:

➜  ~ eksctl create nodegroup --config-file eks.yaml
2021-07-06 14:25:11 [ℹ]  eksctl version 0.55.0
2021-07-06 14:25:11 [ℹ]  using region us-west-2
2021-07-06 14:25:12 [ℹ]  will use version 1.19 for new nodegroup(s) based on control plane version
2021-07-06 14:25:17 [ℹ]  nodegroup "ng-scale-2" will use "" [AmazonLinux2/1.19]
2021-07-06 14:25:20 [!]  stack's status of nodegroup named eksctl-nm-test-nodegroup-ng-1 is DELETE_FAILED
2021-07-06 14:25:20 [!]  stack's status of nodegroup named eksctl-nm-test-nodegroup-ng-2 is DELETE_FAILED
2021-07-06 14:25:20 [ℹ]  1 existing nodegroup(s) (ng-scale) will be excluded
2021-07-06 14:25:20 [ℹ]  1 nodegroup (ng-scale-2) was included (based on the include/exclude rules)
2021-07-06 14:25:20 [ℹ]  will create a CloudFormation stack for each of 1 managed nodegroups in cluster "nm-test"
2021-07-06 14:25:20 [ℹ]  2 sequential tasks: { fix cluster compatibility, 1 task: { 1 task: { create managed nodegroup "ng-scale-2" } } }
2021-07-06 14:25:20 [ℹ]  checking cluster stack for missing resources
2021-07-06 14:25:22 [ℹ]  cluster stack has all required resources
2021-07-06 14:25:22 [ℹ]  building managed nodegroup stack "eksctl-nm-test-nodegroup-ng-scale-2"
2021-07-06 14:25:22 [ℹ]  deploying stack "eksctl-nm-test-nodegroup-ng-scale-2"
2021-07-06 14:25:22 [ℹ]  waiting for CloudFormation stack "eksctl-nm-test-nodegroup-ng-scale-2"
2021-07-06 14:25:38 [ℹ]  waiting for CloudFormation stack "eksctl-nm-test-nodegroup-ng-scale-2"
2021-07-06 14:25:56 [ℹ]  waiting for CloudFormation stack "eksctl-nm-test-nodegroup-ng-scale-2"
2021-07-06 14:26:16 [ℹ]  waiting for CloudFormation stack "eksctl-nm-test-nodegroup-ng-scale-2"
2021-07-06 14:26:34 [ℹ]  waiting for CloudFormation stack "eksctl-nm-test-nodegroup-ng-scale-2"
2021-07-06 14:26:55 [ℹ]  waiting for CloudFormation stack "eksctl-nm-test-nodegroup-ng-scale-2"
2021-07-06 14:27:15 [ℹ]  waiting for CloudFormation stack "eksctl-nm-test-nodegroup-ng-scale-2"
2021-07-06 14:27:34 [ℹ]  waiting for CloudFormation stack "eksctl-nm-test-nodegroup-ng-scale-2"
2021-07-06 14:27:35 [ℹ]  no tasks
2021-07-06 14:27:35 [✔]  created 0 nodegroup(s) in cluster "nm-test"
2021-07-06 14:27:35 [✔]  created 1 managed nodegroup(s) in cluster "nm-test"
2021-07-06 14:27:36 [!]  stack's status of nodegroup named eksctl-nm-test-nodegroup-ng-1 is DELETE_FAILED
2021-07-06 14:27:36 [!]  stack's status of nodegroup named eksctl-nm-test-nodegroup-ng-2 is DELETE_FAILED
2021-07-06 14:27:37 [ℹ]  checking security group configuration for all nodegroups
2021-07-06 14:27:37 [ℹ]  all nodegroups have up-to-date configuration

Getting this nodegroup confirms that it was created with MinSize: 0:

➜  ~ eksctl get ng --cluster nm-test -o yaml
2021-07-06 14:28:22 [!]  stack's status of nodegroup named eksctl-nm-test-nodegroup-ng-1 is DELETE_FAILED
2021-07-06 14:28:22 [!]  stack's status of nodegroup named eksctl-nm-test-nodegroup-ng-2 is DELETE_FAILED
- AutoScalingGroupName: eks-dabd3dff-bab5-b6d8-a76e-e1b0640601a1
  Cluster: nm-test
  CreationTime: "2021-07-06T13:26:19.578Z"
  DesiredCapacity: 0
  ImageID: AL2_x86_64
  InstanceType: m5.large
  MaxSize: 1
  MinSize: 0
  Name: ng-scale-2
  NodeInstanceRoleARN: arn:aws:iam::083751696308:role/eksctl-nm-test-nodegroup-ng-scale-NodeInstanceRole-1PQJHPVWIFXB0
  StackName: eksctl-nm-test-nodegroup-ng-scale-2
  Status: ACTIVE
  Version: "1.19"

Closing this issue since it is possible to create a managed nodegroup with minSize: 0 and desiredCapacity: 0 at creation with eksctl! 🎉

@sammort

sammort commented Jul 6, 2021

@nikimanoledaki as per the discussion in aws/containers-roadmap, the Auto Scaling group must be tagged with the labels and taints that the managed nodegroup is tagged with, otherwise cluster-autoscaler is unable to scale up from 0.

So while it is currently possible to create a managed nodegroup with minSize: 0 and desiredCapacity: 0 with eksctl, I don't believe the nodegroup would ever scale up without the tags being applied to the ASG.

I'm using this script, posted later in the same issue linked to by @aclevername, to achieve what is being requested:

  1. Create the nodegroup with min/desired size 0 with eksctl
  2. Add the k8s.io/cluster-autoscaler/node-template/label/<LABEL> tag to the nodegroup manually
  3. Use the linked script to copy that tag to the ASG

It would be great if eksctl handled 2 and 3 for us.
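For reference, steps 2 and 3 can be sketched with the AWS CLI. This is only a rough sketch, not eksctl behaviour: the cluster, nodegroup, ASG, and label names below are placeholders, and the commented-out commands assume a live account where `aws eks describe-nodegroup` and `aws autoscaling create-or-update-tags` can run.

```shell
#!/usr/bin/env sh
# Sketch: copy a cluster-autoscaler label tag onto the ASG behind a managed
# nodegroup so the autoscaler can scale it up from 0. All names below are
# placeholders, not values from this issue.

# Build the --tags argument for `aws autoscaling create-or-update-tags`.
mk_asg_tag() {
  asg="$1"; label_key="$2"; label_val="$3"
  echo "ResourceId=${asg},ResourceType=auto-scaling-group,Key=k8s.io/cluster-autoscaler/node-template/label/${label_key},Value=${label_val},PropagateAtLaunch=true"
}

# Against a live account you would first look up the nodegroup's ASG, then tag it:
#   asg=$(aws eks describe-nodegroup --cluster-name my-cluster \
#         --nodegroup-name my-ng \
#         --query 'nodegroup.resources.autoScalingGroups[0].name' --output text)
#   aws autoscaling create-or-update-tags --tags "$(mk_asg_tag "$asg" my-cool-label pizza)"
mk_asg_tag eks-example-asg my-cool-label pizza
```

The same pattern works for taint tags by swapping `label` for `taint` in the key prefix.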

@nikimanoledaki
Contributor

Thank you for pointing that out @sammort!

We should definitely track this in case we need to make changes to support this better. Reopening this issue and will investigate a bit more.

@nikimanoledaki nikimanoledaki reopened this Jul 6, 2021
@nikimanoledaki nikimanoledaki added area/managed-nodegroup EKS Managed Nodegroups and removed good first issue Good for newcomers labels Jul 6, 2021
@nikimanoledaki
Contributor

Hi again @sammort, I believe the following should be helpful:

Does this solve 2 and/or 3?

@sammort

sammort commented Jul 7, 2021

@nikimanoledaki that's great, and it certainly seems that it would create a managed nodegroup that would scale from 0; I'll test it out.

Would it be possible for the ASG tags to be created automatically if the labels and taints have been defined? It seems that we are repeating ourselves in the config by having to add the following section, as per the linked doc:

tags:
    k8s.io/cluster-autoscaler/node-template/label/my-cool-label: pizza
    k8s.io/cluster-autoscaler/node-template/taint/feaster: "true:NoSchedule"
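The automation being asked for is essentially a key rewrite from each label/taint name to its cluster-autoscaler tag key. A minimal sketch of that mapping (the label and taint names are the illustrative ones from the config snippet above, not a real eksctl implementation):

```shell
#!/usr/bin/env sh
# Sketch of deriving the cluster-autoscaler ASG tag keys from a nodegroup's
# labels and taints, so the tags block would not need to be written by hand.

label_to_tag_key() {
  echo "k8s.io/cluster-autoscaler/node-template/label/$1"
}

taint_to_tag_key() {
  echo "k8s.io/cluster-autoscaler/node-template/taint/$1"
}

label_to_tag_key my-cool-label
taint_to_tag_key feaster
```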

@nikimanoledaki
Contributor

Good idea @sammort, that sounds reasonable! I just created a feature request that addresses this so that someone can pick it up soon (it's also in the team's backlog). I'll close this issue for now.

Thanks so much for your contribution! 😊 🙌

@sammort

sammort commented Jul 9, 2021

Thanks for creating the feature request and thanks to you and the team for the great work 👍🏼
