Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cluster Autoscaler Addon: Support scaling up from 0 #1481

Closed
cPu1 opened this issue Oct 24, 2019 · 15 comments
Closed

Cluster Autoscaler Addon: Support scaling up from 0 #1481

cPu1 opened this issue Oct 24, 2019 · 15 comments

Comments

@cPu1
Copy link
Contributor

cPu1 commented Oct 24, 2019

When performing a scale out, Cluster Autoscaler uses a live node in the ASG to build a template to launch new nodes. These nodes can have Kubernetes labels and taints defined.
When there are no live nodes, however, Cluster Autoscaler has to rely on the tags defined on the ASG as node-template keys to build a template for launching the nodes (see https://github.com/kubernetes/autoscaler/tree/master/cluster-autoscaler/cloudprovider/aws#scaling-a-node-group-to-0). These node-template keys must contain the Kubernetes labels and taints added to the corresponding node group in order for Cluster Autoscaler to add them to new nodes.

Problem:
eksctl only adds the tags required by Cluster Autoscaler for auto-discovery, but not the node-template keys.

@cPu1 cPu1 self-assigned this Oct 24, 2019
@scottyhq
Copy link

@cPu1 - see #1066

@cPu1
Copy link
Contributor Author

cPu1 commented Oct 26, 2019

@scottyhq thanks, I did see that before opening this issue. That issue is closed and suggests manually adding the tags to node groups. This issue is specifically for having eksctl add the node-template tags when the Cluster Autoscaler addon is enabled (iam.withAddonPolicies.autoScaler: true)

@whereisaaron
Copy link

@scottyhq given eksctl already knows the labels and taints for a node group, then it can automatically generate the matching tags for the ASG. This saves labor over manual tagging and reduces the opportunity for mistakes and drift.

@cPu1 I would suggest not linking this to the iam.withAddonPolicies.autoScaler: true option, since that is the old instance role approach, and people using the auto scaler with IAM Service Accounts will not be enabling that IAM option.

I'd suggest this is a node group level option, since the auto scaler(s) may be in use on some node groups and not other, maybe nodeGroup.tagForAutoScaler: true?

IAM Service Account Refs:

@cPu1
Copy link
Contributor Author

cPu1 commented Oct 29, 2019

@cPu1 I would suggest not linking this to the iam.withAddonPolicies.autoScaler: true option, since that is the old instance role approach

IAM Roles for Service Accounts (IRSA) is a relatively new feature, so we can't expect all users to have started using it. Moreover, it's not supported on EKS versions below 1.13 and not all tools accessing AWS services have been updated to use the newer AWS SDK that added support for it (by including it in the credential chain provider).

and people using the auto scaler with IAM Service Accounts will not be enabling that IAM option.

That's right, this feature is not for users of IRSA.

I'd suggest this is a node group level option, since the auto scaler(s) may be in use on some node groups and not other, maybe nodeGroup.tagForAutoScaler: true?

This is a node group level option and isn't a new addition to the schema. nodeGroups[*].iam.withAddonPolicies.autoScaler already exists, this issue is for also adding the node-template tags when the autoScaler option is enabled.

@whereisaaron
Copy link

whereisaaron commented Oct 30, 2019

Eh, there are two separate things:

  1. Adding auto-scaler policy to instance roles
  2. Adding label/taint tags to ASG to enable scale from zero

Making 1 and 2 separate options does not require people to use IRSA!
We can simple enable both options.

  • People not yet using IRSA will want both 1 and 2.
  • People using IRSA still need 2, but do not need or want 1, since that is the requirement IRSA aims to avoid.

Linking 1 and 2 to the same option makes 2 unavailable to IRSA users. Separate options supports both clusters that don’t use IRSA and those that do.

Why are you opposed to two options so both use cases can be supported @cPu1?

@cPu1
Copy link
Contributor Author

cPu1 commented Nov 2, 2019

Ah, I misunderstood your point.

I agree that eksctl should be able to support the CA addon for users of IRSA, but, IMO, not by requiring non-IRSA users to supply an extra option (nodeGroup.tagForAutoScaler) to add the node-template tags. This is an internal detail of CA that users of eksctl need not be aware of, and setting a single option (iam.withAddonPolicies.autoScaler: true in the case of non-IRSA users) should do everything required to have a fully-functional CA. (I should note that, it might have been more appropriate to have the addons configuration outside of the iam field as eksctl adds not only the required IAM policies but also tags).

To support IRSA users, a similar level of support as iam.withAddonPolicies needs to be added by creating IAM roles that have all required policies attached, essentially automating iam.serviceAccounts.

@dougbyrne
Copy link

dougbyrne commented Apr 20, 2020

I don't think it makes sense to tie this tagging to an IAM setting, regardless of your use of IRSA or not. Only the cluster autoscaler itself needs the elevated permissions. That might be granted with IRSA, or it might be granted to a specific nodegroup, or some other way (kiam). The scaling targets do not need any additional IAM permissions, and the best practice would be to not grant this permission where it is not needed.

Edit - Also, I run my CA pod on a nodegroup that doesn't scale. I have min == max so effectively it doesn't scale, but I could see where someone might want to grant the elevated permissions to a particular node group, but not want to have CA enabled on that group.

@TBBle
Copy link
Contributor

TBBle commented May 29, 2020

Since I just raised #2263 to separately request something like nodeGroup.tagForAutoScaler to apply the CA discovery tags to nodes that don't have the CA Instance Role addon, perhaps limit this issue to "whenever the tags are added for the cluster auto-scaler, also add the node-template tags for labels and taints on this NodeGroup".

Although that was the same request as #1066, and it was closed with "just do it manually in the config"...

@TBBle
Copy link
Contributor

TBBle commented Jun 16, 2020

Just to cross the streams, Managed Nodegroups may get the same behaviour, of copying all the taints and labels from a node onto the ASG. In the Managed Nodegroups case, all nodegroups are automatically tagged for the Cluster Autoscaler already, so effectively tagForAutoScaler is permanently true for those.

@martina-if martina-if added priority/backlog Not staffed at the moment. Help wanted. kind/improvement labels Sep 15, 2020
@github-actions
Copy link
Contributor

This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 5 days.

@github-actions github-actions bot added the stale label Jan 19, 2021
@github-actions
Copy link
Contributor

This issue was closed because it has been stalled for 5 days with no activity.

@github-actions
Copy link
Contributor

This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 5 days.

@github-actions
Copy link
Contributor

This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 5 days.

@github-actions
Copy link
Contributor

github-actions bot commented Apr 1, 2021

This issue was closed because it has been stalled for 5 days with no activity.

@github-actions github-actions bot closed this as completed Apr 1, 2021
@TBBle
Copy link
Contributor

TBBle commented Apr 1, 2021

Aw, dang. I forgot to comment earlier. >_<

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

8 participants