(eks): Support isolated VPCs #12171
Comments
Allow all our Lambda handlers to be provisioned inside the cluster VPC. The `KubectlProvider` handlers were already placed inside the VPC whenever possible; the missing piece was the `ClusterHandler`. This is now possible via the `placeClusterHandlerInVpc` property (name suggestions are welcome). The default value remains `false` because, if the VPC happens to be isolated (i.e., no outbound internet access), this would break the deployment (see #12171). Closes #9509 ---- *By submitting this pull request, I confirm that my contribution is made under the terms of the Apache-2.0 license*
In my scenario, my "isolated" subnets aren't really isolated from the internet as I use a TGW to route traffic via an egress network. If you try for private and natGateways=0, CDK insists you call them isolated. If you call them isolated, you can't put EKS on them. Is there a workaround to this, or could there be some sort of "I know what I'm doing" override added? |
If they are not actually isolated, you should be able to use them. Are you getting some kind of error? This issue refers only to truly isolated subnets that have no internet access. |
Yes:
I think I just found the correct way to do this, which is leave As I'm thinking about this more... I think my complaint is more properly lodged with the natGateways=0 requires ISOLATED logic... I fear that calling them isolated will lead to someone making a poor assumption about their (lack of) internet access down the road. |
I agree about that. Might be worth opening a separate issue for the
It doesn't really do that. The only logic pertaining to subnets is that we try and select the private subnets from the configured VPC, but we actually treat Would help if you could attach the full stack trace and/or code snippet. |
Sure thing
cdk_eks_stack.py Line 48:
self.cluster = eks.Cluster(
    self,
    "Cluster",
    cluster_name=cluster_name,
    vpc=vpc,
    version=eks.KubernetesVersion.V1_18,
    default_capacity=0,
    endpoint_access=eks.EndpointAccess.PRIVATE,
    masters_role=adminRole,
    secrets_encryption_key=secrets_key,
    security_group=security_group,
    # vpc_subnets=vpc.isolated_subnets,
) |
Ok I understand now. Yeah your solution is appropriate, without the |
Excellent. Thank you for taking a look. |
…ll cluster handler functions (#17200)

## Summary

This PR is intended for CDK EKS users who require all traffic to be routed through a proxy. Currently, if a user does not allow internet connections to the VPC without going through a proxy, then deploying an EKS cluster will result in a timeout error:

```sh
Received response status [FAILED] from custom resource. Message returned: Error: 2021-10-20T14:20:47.028Z d86e3ef4-45ce-4130-988f-c4663f7f8c80 Task timed out after 60.06 seconds
```

Fixes: #12469, SIM D29159517

Related to but does not resolve: `https://github.com/aws/aws-cdk/issues/12171`

## ⚙️ Changes

_Expand each list item for additional details._

<details>
<summary><strong>Corrected "Cluster Handler" docs to clarify that 2 lambdas are created (<code>onEventHandler</code>, <code>isCompleteHandler</code>)</strong></summary>
<br />

Our docs [currently describe the "Cluster Handler" as one Lambda function that interacts with the EKS API](https://docs.aws.amazon.com/cdk/api/latest/docs/aws-eks-readme.html#cluster-handler). However, this is not accurate. The "Cluster Handler" actually creates [two Lambdas](https://github.com/aws/aws-cdk/blob/0cabb9f2d2f50c03337cd6f35bf47fc54ada3a21/packages/%40aws-cdk/aws-eks/lib/cluster-resource-provider.ts#L69-L96) for the Custom Resource, `onEventHandler` and `isCompleteHandler`, and both interact with the AWS API.

</details>

<details>
<summary><strong>Passes the <code>clusterHandlerEnvironment</code> to both Cluster Handler Lambdas</strong></summary>
<br />

The `clusterHandlerEnvironment` is the [recommended method](https://docs.aws.amazon.com/cdk/api/latest/docs/aws-eks-readme.html#cluster-handler) of passing a proxy URL (e.g. `http_proxy: 'http://my-proxy.com:3128'`) to the Cluster Handler. Currently the `clusterHandlerEnvironment` is only passed to the Cluster Handler's `onEventHandler` Lambda.

[The `onEventHandler` was believed to be the only Cluster Handler Lambda that interacts with the AWS EKS API](#12469 (comment)); however, this is not entirely true. Both the `onEventHandler` and `isCompleteHandler` call the AWS EKS API. Following the execution process of `isCompleteHandler` when creating an EKS cluster:

1. [`index.isComplete()` (this is the Lambda handler)](https://github.com/aws/aws-cdk/blob/0cabb9f2d2f50c03337cd6f35bf47fc54ada3a21/packages/%40aws-cdk/aws-eks/lib/cluster-resource-handler/index.ts#L48)
2. [`common.isComplete()`](https://github.com/aws/aws-cdk/blob/0cabb9f2d2f50c03337cd6f35bf47fc54ada3a21/packages/%40aws-cdk/aws-eks/lib/cluster-resource-handler/common.ts#L59)
3. [`cluster.isCreateComplete()`](https://github.com/aws/aws-cdk/blob/0cabb9f2d2f50c03337cd6f35bf47fc54ada3a21/packages/%40aws-cdk/aws-eks/lib/cluster-resource-handler/cluster.ts#L56)
4. [`cluster.isActive()`](https://github.com/aws/aws-cdk/blob/0cabb9f2d2f50c03337cd6f35bf47fc54ada3a21/packages/%40aws-cdk/aws-eks/lib/cluster-resource-handler/cluster.ts#L196)
5. [Request to EKS API](https://github.com/aws/aws-cdk/blob/0cabb9f2d2f50c03337cd6f35bf47fc54ada3a21/packages/%40aws-cdk/aws-eks/lib/cluster-resource-handler/cluster.ts#L198) (results in timeout because proxy is not used)

This change allows the user to pass proxy URLs as environment variables to **both** Lambdas using `clusterHandlerEnvironment`.

</details>

<details>
<summary><strong>Renames the prop <code>onEventLayer</code> -> <code>proxyAgentLayer</code>, and provides the layer to both Cluster Handler Lambdas</strong></summary>
<br />

The proxy-agent layer is now used in both the `onEventHandler` and `isCompleteHandler` Lambdas in order to support proxy configurations. Because of this change, I've deprecated the original `onEventLayer` and created a new prop `proxyAgentLayer`, since we will now be passing this prop into more than just the `onEventHandler` Lambda.

The `onEventLayer` prop was introduced [a few weeks ago (sept 24)](#16657), so it should not impact many users (if any). The prop would only be used if the user wishes to bundle the layer themselves with a custom proxy agent. This prop follows the [same user customization we allow with the kubectl handler](https://docs.aws.amazon.com/cdk/api/latest/docs/@aws-cdk_aws-eks.Cluster.html#kubectllayer).

Another suitable name for this prop could have been `clusterHandlerLayer`, but I chose `proxyAgentLayer` because it represents **what** the layer is used for, instead of describing **where** it's used. This also follows the convention of the pre-existing [`kubectlLayer` prop](https://docs.aws.amazon.com/cdk/api/latest/docs/@aws-cdk_aws-eks.Cluster.html#kubectllayer).

</details>

<details>
<summary><strong>Adds the EKS cluster prop <code>clusterHandlerSecurityGroup</code></strong></summary>
<br />

If a proxy address is provided to the Cluster Handler Lambdas, but the proxy instance is not open to the world, then the dynamic IPs of the Cluster Handler Lambdas will be denied access. To solve this, I've implemented a new Cluster prop, `clusterHandlerSecurityGroup`. This prop allows the user to pass a Security Group to both Lambda functions and the Custom Resource provider. This is very similar to how we [already allow users to pass Security Groups to the Kubectl Handler](https://github.com/aws/aws-cdk/blob/7f194000697b85deb410ae0d7f7d4ac3c2654bcc/packages/%40aws-cdk/aws-eks/lib/kubectl-provider.ts#L83)

</details>

----

*By submitting this pull request, I confirm that my contribution is made under the terms of the Apache-2.0 license*
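To illustrate the proxy configuration described in the PR above, here is a minimal sketch of building the map that would be passed as `clusterHandlerEnvironment`. The helper `buildProxyEnvironment` is hypothetical and not part of the CDK. The `http_proxy` key comes from the PR text; setting `https_proxy` to the same URL is an assumption about what the proxy agent in the handlers reads.

```typescript
// Hypothetical helper, not part of aws-cdk-lib: builds the environment
// variables one might pass as `clusterHandlerEnvironment`.
type HandlerEnvironment = Record<string, string>;

function buildProxyEnvironment(proxyUrl: string): HandlerEnvironment {
  // Reject values without an explicit scheme so that a misconfigured proxy
  // fails at synth time rather than as a Lambda timeout during deployment.
  if (!/^https?:\/\//.test(proxyUrl)) {
    throw new Error(`Proxy URL must start with http:// or https://, got: ${proxyUrl}`);
  }
  return {
    http_proxy: proxyUrl,
    // Assumption: the handlers' proxy agent also honors https_proxy.
    https_proxy: proxyUrl,
  };
}
```

The returned object could then be supplied as `clusterHandlerEnvironment: buildProxyEnvironment('http://my-proxy.com:3128')` when constructing the cluster. With the change in this PR, both `onEventHandler` and `isCompleteHandler` would receive it.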
Hi, |
At this moment(cdk 2.63.0), it's possible to deploy a private eks endpoint with nodegroup in the const cluster = new eks.Cluster(this, 'Cluster', {
  vpc,
  version: eks.KubernetesVersion.V1_24,
  // private endpoint only
  endpointAccess: eks.EndpointAccess.PRIVATE,
  vpcSubnets: [
    { subnetType: ec2.SubnetType.PRIVATE_WITH_EGRESS },
  ],
  // lambda handler with vpc access
  placeClusterHandlerInVpc: true,
  kubectlLayer: new KubectlLayer(this, 'KubectlLayer'),
  defaultCapacity: 0,
});
// nodegroup in private subnets with egress, which can reach the internet without any VPC endpoints
cluster.addNodegroupCapacity('NG', {
  subnets: vpc.selectSubnets({ subnetType: ec2.SubnetType.PRIVATE_WITH_EGRESS }),
}) |
hi @pahud |
If I am using isolated subnets in the VPC and, instead of a NAT, we direct all outgoing traffic through a proxy, is there a way to pass this proxy setup to the EKS Cluster construct (or somehow to the Lambdas deployed as part of it)? Here is the error I am currently getting when trying to create the EKS cluster. It happens in the Lambdas deployed as part of the EKS Cluster L2 construct when they try to update the k8s cluster auth manifest (logical ID AwsAuthmanifest): (I am also wondering why the global STS endpoint https://sts.amazonaws.com/ is used; since I already have a regional STS endpoint in the VPC, I was expecting the Lambdas to try to reach that one.) |
Yes, technically the EKS cluster can be associated with isolated subnets, but the primary consideration is this: if your Lambda function is associated with isolated subnets, it can access the control plane but won't be able to access the EKS service API until some private endpoints are enabled or http_proxy is configured. It's still unclear to us how to configure this correctly in CDK, so I would suggest associating PRIVATE_WITH_EGRESS subnets for vpcSubnets: [
{ subnetType: ec2.SubnetType.PRIVATE_WITH_EGRESS },
], |
I ran into the same issue - this should help you:
requires: regional vpc endpoint for sts |
thank you @ClaudiusMZ for your input but that didn't help.
and I was getting So when I saw your comment I thought that you might have a good point and that the handlers would go directly to the regional STS endpoint as expected, but Therefore, I still suspect the handlers neither pick up the proxy nor reach the STS endpoints defined in the VPC directly (without a proxy), as you mentioned. Thinking more about what I wrote above, I can now guess WHY internet access is needed (my original question), and we need to go back to how to set up the proxy: because the EKS control plane is AWS-managed, the only way for it to reach STS is through the internet. I will add, for others who may bump into this, that I am talking specifically about CDK TypeScript. My gut tells me that with CDK Python the proxy may work (who knows?!); the proxy setup in Node.js may not be as nice as in Python, or is it, and am I still missing something very easy here? |
This issue has received a significant amount of attention so we are automatically upgrading its priority. A member of the community will see the re-prioritization and provide an update on the issue. |
We are also trying to create the EKS cluster through CDK in private subnets (the VPC has internet access via a proxy), using our enterprise proxy. But we get the error below if we set `placeClusterHandlerInVpc` to true and set the proxy in the cluster handler environment. The cluster instantiation code looks like this: const eksCluster = new eks.FargateCluster(this, 'copito-eks-cluster', { If we don't use `placeClusterHandlerInVpc`, i.e. set it to false, then we get the error below: Exception: b'Unable to connect to the server: proxyconnect tcp: x509: certificate signed by unknown authority\n' Our enterprise proxy instance shows successful connections to both the EKS and STS endpoints, so I believe there is no issue with the proxy. |
I got this partially working, by creating the necessary VPC endpoints required to get the lambdas to communicate properly without modifying any security groups. I used what @ClaudiusMZ provided above: place_cluster_handler_in_vpc=True, # Place the cluster handler in the VPC
cluster_handler_environment={"AWS_STS_REGIONAL_ENDPOINTS": "regional"},
kubectl_environment={"AWS_STS_REGIONAL_ENDPOINTS": "regional"}, Our VPC has two subnets that are fully private, no NAT Gateways, No IGWs, EKS API is PRIVATE, only routes are to Seems... excessive, but I went through the woes of figuring out what the lambdas needed in order to complete setting up the cluster, and provisioning kube manifests and some helm charts. In this case, I have to host the charts and images in a private ECR in the same VPC. So far it looks possible to accomplish. The CDK documentation leaves a lot to be desired in terms of what is being done behind the scenes in the L2 constructs I am using, but by digging into the typescript cdk codebase I was able to make sense of some of it. |
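As a rough illustration of why the `AWS_STS_REGIONAL_ENDPOINTS` setting matters in this thread: AWS SDKs that honor this variable send STS calls to the global endpoint in `legacy` mode (for many commercial regions) and to the per-region endpoint in `regional` mode. The sketch below models only that choice. `resolveStsEndpoint` is a hypothetical function, it ignores the region-specific exceptions of `legacy` mode and non-standard partitions, and whether the handlers' SDK version defaults to `legacy` is an assumption.

```typescript
// Simplified model of STS endpoint selection; not the SDK's actual resolver.
type StsEndpointMode = 'legacy' | 'regional';

function resolveStsEndpoint(region: string, mode: StsEndpointMode): string {
  if (mode === 'regional') {
    // Reachable through a regional STS interface VPC endpoint.
    return `https://sts.${region}.amazonaws.com`;
  }
  // Global endpoint: needs internet access (or a proxy) from the Lambda.
  return 'https://sts.amazonaws.com';
}

// Environment one might pass to both handler sets, mirroring the snippet above.
function regionalStsEnvironment(): Record<string, string> {
  return { AWS_STS_REGIONAL_ENDPOINTS: 'regional' };
}
```

Under this model, a Lambda in an isolated subnet with only a regional STS VPC endpoint would fail in `legacy` mode, which is consistent with the `https://sts.amazonaws.com/` error reported earlier in the thread.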
Still open? |
EKS support for isolated VPCs would require either proper proxy configuration or enabling VPC endpoints. Until we get these sorted out and included in the aws-eks README, I am leaving this issue open. Feel free to help us improve the documentation by submitting PRs. |
Provisioning clusters inside an isolated VPC (i.e., no internet access) is not currently supported.
This is because the lambda functions that operate the cluster need to invoke the EKS service, which does not offer a VPC endpoint.
Use Case
We've seen users mentioning their environment uses an isolated VPC.
Other
Adding some information here to possibly facilitate alternative approaches.
If you have a proxy setup, you can inject proxy information to the handlers via custom environment variables.
Also, following is a list of AWS services that our Lambda handlers interact with in order to operate the cluster. All of these services offer a VPC endpoint except for EKS.
Related: #10036
Once EKS does offer a VPC endpoint, it would be nice if we automatically provisioned the necessary endpoints when we detect that the VPC does not have internet access (internet gateway, NAT).
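The idea above could be sketched roughly as follows. This is not how the CDK implements anything; `VpcEgress`, `hasInternetAccess`, and `planInterfaceEndpoints` are hypothetical names, and the list of services is supplied by the caller rather than derived from the handlers. The `hasOtherEgressRoute` flag is an assumption added to cover setups like the transit gateway egress mentioned in the comments. Only the `com.amazonaws.<region>.<service>` naming of interface endpoint services is standard.

```typescript
// Hypothetical description of how a VPC reaches the internet.
interface VpcEgress {
  hasInternetGateway: boolean;
  natGatewayCount: number;
  // e.g. a transit gateway route to an egress network (assumption).
  hasOtherEgressRoute: boolean;
}

function hasInternetAccess(vpc: VpcEgress): boolean {
  return vpc.hasInternetGateway || vpc.natGatewayCount > 0 || vpc.hasOtherEgressRoute;
}

// Returns the interface endpoint service names to provision, or an empty
// list when the VPC can already reach the public service endpoints.
function planInterfaceEndpoints(vpc: VpcEgress, region: string, services: string[]): string[] {
  if (hasInternetAccess(vpc)) {
    return [];
  }
  return services.map((service) => `com.amazonaws.${region}.${service}`);
}
```

For a fully isolated VPC in `eu-west-1`, `planInterfaceEndpoints(vpc, 'eu-west-1', ['sts', 'ecr.api'])` would yield `com.amazonaws.eu-west-1.sts` and `com.amazonaws.eu-west-1.ecr.api`; the two services here are illustrative only.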
This is a 🚀 Feature Request