-
Notifications
You must be signed in to change notification settings - Fork 1.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
aws: Add an experimental IPv6 dev/test environment #2555
Conversation
/hold this needs an enhancement, and also no TF_VARs based control should be used. |
Fixed an issue pointed out by CI, dropped the usage of a TF_VAR_ env var, and added an enhancement here: openshift/enhancements#87 |
1b7abf2
to
afce363
Compare
@@ -6,6 +6,10 @@ output "vpc_cidrs" { | |||
value = [data.aws_vpc.cluster_vpc.cidr_block] | |||
} | |||
|
|||
output "vpc_ipv6_cidrs" { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why not use vpc_cidrs
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because the VPC has both IPv4 and IPv6 CIDRs
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
my question was does the output need to make the distinct? can they be a single list?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It wouldn't be easy to just use a single list. They are used as input to different options on a resource: cidr_blocks
vs ipv6_cidr_blocks
, so we have to explicitly be using only the IPv4 addresses or only the IPv6 addresses.
@@ -33,6 +48,9 @@ resource "aws_subnet" "private_subnet" { | |||
|
|||
availability_zone = var.availability_zones[count.index] | |||
|
|||
ipv6_cidr_block = var.use_ipv6 == true ? cidrsubnet(data.aws_vpc.cluster_vpc.ipv6_cidr_block, 8, count.index) : "" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
https://github.com/openshift/installer/pull/2555/files#diff-2f4127d65f2b75935b1181f433f7bb75R47 looks different from that one?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can't tell what that link was to. Which other change are you referring to?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the IPV4 cidr is calculated as also similar question to #2555 (comment)
my preference would be to keep it similar..
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've added comments to hopefully help explain what this is doing. Let me know if you have a more specific suggestion, since I'm not sure what you mean by wanting it similar to IPV4.
# We can't target the NAT gw for our "private" IPv6 subnet. Instead, we target the internet gateway, | ||
# since we want our private IPv6 addresses to be able to talk out to the internet, too. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TODO(@abhinavdahiya): need to get clarity on this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What kind of clarity are you looking for? The AWS NAT gateway doesn't support IPv6. They hand out globally routable IPv6 addresses and expect those to be used.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've looked into doing what we discussed on the enhancement (switching to an egress only internet gateway). A major downside is that tags are not supported. A second potential downside is that we may end up wanting some direct connectivity to Nodes to exercise more of IPv6 with ingress, since AWS load balancers can only do IPv4. At this point I'd suggest we leave this as-is ...
https://docs.aws.amazon.com/cli/latest/reference/ec2/create-egress-only-internet-gateway.html
https://docs.aws.amazon.com/cli/latest/reference/ec2/create-egress-only-internet-gateway.html
This is a hack to not block IPv6 when doing CIDR validation. We may want to do something smarter later, like validate that IPv4 or IPv6 is used consistently, or validate based on per-platform capabilities. For now, removing this check unblocks IPv6 work.
If you run "create cluster", the VPC will have an IPv6 subnet assigned by AWS (a /56). Then, each subnet created in the VPC will get a slice of this (a /64). I can see that the VMs created all had IPv6 addresses associated with them, at least from the AWS perspective. The networking section of my current install-config looks like this: networking: clusterNetwork: - cidr: fd01::/48 hostPrefix: 64 machineCIDR: 10.0.0.0/16 networkType: OVNKubernetes serviceNetwork: - fd02::/112 I kept machineCIDR as IPv4, since on AWS we can't get IPv6 only. We get IPv4 + optionally IPv6. We use machineCIDR to specify the IPv4 address range we carve up in the VPC. For IPv6, AWS gives us the /56 that we carve up, so we don't need a subnet specified in the config. For clusterNetwork and serviceNetwork, both of these are used exclusively inside the cluster, so I changed them to be specified as private IPv6 subnets.
This allows all of our VMs to speak IPv6 out to the internet.
This adds a route on the private subnet to allow internet access through the same internet gateway used by the public subnet. In the IPv4 case, we use a NAT gateway from the private subnet. For IPv6, we don't use NAT since the addresses are publicly routable.
All security group rules that referenced an IPv6 CIDR have been duplicated to reference the corresponding IPv6 CIDR.
All of the IPv4 records remain for now.
Introduce a "use_ipv6" terraform variable that is hard coded to "true". Use it to disable the IPv4 DNS records (A records), leaving only the IPv6 (AAAA) records in place.
Back in commit 16dfbb3, the installer switched from classic load balancers to network load balancers. Unfortunately, NLBs don't support IPv6. This commit introduces conditional classic load balancers when use_ipv6 is turned on.
Apply the use_ipv6 boolean in some places that were added before the boolean existed.
RHCOS doesn't get an IPv6 address during boot before running ignition, so without this record it will fail to contact the machine config server. Turn this record back on as a temporary workaround until we can do IPv6 during ignition.
@@ -2,3 +2,6 @@ output "ip_addresses" { | |||
value = aws_network_interface.master.*.private_ips | |||
} | |||
|
|||
output "ipv6_addresses" { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
also similar question to #2555 (comment)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This one seems like a toss-up. The master module does not take a use_ipv6
input, which would be needed to figure out what to put in the ip_addresses
output variable. I could drop ipv6_addresses
, but would add an input variable instead, so it doesn't seem much better. I'll change it if you prefer it that way, though.
resource "aws_subnet" "public_subnet" { | ||
count = var.public_subnets == null ? length(var.availability_zones) : 0 | ||
|
||
vpc_id = data.aws_vpc.cluster_vpc.id | ||
cidr_block = cidrsubnet(local.new_public_cidr_range, 3, count.index) | ||
availability_zone = var.availability_zones[count.index] | ||
|
||
ipv6_cidr_block = var.use_ipv6 == true ? cidrsubnet(data.aws_vpc.cluster_vpc.ipv6_cidr_block, 8, count.index + length(var.availability_zones)) : "" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
https://github.com/openshift/installer/blob/master/data/data/aws/vpc/vpc-public.tf#L50 is how the IPv4 address is calculated.
my preference would be to keep this calculated similar looking if possible.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've added some comments to hopefully help explain why this is the way it is. Let me know if you have a more specific suggestion.
Use an OPENSHIFT_INSTALL_ prefix instead of just OPENSHIFT_.
Add more details that have come up in review so far.
btw ... I'll squash all of this once we're happy with it. I'm still working through addressing comments and will post a clear comment indicating when it's ready for another look. Thanks! |
These comments were about something to check related to a terraform upgrade. The code is right as it is. As recommended by the comment itself, these TODO comments should just be removed now that it has been checked.
Note a todo item that I will take care of soon: changing to an egress-only internet gateway. https://aws.amazon.com/premiumsupport/knowledge-center/configure-private-ipv6-subnet/
This variable was added in an earlier commit, but is no longer used.
Move the conditional part of the expression to only apply to the part of the formatlist() that's different between IPv4 and IPv6.
This patch mostly just moves the count to the beginning of a resource as a style change, as requested during review. While doing this, I noticed two IPv6 security group rules that were not conditional on IPv6 support being turned on.
Make use of the existing etcd_ip_addresses instead.
As requested in review: - when aws IPv6 is on, force OVNKubernetes network type - Force IPv6 CIDRs where needed when AWS IPv6 is on - Re-add IPv4 CIDR enforcement everywhere else - make sure the platform is AWS if the aws IPv6 env var is set
I've added these validations, and I believe either responded to or addressed all other comments. @abhinavdahiya This is ready for another look. Once you're happy with it, I can squash all of these commits. |
/retest |
/retest |
I changed this code but forgot to fix the test case.
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
/test e2e-aws |
@russellb: The following tests failed, say
Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here. |
@russellb: PR needs rebase. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
/close |
@russellb: Closed this PR. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
This PR includes support for an IPv6 dev/test environment on AWS. It is off by default and enabled only by setting the
OPENSHIFT_INSTALL_AWS_USE_IPV6
environment variable. Seedocs/dev/aws_ipv6.md
for some more information.The intention is to not change any behavior by default and just have this as something developers can turn on as we work across OpenShift to resolve IPv6 issues.