-
Notifications
You must be signed in to change notification settings - Fork 4.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make Weave MTU configurable and configure jumbo frame support for new clusters on AWS #2613
Make Weave MTU configurable and configure jumbo frame support for new clusters on AWS #2613
Conversation
Thanks for your pull request. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA). 📝 Please follow instructions at https://github.com/kubernetes/kubernetes/wiki/CLA-FAQ to sign the CLA. It may take a couple minutes for the CLA signature to be fully registered; after that, please reply here with a new comment and we'll verify. Thanks.
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here. |
Hi @jordanjennings. Thanks for your PR. I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here. |
CLA signed |
43d19c3
to
8559e0e
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So the reason that we have not dropped in the change to use NonMasq yet is because there are questions about upgrading. Can you test that upgrading from the old yaml to the new yaml using the NonMasq CIDR works ok? Or just pull it from the PR.
@k8s-bot ok to test |
@chrislovecnm I tested editing an existing cluster and after running update both environment variables were added to k8s-1.6.yaml and pre-k8s-1.6.yaml:
Does that cover it? |
Where old deployments happy? And new deployments as well? |
@chrislovecnm Is there any existing mechanism to replace the CNI DaemonSets with rolling updates? It would be similar to the process for switching CNI providers. Is that a supported change via kops currently? |
A good use case is create a cluster with kops 1.5.x & k8s 1.5.5 and do a rolling-update with your pr version of kops to 1.5.7 We have some issues with 1.5 -> 1.6 k8s upgrades right now, that is why I would recommend 1.5 k8s. The kops update and kops rolling update will apply the new weave deamonsets. Have an example app running before and install another example app afterwards. Vadiate both function. And both can ping www.google.com. This is a bunch of testing work, and if you can do it we very much appreciate it. |
To add some context, this PR is great. If you do not have time to test the CIDR stuff, please remove it and leave the MTU stuff in. Happy to merge the MTU stuff, or if you want to do the above testing we can merge the CIDR and MTU settings. |
@chrislovecnm I tested as you mentioned by using kops 1.5.3 with k8s 1.5.5 to deploy a new cluster, which worked fine, deployed sample apps, no issues. I checked in S3 and saw the v1.9.3.yaml file for weave as expected. Then I used my branch version of kops to edit the cluster and update k8s to 1.5.7 and set the weave MTU to 8912. In the kops update cluster it updated the bootstrap-channel.yaml file to reference the new weave files pre-k8s-1.6.yaml and k8s-1.6.yaml, but didn't actually put those files in the networking.weave folder in S3. I did a rolling update which worked to get k8s to 1.5.7 but did not deploy the new weave files (presumably because they weren't in S3?). Seems like this is actually an existing issue with kops 1.6.0 and upgrading weave. Suggestions from here? |
You do a rolling update? Ping me on slack on Monday, or we can setup a call |
@k8s-bot ok to test |
8559e0e
to
c637bb7
Compare
Thanks for your pull request. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA). 📝 Please follow instructions at https://github.com/kubernetes/kubernetes/wiki/CLA-FAQ to sign the CLA. It may take a couple minutes for the CLA signature to be fully registered; after that, please reply here with a new comment and we'll verify. Thanks.
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here. |
Found an issue when upgrading a cluster without specifying MTU, it puts
With my change, a kops update will set WEAVE_MTU to :
I am setting the default value to 8912 in create_cluster.go, but if the cluster has already been created then that code doesn't apply. Two options:
Thoughts? |
Thanks for your pull request. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA). 📝 Please follow instructions at https://github.com/kubernetes/kubernetes/wiki/CLA-FAQ to sign the CLA. It may take a couple minutes for the CLA signature to be fully registered; after that, please reply here with a new comment and we'll verify. Thanks.
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here. |
Not sure what happened with the CLA on last rebase. Did you can your user info? Also this fixes #2085 as well |
Let me change my mind. Let's put it in the manifest. Also, do all instance types in aws support that MTU? |
Ok, upon further testing I'm dropping the IPALLOC_RANGE part of this pull request. Somehow it worked once, but on every other test it has broken the networking and I'm not sure why. When testing a cluster upgrade it makes it so the pod to pod networking does not work, it just says "No route to host" or "Destination Host Unreachable" for pings or curl requests. I tested many new cluster creation and cluster upgrade scenarios and in all cases except one the networking was totally broken. I'm not even sure at this point about the one case where I thought it was working, that may have been a bad scenario. My most recent test was simply creating a new cluster with the default configuration with this branch (meaning MTU and IP CIDR was set) and even in that case pod networking was totally broken. I'm not sure why that is the case, but I am dropping the IP address change and will be testing now with the MTU change only. |
@chrislovecnm Per AWS docs:
Also:
Should we reduce potential issues by not trying to make this the default? I just want it to be configurable somehow, I don't really care personally about it being the default unless others agree that it's a sensible thing to do for everyone. |
@ottoyiu both you and @jordanjennings are going to need to check what cloud provider a kops cluster is living on, and apply the manifests accordingly. It may be good for both of you to agree upon which code to use for checking the cloud provider, so merging is easier. |
f3d79e2
to
7e47280
Compare
OK, final updates pushed. I decided to make the MTU set to jumbo frames only for newly created clusters with this pull request. I am checking for AWS, and in order to do that I had to move down the networking section in create_cluster.go to fall under where it detects the cloud provider. I had to increment the weave version in boostrapchannelbuilder.go in order to get the S3 file updated and DaemonSet updated. Here is what I tested:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A couple of tweaks.
case "weave": | ||
cluster.Spec.Networking.Weave = &api.WeaveNetworkingSpec{} | ||
|
||
if cluster.Spec.CloudProvider == "aws" { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is still allowing other cloud providers to use this feature, but we can do a follow-up pr.
docs/networking.md
Outdated
You are able to specify your networking type via command line switch or in your yaml file. | ||
The `--networking` option accepts the three different values defined above: `kubenet`, `cni`, | ||
`classic`, and `external`. If `--networking` is left undefined `kubenet` is installed. | ||
|
||
### Weave Example for CNI | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So this is a example how to use --networking cni - maybe we need to rewrite this ;) Please blackout your changes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yea I found this networking page pretty confusing as newcomer, was trying to help clarify but maybe it's more confusing now :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's clean it up in another PR ;)
version := "1.9.4" | ||
|
||
// 1.9.5-kops.1 = 1.9.4 with WEAVE_MTU configured | ||
version := "1.9.5-kops.1" | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does 1.9.4-kops.1 work?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Per http://semver.org/#spec-item-9 things with labels after the version are considered "pre-release" and have lower precedence than the final version number. So 1.9.4-kops.1 would be treated like a pre-release version of 1.9.4, and 1.9.4 would take precedence.
7e47280
to
6213c51
Compare
@k8s-bot pull-kops-e2e-kubernetes-aws test this |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks
/lgtm Pending on a question for the author |
Pull request for #2600 to make MTU configurable and to set a good default value.
From the discussion in #1171 it looks like nonMasqueradeCIDR is meant for the overlay CIDR, so I added that to the configuration as well.
New config looks like:
I tested the following scenarios with this change:
mtu: 8912
, then running update cluster. The ManagedFile for weave was updated as expected, however I am not sure what the process would be to actually get the settings rolled out.mtu: 8912
as expected)kops create -f mycluster.yml
based on https://github.com/kubernetes/kops/blob/master/tests/integration/privateweave/in-v1alpha2.yaml with minor editsTodo:
Let me know if there's anything else I need to do.
This change is![Reviewable](https://camo.githubusercontent.com/bdad2d5a4be7a00dc3b2426ea57eabd73ef84d8ed5ee05653b2f1501b6ea93ab/68747470733a2f2f72657669657761626c652e6b756265726e657465732e696f2f7265766965775f627574746f6e2e737667)