Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

New autodetection method KubernetesInternalIP added #1242

Merged
merged 1 commit into from
Dec 10, 2021

Conversation

hanamantagoudvk
Copy link
Contributor

@hanamantagoudvk hanamantagoudvk commented Oct 11, 2021

Description

New autodetection method KubernetesInternalIP / KubernetesInternalIP6 being introduced so that calico can always pick the node IP deterministically.

Fixes projectcalico/calico#4870

Fixes projectcalico/calico#5090

@CLAassistant
Copy link

CLAassistant commented Oct 11, 2021

CLA assistant check
All committers have signed the CLA.

@caseydavenport
Copy link
Member

Tagging for v3.22

@hanamantagoudvk
Copy link
Contributor Author

@caseydavenport : Could you please prioritize it for v3.21 ? Our customer deployment needs this solution.

Copy link
Member

@caseydavenport caseydavenport left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First pass - thanks for the PR! This is mostly looking good but I think we need to make a few structural changes to reduce the scale impact that a periodic node query on every node in the cluster can have. We'll also need unit tests for this in order to merge.

Let me know if you have any questions!

var config *rest.Config
var k8sNode *v1.Node
var err error
if config, err = rest.InClusterConfig(); err != nil {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This code needs to work on non-kubernetes clusters as well - I'd suggest modeling after the logic above: https://github.com/projectcalico/node/pull/1242/files#diff-e481557bd1e2fed4cea80458ebb02e18797fdf49c42944f4a8d6c23cbfbec3f9R130

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

modified as per suggestion

@@ -772,6 +809,27 @@ func autoDetectCIDRBySkipInterface(ifaceRegexes []string, version int) *cnet.IPN
return cidr
}

// autoDetectUsingK8sInternalIP reads K8s Node InternalIP.
func autoDetectUsingK8sInternalIP(version int, k8sNode *v1.Node) *cnet.IPNet {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should probably put this in the autodetection package along with the other methods.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would require moving the getLocalCIDR function with it, but I think that's OK

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

modified as per suggestion

@caseydavenport
Copy link
Member

@hanamantagoudvk just checking in on this one - did you see my latest comments?

@hanamantagoudvk
Copy link
Contributor Author

@hanamantagoudvk just checking in on this one - did you see my latest comments?

@caseydavenport : Sorry for the delay. Uploaded the new patchset. I am still not able to run UT cases in my setup.
I get some error related to setup. I am working on UT part.

@hanamantagoudvk hanamantagoudvk force-pushed the vip-issue branch 4 times, most recently from 8928fb7 to be54246 Compare November 15, 2021 08:44
Copy link
Contributor

@tomastigera tomastigera left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have nothing major against this change, tbh I like that it is coming, we did consider it for eBPF and then we went different way. So great! 👍 I will leave it to @caseydavenport to approve.

@@ -78,6 +100,32 @@ func makeNode(ipv4 string, ipv6 string) *libapi.Node {
return n
}

// makeK8sNode creates an v1.Node with some Addresses populated.
func makeK8sNode(ipv4 string, ipv6 string) *v1.Node {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this function is used only with ipv4 ...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same comment as below

Comment on lines 110 to 115
ip6, ip6net, _ := net.ParseCIDR(ipv6)
// Guard against nil here in case we pass in an empty string for IPv6.
if ip6 != nil {
ip6net.IP = ip6.IP
}
ip6Addr = ip6net.String()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why don't we do this for ipv4? TBH this is test code and we can assume that the test passes in valid addresses, nothing that changes dynamically. It would be more interesting to test different permutations, mix with other address types etc.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes i want to add various combinations with IPv4 and IPv6 addresses, but not able to mock the GetInterfaces method inside autoDetectUsingK8sInternalIP. Due to that , configureIPsAndSubnets always returns false

@hanamantagoudvk hanamantagoudvk force-pushed the vip-issue branch 2 times, most recently from f7ff901 to 10b46d9 Compare November 19, 2021 11:35
Copy link
Member

@caseydavenport caseydavenport left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mostly don't understand why we're now passing DEFAULT_INTERFACES_TO_EXCLUDE as an argument through several layers of call stack, especially since it's not used in the new autodetection method. We should revert that.

Otherwise this is looking good.

}

configureAndCheckIPAddressSubnets(ctx, cli, node)
configureAndCheckIPAddressSubnets(ctx, cli, node, k8sNode, DEFAULT_INTERFACES_TO_EXCLUDE)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should revert this change to pass DEFAULT_INTERFACES_TO_EXCLUDE all the way through the call stack of the autodetection functions. It seems to be an extra argument just to pass a constant value.

@caseydavenport
Copy link
Member

/sem-approve

@caseydavenport
Copy link
Member

@hanamantagoudvk you've got an import cycle looks like:

package command-line-arguments01:39
	imports github.com/projectcalico/node/pkg/lifecycle/startup01:39
	imports github.com/projectcalico/node/pkg/lifecycle/startup/autodetection01:39
	imports github.com/projectcalico/node/pkg/lifecycle/startup: import cycle not allowed01:39
Makefile:281: recipe for target 'dist/bin//calico-node-amd64' failed

@hanamantagoudvk
Copy link
Contributor Author

package command-line-arguments01:39
	imports github.com/projectcalico/node/pkg/lifecycle/startup01:39
	imports github.com/projectcalico/node/pkg/lifecycle/startup/autodetection01:39
	imports github.com/projectcalico/node/pkg/lifecycle/startup: import cycle not allowed01:39
Makefile:281: recipe for target 'dist/bin//calico-node-amd64' failed

@caseydavenport
The reason for above import cycle is DEFAULT_INTERFACES_TO_EXCLUDE . Due to split in code now , DEFAULT_INTERFACES_TO_EXCLUDE has remained in startup package and its getting used in autodetection package. autodetection package imports startup package and vice-versa.

@caseydavenport
Copy link
Member

The reason for above import cycle is DEFAULT_INTERFACES_TO_EXCLUDE . Due to split in code now , DEFAULT_INTERFACES_TO_EXCLUDE has remained in startup package and its getting used in autodetection package. autodetection package imports startup package and vice-versa.

Should probably just move DEFAULT_INTERFACES_TO_EXCLUDE to the autodetection package then, right?

@hanamantagoudvk
Copy link
Contributor Author

The reason for above import cycle is DEFAULT_INTERFACES_TO_EXCLUDE . Due to split in code now , DEFAULT_INTERFACES_TO_EXCLUDE has remained in startup package and its getting used in autodetection package. autodetection package imports startup package and vice-versa.

Should probably just move DEFAULT_INTERFACES_TO_EXCLUDE to the autodetection package then, right?

@caseydavenport
yes now i have moved DEFAULT_INTERFACES_TO_EXCLUDE into the autodetection pkg.

@hanamantagoudvk
Copy link
Contributor Author

hanamantagoudvk commented Nov 28, 2021

@caseydavenport : Please review it and let me know if anything is pending ? I am not able to see the below errors in General Tests since i dont have access to CI system.

Copy link
Member

@caseydavenport caseydavenport left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This LGTM - Thanks for all of the hard work on this @hanamantagoudvk

@caseydavenport
Copy link
Member

@hanamantagoudvk just a static check failure it looks like:

pkg/lifecycle/startup/startup_test.go:86:5: composites: `k8s.io/api/core/v1.NodeAddress` composite literal uses unkeyed fields (govet)04:51
				{v1.NodeInternalIP, ipv4},04:51
				^04:51
pkg/lifecycle/startup/startup_test.go:87:5: composites: `k8s.io/api/core/v1.NodeAddress` composite literal uses unkeyed fields (govet)04:52
				{v1.NodeInternalIP, ipv6},04:52
				^

@hanamantagoudvk
Copy link
Contributor Author

@caseydavenport , what else is failing now ?

@caseydavenport
Copy link
Member

I think they were test flakes, kicked off another build.

@caseydavenport
Copy link
Member

UT for Node IP assignment and conflict checking. Test variations on how IPs are detected. [It] Test with no "IP" env var set 02:25
/go/src/github.com/projectcalico/node/pkg/lifecycle/startup/startup_test.go:90602:25
  Expected02:25
      <bool>: false02:25
  to equal02:25
      <bool>: true02:25
  /go/src/github.com/projectcalico/node/pkg/lifecycle/startup/startup_test.go:902

This test appears to be failing consistently

@caseydavenport caseydavenport merged commit a8098c7 into projectcalico:master Dec 10, 2021
@caseydavenport
Copy link
Member

@hanamantagoudvk thanks for sticking with this! Glad we got it merged.

What was the last issue with that test? I couldn't tell because the commits were squashed.

Also, are you planning to do documentation for this PR? If so I can help.

Similarly, we probably want operator support for this feature as well - I can pick that up if you're strapped for time.

@hanamantagoudvk
Copy link
Contributor Author

@caseydavenport , basically ginkgo orders the testcases alphabetically. So the tests i added were run before the existing one and it was setting the IP env variable and not resetting it. So due to that the testcases were failing. In my local setup , it used to pass on and off. So now i put a code to reset those env variables.

Regarding documentation , yes i can do the documentation.

Ya that operator support , you can pick that up.

@hanamantagoudvk
Copy link
Contributor Author

@caseydavenport , regarding documentation , i believe in calico website where different autodetection methods are listed , we need to this new one as well. Other than anywhere else we need to capture this one ?

If i need to modify the calico website content , which repo i need to look at and modify it ?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
5 participants