Support for running controller outside of cluster #102
Conversation
Welcome @MartinWeindel!
Hi @MartinWeindel. Thanks for your PR. I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Once the patch is verified, the new status will be reflected by the ok-to-test label. I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
/assign @codenrhoden
@MartinWeindel what is the use case of running the controller outside of the cluster? Isn't this a security hole?
The use case is providing Kubernetes clusters on vSphere by Gardener.
klog.Errorf("BuildConfigFromFlags failed %q", err) | ||
return nil, err | ||
} | ||
} else { |
What about defaulting to an in-cluster config if BuildConfigFromFlags fails?
This would be very confusing: you specify a kubeconfig, but if something fails the controller suddenly tries to control the cluster it is running on.
It is better to fail early and report the error. If you want to use the in-cluster config, just do not provide a kubeconfig.
BTW, your colleagues on the vSphere cloud provider are using exactly the same logic, see https://github.com/kubernetes/cloud-provider-vsphere/blob/master/pkg/common/kubernetes/kubernetes.go#L38-L44
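For reference, here is a minimal sketch of the config-selection logic being discussed, assuming client-go's clientcmd and rest packages; the function name and error messages are illustrative, not the exact code in this PR:

```go
package kubeclient

import (
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
	"k8s.io/client-go/tools/clientcmd"
	"k8s.io/klog"
)

// NewClient builds a clientset from an explicit kubeconfig path when one is
// given, and falls back to the in-cluster config only when the path is empty.
// Both branches fail early instead of silently switching to the other mode.
func NewClient(kubeconfigPath string) (kubernetes.Interface, error) {
	var cfg *rest.Config
	var err error
	if kubeconfigPath != "" {
		// Controller runs outside the cluster: use the supplied kubeconfig.
		cfg, err = clientcmd.BuildConfigFromFlags("", kubeconfigPath)
		if err != nil {
			klog.Errorf("BuildConfigFromFlags failed %q", err)
			return nil, err
		}
	} else {
		// Controller runs inside the cluster: use service-account credentials.
		cfg, err = rest.InClusterConfig()
		if err != nil {
			klog.Errorf("InClusterConfig failed %q", err)
			return nil, err
		}
	}
	return kubernetes.NewForConfig(cfg)
}
```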
CPI has different use cases from CSI. However, on second thought, I agree with the benefits of failing early.
nodes.nodeRegister(obj)
}

func (nodes *Nodes) nodeUpdate(oldObj interface{}, newObj interface{}) {
Have you done any testing to validate whether nodeUpdate works as expected? I think you'd need to unregister the old node object and register the new one, or vice versa. Otherwise we'd have a long list of registered nodes representing the same node object.
A node update typically only changes its status fields; in particular, it is forbidden to change the spec's providerID or the node's name. Therefore the node registration happens with the same values, and the node manager is hopefully idempotent. I'm not an expert here, but removing the node registration on an update of its status looks like a very dangerous operation to me.
The problem here is timing behaviour. If the CSI controller is already running when new nodes join the cluster, the node object is created first without the attributes required by the CSI controller; they are added later by the cloud controller.
But that is then no longer a create event but an update event, so it is not possible to ignore update events in the CSI controller.
@mandelsoft exactly! Not just in the case of new nodes: also if CSI is initialized before CPI adds the ProviderID to all nodes, we have a situation where not all nodes are registered with the ProviderID set. This is the use case that I see for nodeUpdate.
@MartinWeindel it seems that nodeRegister will modify the existing node entry and not add a new one, but we still need some testing to verify.
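For illustration, here is a minimal sketch of the update-handler behaviour being discussed, assuming (as stated above) that nodeRegister is idempotent; the Nodes type, the stub body, and the ProviderID guard are simplified stand-ins, not the driver's actual implementation:

```go
package node

import (
	v1 "k8s.io/api/core/v1"
	"k8s.io/klog"
)

// Nodes is a simplified stand-in for the driver's node manager.
type Nodes struct{}

// nodeRegister is assumed to be idempotent: registering an already known node
// updates the existing entry instead of adding a duplicate.
func (nodes *Nodes) nodeRegister(obj interface{}) { /* driver-specific */ }

// nodeUpdate re-registers the node on every update so that attributes added
// after creation (for example the ProviderID set later by CPI) are picked up.
func (nodes *Nodes) nodeUpdate(oldObj interface{}, newObj interface{}) {
	node, ok := newObj.(*v1.Node)
	if !ok {
		klog.Warningf("nodeUpdate: unexpected object type %T", newObj)
		return
	}
	if node.Spec.ProviderID == "" {
		// The cloud controller has not set the ProviderID yet; a later
		// update event will trigger registration once it is available.
		return
	}
	nodes.nodeRegister(newObj)
}
```

Whether the ProviderID guard is needed depends on what nodeRegister tolerates; it is shown here only to make the intent explicit.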
friendly ping
@divyenpatel can you review this PR?
/ok-to-test
…nt variable; reregister node on update
/approve
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: divyenpatel, MartinWeindel
The full list of commands accepted by this bot can be found here. The pull request process is described here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing /approve in a comment.
/lgtm
…ency-openshift-4.16-vmware-vsphere-syncer OCPBUGS-24961: Updating vmware-vsphere-syncer-container image to be consistent with ART
What this PR does / why we need it:
If the control plane runs in another cluster, it is necessary to specify the kubeconfig path for the controller and the syncer.
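As a rough illustration of how such a kubeconfig path might be supplied, the following sketch reads it from a command-line flag with an environment-variable fallback; the flag name and the KUBECONFIG fallback are illustrative conventions, not necessarily the ones used by the driver:

```go
package main

import (
	"flag"
	"fmt"
	"os"
)

// kubeconfigPath resolves the kubeconfig path for an out-of-cluster run.
// An empty result means "use the in-cluster config".
func kubeconfigPath() string {
	path := flag.String("kubeconfig", "", "path to a kubeconfig; empty means in-cluster")
	flag.Parse()
	if *path != "" {
		return *path
	}
	// Fall back to the conventional KUBECONFIG environment variable.
	return os.Getenv("KUBECONFIG")
}

func main() {
	fmt.Println("kubeconfig:", kubeconfigPath())
}
```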
Release note: