Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement TLS bootstrap for kubelet using --experimental-bootstrap-kubeconfig (2nd take) #30922

Merged
merged 4 commits into from
Aug 21, 2016

Conversation

yifan-gu
Copy link
Contributor

@yifan-gu yifan-gu commented Aug 18, 2016

Ref kubernetes/enhancements#43 (comment)

cc @gtank @philips @mikedanese @aaronlevy @liggitt @deads2k @errordeveloper @justinsb

Continue on the older PR #30094 as there are too many comments on that one and it's not loadable now.


This change is Reviewable

@googlebot
Copy link

We found a Contributor License Agreement for you (the sender of this pull request) and all commit authors, but as best as we can tell these commits were authored by someone else. If that's the case, please add them to this pull request and have them confirm that they're okay with these commits being contributed to Google. If we're mistaken and you did author these commits, just reply here to confirm.

@mikedanese
Copy link
Member

there are too many comments on that one and it's not loadable now.

:) we should switch to gerrit

@@ -335,6 +335,17 @@ func run(s *options.KubeletServer, kcfg *KubeletConfig) (err error) {

if kcfg == nil {
var kubeClient, eventClient *clientset.Clientset

if s.KubeConfig.Value() != "" && s.BootstrapKubeconfig != "" {
Copy link
Contributor Author

@yifan-gu yifan-gu Aug 18, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@liggitt IIUC KubeConfig.Value() is always non-empty (it will have a default value)

@k8s-github-robot k8s-github-robot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. release-note-label-needed labels Aug 18, 2016
@yifan-gu yifan-gu assigned mikedanese and unassigned vishh Aug 18, 2016
@yifan-gu
Copy link
Contributor Author

So I pulled from @liggitt's branch for the refactoring work (thank you @liggitt !!), and addressed some additional comments. The only outstanding one left is:

#30094 (comment):

liggitt added a note 16 hours ago
not sure about this... do all cloud providers tolerate being initialized twice?

Is the cloudprovider.InitCloudProvider() idempotent?

@errordeveloper
Copy link
Member

cc @lukemarsden

@yifan-gu
Copy link
Contributor Author

yifan-gu commented Aug 18, 2016

@liggitt I am going to move the cloud initialization before the bootstrap process.

@liggitt
Copy link
Member

liggitt commented Aug 18, 2016

"I am okay with these commits being contributed to Google" or you can squash my commit in

Moving the real cloud provider init earlier is probably better, I just hadn't looked at everything that happened in there to be sure

@yifan-gu yifan-gu force-pushed the tls_bootstrap_refactor branch 3 times, most recently from d8993ab to ae4f9d5 Compare August 18, 2016 21:09
// then it will watch the object's status, once approved by API server, it will return the API
// server's issued certificate (pem-encoded). If there is any errors, or the watch timeouts,
// it will return an error.
func requestClientCertificate(client unversionedcertificates.CertificateSigningRequestInterface, privateKeyData []byte, nodeName string) (certData []byte, err error) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any chance you could make this one public or it's unsuitable as is?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@errordeveloper Open to make it public, but not sure who else will be using it?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We'd like to make CSRs from kubeadm.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@errordeveloper ACK, fixed.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@yifan-gu yifan-gu changed the title Implement TLS bootstrap for kubelet using --bootstrap-kubeconfig (2nd take) Implement TLS bootstrap for kubelet using --experimental-bootstrap-kubeconfig (2nd take) Aug 18, 2016
@mikedanese
Copy link
Member

@gtank can you say "I am okay with these commits being contributed to Google"

@gtank
Copy link

gtank commented Aug 18, 2016

I am okay with these commits being contributed to Google

@mikedanese
Copy link
Member

mikedanese commented Aug 18, 2016

@calebamiles

How are you feeling about things @mikedanese, are we fine with a merge once our test failures are resolved?

I'm ok with merging this if/once this feature doesn't affect currently flows (e.g. users not using TLS bootstrap) and it actually (mostly) works. CI also needs to be green.

In a similar vein, @smarterclayton could we address your desired UX in a separate feature. If we're imagining that this would be an experimental feature do you think it could make sense to let some users play around with the proposed workflow and address further UX simplification of cluster bootstrapping with work like #30360 or in a discussion originating out of sig-cluster-lifecycle if that's the appropriate place. I'm also wondering, @detiber if your desires could be addressed as an iteration on some future work to enable automated approval.

I agree with this sentiment. We are 90% to alpha for this feature. More discussion and revision at this point (which likely means holding this feature for another release) is going to be a net negative in the long run as we will lose out on a release of field testing. We've strapped alpha and experimental on this feature. Our hands aren't tied to incrementally improve this feature or rip it out altogether.

@yifan-gu yifan-gu force-pushed the tls_bootstrap_refactor branch from 9cd9c0d to 4120179 Compare August 18, 2016 23:24
resultCh, err := client.Watch(api.ListOptions{
Watch: true,
TimeoutSeconds: &defaultTimeoutSeconds,
FieldSelector: fields.OneTermEqualSelector("metadata.name", req.Name),
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With this being set, I got errors

Field selector: certificates/v1alpha1 - certificatesigningrequests - metadata.name - csr-dxmh8: need to check if this is versioned correctly.
Error: failed to run Kubelet: cannot watch on the certificate signing request: No field label conversion function found for version: certificates/v1alpha1
failed to run Kubelet: cannot watch on the certificate signing request: No field label conversion function found for version: certificates/v1alpha1

Not sure what's going on here? @liggitt

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Predicates for field selectors need to be written and registerd manually in the apiserver e.g. https://github.com/kubernetes/kubernetes/blob/master/pkg/registry/pod/strategy.go#L193

ref #1362 #28112

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed in #30950

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you can wait for #30950 or comment out FieldSelector: fields.OneTermEqualSelector("metadata.name", req.Name), with a TODO to re-enable (it'll be an inefficient watch, but the uid check will limit us to processing our own CSR)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@liggitt ACK

@yifan-gu
Copy link
Contributor Author

Manually tested (steps: https://gist.github.com/yifan-gu/b62588dab290bbb2f1d91e8791ae21d4 ), it works with one questions I posted above.

ObjectMeta: api.ObjectMeta{GenerateName: "csr-"},

// Username, UID, Groups will be injected by API server.
Spec: certificates.CertificateSigningRequestSpec{Request: csr},
Copy link
Member

@liggitt liggitt Aug 19, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TODO here for indicating in the Spec that this is a request for a cert with allowed usage of "TLS Web Client Authentication" ("client auth" in cfssl parlance)

@mikedanese @gtank, any update on how you'd like to express that in the CertificateSigningRequestSpec?

Copy link
Member

@mikedanese mikedanese Aug 19, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Until we have a concept of signing profiles, can we only allow client cert allowed use? It seems too late to design this.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we only allow client cert allowed use?

Client auth and signing which is useful during the approval challenge.

Copy link
Member

@liggitt liggitt Aug 19, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suppose that's fine... I'm sort of counting on the alpha-ness of the current API if we end up deciding some sort of spec.usage field should be required and there isn't a clear default value

Copy link
Member

@mikedanese mikedanese Aug 19, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we are going to surface raw allowed usage in the API, I don't see why we shouldn't just use the x509 extension. Would we ever sign a CSR without the allowed usages that it requested? Would we ever sign a CSR with allowed usages that it didn't request?

I was imaging that the "signing profile" would be closer to what we do with "storage-class" for PDs. For storage, you have silver, platinum, gold... For certs, you would have signature, client, server... The benefit is that configuration matrix (which is sparse w.r.t. valid configurations) is hidden from the user. If we imagine that the only thing we'll ever tweak in a signing profile is allowed usage, then signing profiles are not useful and we should expose allowed usage directly.

I think this conversation is very important but I don't think there is harm in deferring it. The development of the "experimental" api was motivated by allowing this type of experimentation so let's take advantage of the alpha-ness.

Copy link
Member

@liggitt liggitt Aug 19, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would we ever sign a CSR without the allowed usages that it requested?

Probably not

Would we ever sign a CSR with allowed usages that it didn't request?

Maybe (e.g. add in "signing" even if just client cert was requested)? not sure

I think this conversation is very important but I don't think there is harm in deferring it.

sure. given we can scope auto-approval to node requests for a cert of a certain shape, and that the kubelet isn't using the obtained cert for double-duty, I think I'd be ok leaving the default profile as-is. when we figure out how to express requested usage and map to appropriate signing profiles, we can tighten the auto-approval and the kubelet request and nothing else has to change.

Copy link

@gtank gtank Aug 19, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm still experimenting with this. Go's CertificateRequest support only knows about pkix.Extension types, which are pretty raw and- as far as I can tell- under-specified for expressing desired key usages in PKCS#10.

I agree that requested usage is a good thing to expose, but we shouldn't block this feature on something that might need upstream Go changes. I'll take a closer look at what openssl does when I get a chance.

}

if s.BootstrapKubeconfig != "" {
nodeName, err := getNodeName(kcfg.Cloud, nodeutil.GetHostname(s.HostnameOverride))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

won't kcfg be nil here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, thought I fixed this, maybe forgot to push.

// it will return an error.
func RequestClientCertificate(client unversionedcertificates.CertificateSigningRequestInterface, privateKeyData []byte, nodeName string) (certData []byte, err error) {
subject := &pkix.Name{
Organization: []string{"system:nodes"},
Copy link
Member

@mikedanese mikedanese Aug 19, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@liggitt remind me again, we use OrganizationalUnit for groups, CommonName for user. What is Organization used for? Or was it Organization for Group?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#30392 pulls from Organization

@yifan-gu yifan-gu force-pushed the tls_bootstrap_refactor branch 4 times, most recently from 14e08bc to 8780d84 Compare August 19, 2016 20:43
@liggitt liggitt added cla: human-approved release-note Denotes a PR that will be considered when it comes time to generate release notes. and removed cla: no labels Aug 19, 2016
@liggitt liggitt added this to the v1.4 milestone Aug 19, 2016
Yifan Gu added 2 commits August 19, 2016 13:51
Add --bootstrap-kubeconfig flag to kubelet. If the flag is non-empty
and --kubeconfig doesn't exist, then the kubelet will use the bootstrap
kubeconfig to create rest client and generate certificate signing request
to request a client cert from API server.

Once succeeds, the result cert will be written down to
--cert-dir/kubelet-client.crt, and the kubeconfig will be populated with
certfile, keyfile path pointing to the result certificate file, key file.
(The key file is generated before creating the CSR).
Since the function only tests whether the files are on the disk,
the original name is a little bit misleading.
@yifan-gu yifan-gu force-pushed the tls_bootstrap_refactor branch from 8780d84 to 9950499 Compare August 19, 2016 20:51
@liggitt
Copy link
Member

liggitt commented Aug 19, 2016

changes LGTM. @yifan-gu, let us know when a manual test is complete. @mikedanese has final LGTM

@yifan-gu
Copy link
Contributor Author

@mikedanese How to get the cla bot green, should @liggitt @gtank reply the bot?

@liggitt
Copy link
Member

liggitt commented Aug 19, 2016

the cla:human-approved label will satisfy the merge bot

@yifan-gu
Copy link
Contributor Author

Tested manually after squashing and pulling #30950. Works fine :)

Move bootstrap functions to separate files.
Split some of the functions into small sub-functions for reusability.
Other cleanups
@yifan-gu yifan-gu force-pushed the tls_bootstrap_refactor branch from 9950499 to 26a6623 Compare August 19, 2016 22:27
@mikedanese
Copy link
Member

/lgtm

@yifan-gu yifan-gu added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Aug 19, 2016
@k8s-bot
Copy link

k8s-bot commented Aug 20, 2016

GCE e2e build/test passed for commit 26a6623.

@k8s-github-robot
Copy link

@k8s-bot test this [submit-queue is verifying that this PR is safe to merge]

@k8s-bot
Copy link

k8s-bot commented Aug 21, 2016

GCE e2e build/test passed for commit 26a6623.

@k8s-github-robot
Copy link

Automatic merge from submit-queue

@k8s-github-robot k8s-github-robot merged commit a41e6e3 into kubernetes:master Aug 21, 2016
@yifan-gu yifan-gu deleted the tls_bootstrap_refactor branch August 23, 2016 17:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
lgtm "Looks good to me", indicates that a PR is ready to be merged. release-note Denotes a PR that will be considered when it comes time to generate release notes. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants