-
Notifications
You must be signed in to change notification settings - Fork 4.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
kube-apiserver: healthcheck via sidecar container #9069
Conversation
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: justinsb The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
2ebe0c7
to
e755771
Compare
Makefile
Outdated
mkdir -p ${BAZELIMAGES} | ||
DOCKER_REGISTRY="" DOCKER_IMAGE_PREFIX="kope/" KUBE_APISERVER_HEALTHCHECK_TAG=${KUBE_APISERVER_HEALTHCHECK_TAG} bazel build ${BAZEL_CONFIG} --platforms=@io_bazel_rules_go//go/toolchain:linux_amd64 //cmd/kube-apiserver-healthcheck:image-bundle.tar.gz //cmd/kube-apiserver-healthcheck:image-bundle.tar.gz.sha1 //cmd/kube-apiserver-healthcheck:image-bundle.tar.gz.sha256 | ||
cp -fp bazel-bin/cmd/kube-apiserver-healthcheck/image-bundle.tar.gz ${BAZELIMAGES}/kube-apiserver-healthcheck.tar.gz | ||
cp -fp bazel-bin/cmd/kube-apiserver-healthcheck/image-bundle.tar.gz.sha1 ${BAZELIMAGES}/kube-apiserver-healthcheck.tar.gz.sha1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need to create and ship SHA-1 hashes?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe it's historical and we began migrating to sha256 about a year ago. I'm not sure on the specifics, we can probably talk about a deprecation at some point, however I don't think this PR is the right time to do it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good call - it's a new file, let's try removing the sha1, then we can find out if anyone is still using it (they shouldn't be!)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oops - our responses crossed. The sha256 support (eca2ac6) should be in kops 1.15, so I am going ahead and removing the sha1. But if it gives us trouble, let's add it back ...
resp, err := httpClient.Do(req) | ||
if err != nil { | ||
klog.Infof("error from %s: %v", target, err) | ||
http.Error(w, "internal error", http.StatusInternalServerError) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
http.StatusBadGateway
would be more appropriate.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good call!
func (s *healthCheckServer) handler(w http.ResponseWriter, r *http.Request) { | ||
path := r.URL.Path | ||
path = strings.TrimPrefix(path, "/") | ||
tokens := strings.Split(path, "/") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My druthers would be to just select(r.URL.Path)
and leave out any parsing of the path. There's a risk of a discrepancy between the parsing here and the parsing in proxyRequest
, leading to request injection vulnerabilities.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea - generally reworked to avoid all parsing!
func (s *healthCheckServer) httpClient() *http.Client { | ||
transport := &http.Transport{ | ||
TLSClientConfig: s.tlsConfig, | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems expensive to create a new Transport
for each incoming query. Shouldn't this be one per healthCheckServer
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The fear is that they contaminate each other (e.g. cookies), but given our use case, you're right - simplifying.
signerCertificate = cert | ||
} | ||
|
||
privateKey, err := NewPrivateKey(2048) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please use pki.GeneratePrivateKey()
and don't hardcode the key size.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was thinking about creating simpler methods, but looking at pki, that's exactly what you've already done!
} | ||
|
||
klog.Infof("signing certificate for %q", id) | ||
cert, err := NewSignedClientCert(id, privateKey, signerCertificate.Certificate, signer) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use pki.SignNewCertificate()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will do!
@@ -164,6 +178,13 @@ func (a *AssetBuilder) RemapImage(image string) (string, error) { | |||
} | |||
} | |||
|
|||
if strings.HasPrefix(image, "kope/kube-apiserver-healthcheck:") { | |||
override := os.Getenv("KUBE_APISERVER_HEALTHCHECK_IMAGE") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add export KUBE_APISERVER_HEALTHCHECK_IMAGE=
to hack/update-expected.sh
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch!
}) | ||
} | ||
|
||
clientKey, clientCert, err := b.buildClientKeypair(id) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Have buildClientKeypair()
return a pki.Certificate
and then use its AsBytes()
. Use the pattern in KubeletBuilder
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks - done!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Most of this looks good to me. Holding on the lgtm since John had some comments, however I think some of those can be cleaned up over time as needed.
Makefile
Outdated
mkdir -p ${BAZELIMAGES} | ||
DOCKER_REGISTRY="" DOCKER_IMAGE_PREFIX="kope/" KUBE_APISERVER_HEALTHCHECK_TAG=${KUBE_APISERVER_HEALTHCHECK_TAG} bazel build ${BAZEL_CONFIG} --platforms=@io_bazel_rules_go//go/toolchain:linux_amd64 //cmd/kube-apiserver-healthcheck:image-bundle.tar.gz //cmd/kube-apiserver-healthcheck:image-bundle.tar.gz.sha1 //cmd/kube-apiserver-healthcheck:image-bundle.tar.gz.sha256 | ||
cp -fp bazel-bin/cmd/kube-apiserver-healthcheck/image-bundle.tar.gz ${BAZELIMAGES}/kube-apiserver-healthcheck.tar.gz | ||
cp -fp bazel-bin/cmd/kube-apiserver-healthcheck/image-bundle.tar.gz.sha1 ${BAZELIMAGES}/kube-apiserver-healthcheck.tar.gz.sha1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe it's historical and we began migrating to sha256 about a year ago. I'm not sure on the specifics, we can probably talk about a deprecation at some point, however I don't think this PR is the right time to do it.
97b3f82
to
8a99603
Compare
Scheme: "https", | ||
Host: "127.0.0.1", | ||
Path: r.URL.Path, | ||
RawQuery: r.URL.RawQuery, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you really want to pass through an unvetted RawQuery
? It seems risky to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So it's not trivial.. https://github.com/kubernetes/kubernetes/blob/08e1fd3bb947faf465e8a67d5c7106dbd10840c0/cluster/gce/manifests/kube-apiserver.manifest#L39
I don't know what validation we could / should do here. We could whitelist some queries I guess but I think that'll confuse people more. I imagine two things are coming:
- exposing this for use as a load balancer health-check (i.e. publicly)
- using the richer readyz / livez queries as in the upstream manifest
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we need to parse the query and only pass through parameters we recognize as safe.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking at apiserver code, I see at least exclude
and verbose
, but that http.Request
gets passed around all over the place.
So it's a risk assessment as to how likely it is that apiserver would have or later add an important authorization check on a parameter to those paths.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about we parse and pass through exclude
? I worry about information leakage from verbose
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great idea - I implemented that. I also made it more testable and added a test!
// KeyEncipherment allows the cert/key pair to be used to encrypt | ||
// keys, including the symmetric keys negotiated during TLS setup | ||
// and used for data transfer. | ||
template.KeyUsage = template.KeyUsage | x509.KeyUsageKeyEncipherment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
By the way, x509.KeyUsageDigitalSignature | x509.KeyUsageKeyEncipherment
is the default for pki.SignNewCertificate
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah - cool. I suggest leaving it for now? I think this function and the similar one to generate a kubeconfig end up being utils.
You can also see the beginnings of a thought I'm exploring here, which is putting less logic into nodeup. The only parameter here is really the path and the CN, we could imagine this just being in the NodeUpConfig (or similar), meaning we would have more insight into exactly what nodeup does (because it would do less)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can leave for now. I'll look into extracting a function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks - I don't think it's urgent, I just figured that's one way we could go, so it might not matter too much if we only have two of these (kubeconfig & "raw")
8a34253
to
6b11d8b
Compare
Cleaned up the request processing in the proxy & added some tests |
t.Fatalf("unexpected method %q", out.Method) | ||
} | ||
actual = out.URL.String() | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test wouldn't catch a bug where mapToProxyRequest
returns an Request
containing a URL of the empty string when it should be returning nil
.
kube-apiserver doesn't expose the healthcheck via a dedicated endpoint, instead relying on anonyomous-access being enabled. That has previously forced us to enable the unauthenticated endpoint on 127.0.0.1:8080. Instead we now run a small sidecar container, which proxies /healthz and /readyz requests (only) adding appropriate authentication using a client certificate. This will also enable better load balancer checks in future, as these have previously been hampered by the custom CA certificate. Co-authored-by: John Gardiner Myers <[email protected]>
/test pull-kops-e2e-kubernetes-aws |
/lgtm |
Cherry pick of #9069 onto release-1.17
kube-apiserver doesn't expose the healthcheck via a dedicated
endpoint, instead relying on anonyomous-access being enabled. That
has previously forced us to enable the unauthenticated endpoint on
127.0.0.1:8080.
Instead we now run a small sidecar container, which
proxies /healthz and /readyz requests (only) adding appropriate
authentication using a client certificate.
This will also enable better load balancer checks in future, as these
have previously been hampered by the custom CA certificate.