
Activator endpoint probing can race with pre-stop hook #9355

Closed
julz opened this issue Sep 10, 2020 · 5 comments
Labels
kind/bug Categorizes issue or PR as related to a bug.

Comments

@julz
Member

julz commented Sep 10, 2020

(writing up some Slack conversations).

wordy version: the activator has logic to probe the queue-proxy (QP) when it sees a pod go into the notReady state; if the probe succeeds it treats the pod as ready regardless of its reported status. This is particularly good at startup because it means we can see the pod is ready and start forwarding traffic before etcd and the Kubernetes Rube Goldberg machine have caught up. However, during termination it is possible for the activator to see the pod go notReady before the QP has received the pre-stop signal (the endpoints controller and the kubelet each watch for the Terminating state independently, so the pre-stop signal is not guaranteed to reach the QP before the pod goes notReady in the activator's endpoints lister, see https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle/#container-probes). In that case we'll never actually consider the pod notReady (because we saw a good probe), even though it's now terminating and will at some point start failing the requests we forward to it.

tl;dr: activator probing of notReady pods should be resilient to the case where the QP hasn't yet received the pre-stop hook.

potential fix: the probing logic is only really needed at startup. When a pod goes from never-seen-before to notReady we should probe to keep cold start times as low as possible, but when it transitions ready->notReady we can't distinguish readiness blips from terminating pods, and should err on the side of avoiding the race by waiting for the k8s API to report it ready.
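To illustrate, here's a minimal sketch of that gating (hypothetical names, not the actual activator code; it assumes endpoints are keyed by pod IP and glosses over the bookkeeping concerns raised in the comments below):

```go
package activator

import "sync"

// probeGate sketches the proposed behaviour: only notReady endpoints we have
// never observed as ready get probed, so a ready->notReady transition (e.g. a
// terminating pod whose QP hasn't received the pre-stop hook yet) falls back
// to whatever the k8s API reports.
type probeGate struct {
	mu        sync.Mutex
	seenReady map[string]struct{} // keyed by endpoint IP (assumed)
}

func newProbeGate() *probeGate {
	return &probeGate{seenReady: map[string]struct{}{}}
}

// markReady records that an endpoint has been observed ready at least once.
func (g *probeGate) markReady(ip string) {
	g.mu.Lock()
	defer g.mu.Unlock()
	g.seenReady[ip] = struct{}{}
}

// shouldProbe reports whether a notReady endpoint is worth probing:
// true only for endpoints never seen ready (cold start), false otherwise.
func (g *probeGate) shouldProbe(ip string) bool {
	g.mu.Lock()
	defer g.mu.Unlock()
	_, wasReady := g.seenReady[ip]
	return !wasReady
}
```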

note: I found this while looking at a bug where, intermittently, sending a lot of curls in a row produces errors while things are scaling down. I haven't verified yet that this race is the problem there -- but it does seem like a problem :).

@julz julz added the kind/bug label Sep 10, 2020
@vagababov
Contributor

For the full picture -- we can skip re-probing those, but:

  1. there's always a race anyway -- you've just moved it a bit
  2. after the second iteration of the endpoints changing we already have no idea whether this is a new endpoint that isn't ready yet or one that was ready and became notReady (sure, we could try to bookkeep that, but think of activator restart/re-subsetting)

@markusthoemmes
Contributor

We had a bit of a discussion in Slack today, so here are more datapoints:

When a pod is deleted (i.e. scaled down) it isn't actually moved into the notReady addresses but rather removed from the endpoints completely, without that intermediate step (see https://github.com/kubernetes/kubernetes/blob/119c94214c8b11a9f585557bff49bef26faf88b1/pkg/controller/endpoint/endpoints_controller.go#L415-L418).
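To make the distinction concrete, here's a small illustrative helper (standard corev1.Endpoints types, not Knative code) showing the two address lists a lister handler sees; per the controller code linked above, a scaled-down pod simply vanishes from both rather than moving into NotReadyAddresses first:

```go
package activator

import corev1 "k8s.io/api/core/v1"

// readyAndNotReady splits an Endpoints object into the ready and notReady
// pod IPs an endpoints lister would observe.
func readyAndNotReady(ep *corev1.Endpoints) (ready, notReady []string) {
	for _, subset := range ep.Subsets {
		for _, addr := range subset.Addresses {
			ready = append(ready, addr.IP)
		}
		for _, addr := range subset.NotReadyAddresses {
			notReady = append(notReady, addr.IP)
		}
	}
	return ready, notReady
}
```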

With that in mind, is this even still a problem? There is somewhat of a race, sure, but since we wouldn't be re-probing and reconsidering such pods, I'd think the race is no better (or worse) than the race seen in any load balancer when removing a pod, and our shutdown grace period should absorb any errors here.

@vagababov
Contributor

Yeah, and there's always a race due to informers not being instantaneous anyway...

@julz
Member Author

julz commented Sep 14, 2020

yah, thinking about this there's a race, but given terminating pods exit the endpoints immediately without hitting notReady, I don't think it's nearly so important a race (and, selfishly, I don't think it's the race causing the problem I was initially investigating :D). I'm going to keep digging a little into this code path, but gonna close this for now since I don't think this would be a worthwhile change, given our current understanding.

/close

@knative-prow-robot
Contributor

@julz: Closing this issue.

In response to this:

yah, thinking about this there's a race, but given terminating pods exit the endpoints immediately without hitting notReady, I don't think it's nearly so important a race (and, selfishly, I don't think it's the race causing the problem I was initially investigating :D). I'm going to keep digging a little into this code path, but gonna close this for now since I don't think this would be a worthwhile change, given our current understanding.

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
