-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
gRPC traffic is not retried because there is no content-length header #7141
Milestone
Comments
hawkw
added a commit
to linkerd/linkerd2-proxy
that referenced
this issue
Oct 29, 2021
Currently, the proxy will retry requests with bodies if and only if they have a `Content-Length` header with a value <= 64 KB. If the request has a body but no `Content-Length` header, we currently assume that its body will exceed the maximum buffered size, and skip trying to retry it. However, this means gRPC requests will never be retried, because it turns out gRPC requests don't include a `Content-Length` header (see linkerd/linkerd2#7141). Whoops! This PR fixes this by changing the retry logic so that `Content-Length` is treated as a _hint_, rather than a requirement. Now, we will preemptively skip retrying requests with bodies when the there is a `Content-Length` header with a value that exceeds the limit, but if no `Content-Length` header is present, we will still attempt to buffer the body for a potential retry. We are still protected against unbounded buffering because the buffering body will stop buffering and discard any previously buffered data if the buffered length ever exceeds the maximum. Requiring a `Content-Length` simply allowed us to skip _trying_. This should probably fix linkerd/linkerd2#7141, although it might be worth testing with a non-Rust gRPC implementation to make sure it works?
olix0r
added a commit
to linkerd/linkerd2-proxy
that referenced
this issue
Nov 1, 2021
Currently, the proxy will retry requests with bodies if and only if they have a `content-length` header with a value <= 64 KB. If the request has a body but no `content-length` header, we currently assume that its body will exceed the maximum buffered size, and skip trying to retry it. However, this means gRPC requests will never be retried, because it turns out gRPC requests don't include a `content-length` header (see linkerd/linkerd2#7141). Whoops! This PR fixes this by changing the retry logic to use `Body::size_hint` to determine if buffering should be attempted. This value will be set from `content-length` when it is set and may be set in additional situations where the body length is known before the request is processed. We are still protected against unbounded buffering because the buffering body will stop buffering and discard any previously buffered data if the buffered length ever exceeds the maximum. Co-authored-by: Oliver Gould <[email protected]>
Sign up for free
to subscribe to this conversation on GitHub.
Already have an account?
Sign in.
What is the issue?
While testing gRPC retries using the emojivoto application, I found that the gRPC requests between the web and voting services are not retryable because the gRPC protocol doesn't have a content-length header, which is part of the logic that is used to determine whether a POST request is retryable.
How can it be reproduced?
linkerd install | kubectl apply -f -
kubectl apply -f https://run.linkerd.io/emojivoto.yml
kubectl apply -f https://raw.githubusercontent.com/BuoyantIO/emojivoto/main/training/service-profiles/voting-svc-profile.yml
linkerd=trace,info
for the web pod: docskubectl port-forward svc/web -n emojivoto 8080:80
curl http://localhost:8080/api/vote?choice=:doughnut:
Logs, error output, etc
linkerd check
outputEnvironment
Possible solution
I'm still researching the gRPC protocol, so no proposed solution. After discussing with the Linkerd maintainers, there are a few things to take into consideration for a solution:
Message-Length
field as part of a DATA frame, but there may be a more reliable way to cover both H1 POST requests and gRPC requestsThe text was updated successfully, but these errors were encountered: