-
Notifications
You must be signed in to change notification settings - Fork 40k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Switch vendored etcd to v3.5.6 #114403
Switch vendored etcd to v3.5.6 #114403
Conversation
Signed-off-by: Davanum Srinivas <[email protected]>
@dims: This issue is currently awaiting triage. If a SIG or subproject determines this is a relevant issue, they will accept it by applying the The Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
/sig architecture |
/priority important-soon |
/assign @liggitt |
ONE more down!!
|
/assign @serathius |
/lgtm |
return (strings.Compare(cancelReasonError.Error(), v3rpc.ErrGRPCInvalidAuthToken.Error()) == 0) || | ||
(strings.Compare(cancelReasonError.Error(), v3rpc.ErrGRPCAuthOldRevision.Error()) == 0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is stringifying the parameter error twice when it doesn't need to, and is stringifying constant API errors twice when they will never change
it's also unclear that exact string comparisons are correct for these... is there zero chance they will get wrapped by other errors that will add prefixes or suffixes? I'm surprised this isn't checking errors in a typed way
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cc @ahrtr @serathius
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is stringifying the parameter error twice
Indeed there is no need to stringify the error twice. I just delivered a PR to improve this: etcd-io/etcd#14935
Thanks @liggitt for pointing this out.
it's also unclear that exact string comparisons are correct for these.
It's correct here. etcd defines a set of gRPC errors on server side. We need to translate the error to a string before responding to the client side.
The logic here is that we need to retry on some error types (ErrGRPCInvalidAuthToken
and ErrGRPCAuthOldRevision
) on client side. So we need to compare the error message. Note that we have strict error messages definition, there is no any wrap & unwrap.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
etcd defines a set of gRPC errors on server side. We need to translate each error to a string before responding to the client side.
I'm really surprised the client is expected to compare error message strings to drive behavior. Isn't that fragile with skewed client and server versions?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm really surprised the client is expected to compare error message strings to drive behavior. Isn't that fragile with skewed client and server versions?
This is a valid concern. This should be a technical debt. We may consider to define client side codes in future.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will kick off a discussion in the etcd community later, get you CCed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FYI. etcd-io/etcd#14992
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FYI. etcd-io/etcd#14992
@ahrtr will you be doing a v3.5.x release with that change?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@dims Yes, will do it for 3.5 today.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FYI. etcd-io/etcd#14995
cancelReasonError := v3rpc.Error(errors.New(pbresp.CancelReason)) | ||
if shouldRetryWatch(cancelReasonError) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this looks really weird... is this handling a canceled flow in the created case? is that correct?
even if that's correct, it looks like this is triggering even in non-error cases (pbresp.CancelReason == "") - it's always creating a new error, a new rpc error, and calling shouldRetryWatch
(I also commented in that method that stringification is happening more than it should)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cc @ahrtr @serathius
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See my comment above #114403 (comment)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if pbresp.CancelReason is empty, shouldn't we avoid all of this added code?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Resolved.
/hold for a question on the etcd client change to the watch Created handling path |
/retest |
/lgtm Thanks @dims |
@ahrtr: changing LGTM is restricted to collaborators In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: ahrtr, dims, serathius The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
/remove-sig api-machinery |
In order to fix etcd-io#12385, PR etcd-io#14322 introduced a change in which the client side may retry based on the error message returned from server side. This is not good, as it's too fragile and it's also changed the protocol between client and server. Please see the discussion in kubernetes/kubernetes#114403 Note: The issue etcd-io#12385 only happens when auth is enabled, and client side reuse the same client to watch. So we decided to rollback the change on 3.5, reasons: 1.K8s doesn't enable auth at all. It has no any impact on K8s. 2.It's very easy for client application to workaround the issue. The client just needs to create a new client each time before watching. Signed-off-by: Benjamin Wang <[email protected]>
/test pull-kubernetes-dependencies |
@dims: The following test failed, say
Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here. |
@dims: PR needs rebase. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
In order to fix etcd-io#12385, PR etcd-io#14322 introduced a change in which the client side may retry based on the error message returned from server side. This is not good, as it's too fragile and it's also changed the protocol between client and server. Please see the discussion in kubernetes/kubernetes#114403 Note: The issue etcd-io#12385 only happens when auth is enabled, and client side reuse the same client to watch. So we decided to rollback the change on 3.5, reasons: 1.K8s doesn't enable auth at all. It has no any impact on K8s. 2.It's very easy for client application to workaround the issue. The client just needs to create a new client each time before watching. Signed-off-by: Benjamin Wang <[email protected]>
In order to fix etcd-io#12385, PR etcd-io#14322 introduced a change in which the client side may retry based on the error message returned from server side. This is not good, as it's too fragile and it's also changed the protocol between client and server. Please see the discussion in kubernetes/kubernetes#114403 Note: The issue etcd-io#12385 only happens when auth is enabled, and client side reuse the same client to watch. So we decided to rollback the change on 3.5, reasons: 1.K8s doesn't enable auth at all. It has no any impact on K8s. 2.It's very easy for client application to workaround the issue. The client just needs to create a new client each time before watching. Signed-off-by: Benjamin Wang <[email protected]>
We updated etcd server version to v3.5.6 in #114093
In this PR, we update the vendored dependency as well.
Related to #104007 as well, we got rid of one more redundant dependency
form3tech-oss/jwt-go
Signed-off-by: Davanum Srinivas [email protected]
What type of PR is this?
/kind cleanup
What this PR does / why we need it:
Which issue(s) this PR fixes:
Fixes #
Special notes for your reviewer:
Does this PR introduce a user-facing change?
Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.: