Improve error handling of GCE Attach Errors #298

davidz627 · 2019-06-11T23:32:25Z

Depending on the type of error GCE returns, ControllerPublishVolume should return:
"ABORTED" for operation-pending errors
"RESOURCE_EXHAUSTED" for max-volumes-attached errors
"FAILED_PRECONDITION" for volume-already-attached-to-another-node errors

Right now we just return an INTERNAL error for any GCE attach error during ControllerPublishVolume.
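
The mapping proposed above could be sketched roughly as follows. This is only an illustration in Python, not the driver's code: `AttachErrorKind` and `code_for_attach_error` are hypothetical names, and how a raw GCE error gets classified into one of these kinds is assumed to happen elsewhere. The numeric values are the standard gRPC status codes.

```python
from enum import Enum, IntEnum


class GrpcCode(IntEnum):
    """Subset of the standard gRPC status codes, with their numeric values."""
    RESOURCE_EXHAUSTED = 8
    FAILED_PRECONDITION = 9
    ABORTED = 10
    INTERNAL = 13


class AttachErrorKind(Enum):
    """Hypothetical classification of a GCE attach failure (an assumption)."""
    OPERATION_PENDING = "operation_pending"
    MAX_VOLUMES_ATTACHED = "max_volumes_attached"
    ATTACHED_TO_OTHER_NODE = "attached_to_other_node"
    UNKNOWN = "unknown"


# Mapping taken from the list in this issue.
_CODE_BY_KIND = {
    AttachErrorKind.OPERATION_PENDING: GrpcCode.ABORTED,
    AttachErrorKind.MAX_VOLUMES_ATTACHED: GrpcCode.RESOURCE_EXHAUSTED,
    AttachErrorKind.ATTACHED_TO_OTHER_NODE: GrpcCode.FAILED_PRECONDITION,
}


def code_for_attach_error(kind: AttachErrorKind) -> GrpcCode:
    """Return the code for a classified attach error, INTERNAL if unrecognized."""
    return _CODE_BY_KIND.get(kind, GrpcCode.INTERNAL)
```

Falling back to INTERNAL keeps today's behavior for any error that is not explicitly recognized.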

/cc @msau42

hantaowang · 2019-06-20T00:24:39Z

For attach at least, it seems the codes are not actually used by the isFinalError logic, only for logging/events.

From here, it seems that RESOURCE_EXHAUSTED is meant for the case where too many devices are being attached at the same time, rather than the case where the node has reached its maximum number of devices.

davidz627 · 2019-06-20T18:00:24Z

According to the spec:

Condition: Max volumes attached
gRPC code: 8 RESOURCE_EXHAUSTED
Description: Indicates that the maximum supported number of volumes that can be attached to the specified node are already attached. Therefore, this operation will fail until at least one of the existing attached volumes is detached from the node.
Recovery behavior: Caller MUST ensure that the number of volumes already attached to the node is less then the maximum supported number of volumes before retrying with exponential backoff.
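
The recovery behavior quoted above (check the attached-volume count, then retry with exponential backoff) might look roughly like this on the caller side. It is a sketch under assumptions: `retry_attach`, `backoff_delays`, and the `attach` / `attached_count` callbacks are hypothetical names, and the delay parameters are arbitrary.

```python
import time
from typing import Callable, Iterator


def backoff_delays(initial: float = 1.0, factor: float = 2.0,
                   cap: float = 60.0) -> Iterator[float]:
    """Yield exponentially growing delays, never exceeding `cap`."""
    delay = min(initial, cap)
    while True:
        yield delay
        delay = min(delay * factor, cap)


def retry_attach(attach: Callable[[], bool],
                 attached_count: Callable[[], int],
                 max_volumes: int,
                 max_attempts: int = 5,
                 sleep: Callable[[float], None] = time.sleep) -> bool:
    """Call `attach` only while the node is below its volume limit.

    Returns True once `attach` succeeds, False after `max_attempts` tries.
    """
    delays = backoff_delays()
    for attempt in range(max_attempts):
        # Precondition from the spec text: fewer volumes attached than the max.
        if attached_count() < max_volumes and attach():
            return True
        if attempt < max_attempts - 1:
            sleep(next(delays))
    return False
```

Passing `sleep` as a parameter keeps the sketch testable without real waiting.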

Looks like the external-attacher may be interpreting the error incorrectly. Maybe we should create an issue (or fix) there.

And no matter what the external-attacher does, we need to conform to the spec (or change the spec if it is wrong). The spec is the official contract we support, and other entities could interact with our driver in unforeseen or unknown ways that depend on spec functionality.

davidz627 · 2019-06-20T18:01:46Z

In fact, I would think that isFinalError should be true in the RESOURCE_EXHAUSTED case.
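
To make the suggestion concrete, a final-error check keyed on the status code could be sketched as below. This is not the external-attacher's actual isFinalError implementation, which is not shown in this thread; `is_final_error` and `FINAL_CODES` are hypothetical, and treating only RESOURCE_EXHAUSTED as final is an assumption for illustration.

```python
from enum import IntEnum
from typing import AbstractSet


class GrpcCode(IntEnum):
    """Subset of the standard gRPC status codes, with their numeric values."""
    RESOURCE_EXHAUSTED = 8
    FAILED_PRECONDITION = 9
    ABORTED = 10
    INTERNAL = 13


# Assumption: only the code discussed in this comment is treated as final.
FINAL_CODES = frozenset({GrpcCode.RESOURCE_EXHAUSTED})


def is_final_error(code: GrpcCode,
                   final_codes: AbstractSet[GrpcCode] = FINAL_CODES) -> bool:
    """Return True if the attach attempt should be considered finished."""
    return code in final_codes
```

Keeping the set of final codes as a parameter makes the policy easy to adjust if the spec or the attacher's interpretation changes.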

fejta-bot · 2019-12-17T03:25:35Z

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

davidz627 · 2019-12-18T22:14:52Z

/remove-lifecycle stale

fejta-bot · 2020-03-17T23:18:33Z

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

msau42 · 2020-03-17T23:28:42Z

/lifecycle frozen

hantaowang mentioned this issue Jun 26, 2019: REQUEST: New membership for @hantaowang kubernetes/org#961 (Closed, 6 tasks)

davidz627 added this to the Post-GA milestone Sep 18, 2019

k8s-ci-robot added the lifecycle/stale label Dec 17, 2019

k8s-ci-robot removed the lifecycle/stale label Dec 18, 2019

k8s-ci-robot added the lifecycle/stale label Mar 17, 2020

k8s-ci-robot added the lifecycle/frozen label and removed the lifecycle/stale label Mar 17, 2020
