Tolerate InvalidInstanceID.NotFound when deleting instances #594
Merged
Conversation
We treat this as instance-already-deleted, i.e. not an error. Fix kubernetes#592
hwoarang added a commit to hwoarang/kops that referenced this pull request on Nov 25, 2020
Sometimes we see the following error at the end of a rolling update:

I1125 18:37:57.161591 165 instancegroups.go:419] Cluster validated; revalidating in 10s to make sure it does not flap.
I1125 18:38:08.536470 165 instancegroups.go:416] Cluster validated.
error deleting instance "i-XXXXXXXXXXX", node "ip-XXX-XXX-XXX-XXX.XXXXXXX.compute.internal": error deleting instance "i-XXXXXXXXXXXXXX": InvalidInstanceID.NotFound: The instance ID 'i-XXXXXXXXXXX' does not exist
status code: 400, request id: 91238c21-1caf-41eb-91d7-534d4ca67ed0

It's possible that the EC2 instance had disappeared by the time it was detached (for example, it may have been a spot instance). In any case, we can't do much when we do not find an instance ID, and breaking the update because of that is not user friendly. As such, we can simply report and tolerate this problem instead of exiting with a non-zero code. This is similar to how we handle missing EC2 instances when updating an IG [1].

[1] kubernetes#594
hwoarang added a commit to hwoarang/kops that referenced this pull request on Nov 25, 2020
hwoarang added a commit to hwoarang/kops that referenced this pull request on Nov 25, 2020
Sometimes we see the following error at the end of a rolling update:

I1125 18:12:46.467059 165 instancegroups.go:340] Draining the node: "ip-X-X-X-X.X.compute.internal".
I1125 18:12:46.473365 165 instancegroups.go:359] deleting node "ip-X-X-X-X.X.compute.internal" from kubernetes
I1125 18:12:46.476756 165 instancegroups.go:486] Stopping instance "i-XXXXXXXX", node "ip-X-X-X-X.X.compute.internal", in group "X" (this may take a while).
E1125 18:12:46.523269 165 instancegroups.go:367] error deleting instance "i-XXXXXXXX", node "ip-X-X-X-X.X.compute.internal": error deleting instance "i-XXXXXXXX", node "ip-X-X-X-X.X.compute.internal": error deleting instance "i-XXXXXXXX": InvalidInstanceID.NotFound: The instance ID 'i-XXXXXXXXX' does not exist
status code: 400, request id: 91238c21-1caf-41eb-91d7-534d4ca67ed0

It's possible that the EC2 instance had disappeared by the time it was detached (it may have been a spot instance, for example). In any case, we can't do much when we do not find an instance ID, and throwing this error during the update is not very user friendly. As such, we can simply report and tolerate this problem instead of exiting with a non-zero code. This is similar to how we handle missing EC2 instances when updating an IG [1].

[1] kubernetes#594
hwoarang added a commit to hwoarang/kops that referenced this pull request on Nov 26, 2020
Sometimes we see the following error during a rolling update:

I1125 18:12:46.467059 165 instancegroups.go:340] Draining the node: "ip-X-X-X-X.X.compute.internal".
I1125 18:12:46.473365 165 instancegroups.go:359] deleting node "ip-X-X-X-X.X.compute.internal" from kubernetes
I1125 18:12:46.476756 165 instancegroups.go:486] Stopping instance "i-XXXXXXXX", node "ip-X-X-X-X.X.compute.internal", in group "X" (this may take a while).
E1125 18:12:46.523269 165 instancegroups.go:367] error deleting instance "i-XXXXXXXX", node "ip-X-X-X-X.X.compute.internal": error deleting instance "i-XXXXXXXX", node "ip-X-X-X-X.X.compute.internal": error deleting instance "i-XXXXXXXX": InvalidInstanceID.NotFound: The instance ID 'i-XXXXXXXXX' does not exist
status code: 400, request id: 91238c21-1caf-41eb-91d7-534d4ca67ed0

It's possible that the EC2 instance had disappeared by the time it was detached (it may have been a spot instance, for example). In any case, we can't do much when we do not find an instance ID, and throwing this error during the update is not very user friendly. As such, we can simply report and tolerate this problem instead of exiting with a non-zero code. This is similar to how we handle missing EC2 instances when updating an IG [1].

[1] kubernetes#594
hakman pushed a commit to hakman/kops that referenced this pull request on Nov 26, 2020
hakman pushed a commit to hakman/kops that referenced this pull request on Nov 26, 2020