Add VolumeAttachment Lister to CSI Provisioner #438
Conversation
Welcome @RaunakShah!
Hi @RaunakShah. Thanks for your PR. I'm waiting for a kubernetes-csi member to verify that this patch is reasonable to test. If it is, they should reply with `/ok-to-test`. Once the patch is verified, the new status will be reflected by the `ok-to-test` label. I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
Force-pushed from f87fbaa to 2d1965f.
/ok-to-test
/assign @xing-yang
/retest
cc @jsafrane
I see a few e2e tests failing, complaining about PVs not being deleted. We probably need to fix the e2e tests first.
Checking whether the e2e failure was intermittent or caused by this change.
/lgtm
Addresses #84226. This change prevents calling DeleteVolume on the CSI plugin for volumes that are still attached to a Kubernetes node.
@@ -1035,6 +1038,18 @@ func (p *csiProvisioner) Delete(volume *v1.PersistentVolume) error {
	ctx, cancel := context.WithTimeout(context.Background(), p.timeout)
	defer cancel()

	// Verify that the volume is not attached to a node before proceeding with deletion
	vaList, err := p.vaLister.List(labels.Everything())
VolumeAttachment objects will number in the thousands in a scaled environment. Listing all VA objects in every Delete callback is quite expensive. Moreover, Delete gets invoked continuously until the volume is detached, causing further degradation. Can we avoid such an expensive call to the API server?
`vaLister` lists VolumeAttachments from the in-memory informer cache. "Thousands" should be OK; "millions" could be problematic.
Does the informer limit how many objects are kept in the cache? How can we be sure that all VolumeAttachment objects will fit into memory? If the lister ends up hitting the API server, this will introduce a performance regression, hence my concern with this code.
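For context on this performance concern: a client-go lister serves reads from the local informer cache, so no API-server round trip happens per Delete call. The remaining cost is the O(n) scan over every cached VolumeAttachment on each callback. One way to avoid that (not what this PR does, and shown here with hypothetical simplified types rather than the real `storagev1.VolumeAttachment`) would be to key the attachments by PV name so each Delete is a single map lookup:

```go
package main

import "fmt"

// Hypothetical, simplified stand-in for the real API type; the actual
// provisioner uses storagev1.VolumeAttachment from k8s.io/api.
type VolumeAttachment struct {
	PersistentVolumeName *string
	NodeName             string
}

// indexByPV builds a map from PV name to the attachments that reference it,
// so each Delete callback costs one map lookup instead of a full list scan.
func indexByPV(vas []VolumeAttachment) map[string][]VolumeAttachment {
	idx := make(map[string][]VolumeAttachment)
	for _, va := range vas {
		if va.PersistentVolumeName != nil {
			idx[*va.PersistentVolumeName] = append(idx[*va.PersistentVolumeName], va)
		}
	}
	return idx
}

func main() {
	pv := "pvc-123"
	idx := indexByPV([]VolumeAttachment{{PersistentVolumeName: &pv, NodeName: "node-a"}})
	fmt.Println(len(idx["pvc-123"])) // one attachment references pvc-123
}
```

In the real codebase this kind of lookup could be expressed as a shared-informer indexer instead of a hand-rolled map, but the asymptotic saving is the same.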
/lgtm
/approve
[APPROVALNOTIFIER] This PR is APPROVED. This pull request has been approved by: jsafrane, RaunakShah. The full list of commands accepted by this bot can be found here. The pull request process is described here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing `/approve` in a comment.
I am seeing test failures in PMEM-CSI with the new external-provisioner 2.0.0 because of "still attached to node" in a situation where NodeUnpublishVolume and NodeUnstageVolume have completed successfully, i.e. the volume shouldn't be attached anymore. This goes on for 5 minutes until the test gives up and redeploys the driver, which involves restarting the external-provisioner. The new instance then does try to delete. I wonder whether the informer can hold an outdated VolumeAttachment object for over five minutes?
Is that problem truly gone? The PMEM-CSI tests are passing locally, but not in the CI, which implies that it depends a bit on timing.
for _, va := range vaList {
	if va.Spec.Source.PersistentVolumeName != nil && *va.Spec.Source.PersistentVolumeName == volume.Name {
		return fmt.Errorf("persistentvolume %s is still attached to node %s", volume.Name, va.Spec.NodeName)
Shouldn't this code be checking that `va.Status.Attached` is true? As it stands, it checks for "is there a VolumeAttachment object for the PV", which is a bit broader than "is the PV attached". I don't know whether that makes a difference.
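The two checks being contrasted here can be sketched side by side. This is a simplified illustration using hypothetical flattened types (the real fields are `va.Spec.Source.PersistentVolumeName` and `va.Status.Attached`), not the PR's actual code:

```go
package main

import "fmt"

// Hypothetical simplified stand-in for storagev1.VolumeAttachment.
type VolumeAttachment struct {
	PersistentVolumeName *string
	Attached             bool // mirrors va.Status.Attached
}

// hasAttachmentObject is the broader check the PR performs:
// any VolumeAttachment referencing the PV blocks deletion.
func hasAttachmentObject(vas []VolumeAttachment, pvName string) bool {
	for _, va := range vas {
		if va.PersistentVolumeName != nil && *va.PersistentVolumeName == pvName {
			return true
		}
	}
	return false
}

// isActuallyAttached is the narrower alternative raised in review:
// only attachments with Status.Attached == true block deletion.
func isActuallyAttached(vas []VolumeAttachment, pvName string) bool {
	for _, va := range vas {
		if va.PersistentVolumeName != nil && *va.PersistentVolumeName == pvName && va.Attached {
			return true
		}
	}
	return false
}

func main() {
	pv := "pvc-123"
	// A VolumeAttachment exists but the attach has not (yet) succeeded.
	vas := []VolumeAttachment{{PersistentVolumeName: &pv, Attached: false}}
	fmt.Println(hasAttachmentObject(vas, pv), isActuallyAttached(vas, pv)) // true false
}
```

The case where the two differ is exactly the one discussed below: a VolumeAttachment exists (perhaps because ControllerPublishVolume failed and must be retried or rolled back) while `Status.Attached` is still false.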
Removed that check based on @msau42's comment: #438 (comment).
IIRC the precedent was a comment in another PR that the existence of a VolumeAttachment was enough to fail the operation.
I think I understand now why: a ControllerPublishVolume call might have failed, in which case the call must be retried or ControllerUnpublishVolume needs to be called (according to the CSI spec). So it's correct to prevent DeleteVolume even when the volume is not really attached.
This change prevents calling DeleteVolume on the CSI plugin for volumes that are still attached to a kubernetes node.
What type of PR is this?
/kind bug
What this PR does / why we need it:
Currently, external-provisioner does not check whether a volume is attached to a node before trying to delete it.
This can happen when there is a race between Pod and PVC deletion (such as when deleting a namespace).
NOTE: PVC protection does not apply here because the Pod is deleted from Kubernetes, but the volume is still attached.
Based on discussion in kubernetes/kubernetes#84226, this change adds a volumeattachment lister to verify if a volume is still attached to a node before attempting to delete it.
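The shape of the check this PR adds can be sketched as follows. This is a self-contained illustration with hypothetical simplified types standing in for `storagev1.VolumeAttachment` and the lister, not the actual provisioner code:

```go
package main

import "fmt"

// Hypothetical simplified stand-in for storagev1.VolumeAttachment.
type VolumeAttachment struct {
	PersistentVolumeName *string
	NodeName             string
}

// checkDetached mirrors the guard added to Delete: if any VolumeAttachment
// still references the PV, refuse to proceed to the CSI DeleteVolume call.
func checkDetached(vaList []VolumeAttachment, pvName string) error {
	for _, va := range vaList {
		if va.PersistentVolumeName != nil && *va.PersistentVolumeName == pvName {
			return fmt.Errorf("persistentvolume %s is still attached to node %s", pvName, va.NodeName)
		}
	}
	return nil
}

func main() {
	pv := "pvc-abc"
	vas := []VolumeAttachment{{PersistentVolumeName: &pv, NodeName: "node-1"}}
	fmt.Println(checkDetached(vas, pv))         // still attached: error returned
	fmt.Println(checkDetached(vas, "other-pv")) // no attachment: nil
}
```

Because Delete is retried by the provisioning loop, returning an error here simply defers deletion until the VolumeAttachment object is gone.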
Testing:
Deleting a volume that is not attached to a node works as expected:
Deleting the namespace and creating a race between detach and delete:
Which issue(s) this PR fixes:
Fixes kubernetes/kubernetes#84226
Special notes for your reviewer:
Does this PR introduce a user-facing change?: