
PV stays hanging in released state #217

Closed
AcidAngel21 opened this issue May 6, 2020 · 23 comments
Labels
kind/bug Categorizes issue or PR as related to a bug.

Comments

@AcidAngel21

Is this a BUG REPORT or FEATURE REQUEST?:
/kind bug

What happened:
I deploy a StatefulSet with 3 replicas and 3 PVCs (via a StorageClass). When I delete the StatefulSet and immediately delete the PVCs, most of the PVs stay hanging in status Released.
When I wait a few seconds before deleting the PVCs, the problem does not occur.
The problem also does not occur with csi-driver 1.0.2.
In vCenter I constantly see the error "The operation is not allowed in the current state". It seems the driver tries to delete the storage object before it has been detached from the node.

A workaround to remove the hanging PVs is to remove the PV finalizers: kubectl patch pv pvc-*** -p '{"metadata":{"finalizers":null}}'
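The workaround can be scripted to clear the finalizers on every PV stuck in Released. This is only a sketch (it assumes kubectl is configured against the affected cluster, and it is guarded so it is a no-op where kubectl is unavailable); note that force-removing finalizers can leave the backing disk behind in vCenter, which then has to be cleaned up manually:

```shell
#!/bin/sh
# Sketch: clear finalizers on all PVs whose phase is "Released".
# WARNING: this bypasses the CSI driver's cleanup; the FCD disk may
# remain in vCenter and need manual deletion.
PATCH='{"metadata":{"finalizers":null}}'

if command -v kubectl >/dev/null 2>&1; then
  # Select Released PVs by phase rather than by parsing column output.
  for pv in $(kubectl get pv -o jsonpath='{range .items[?(@.status.phase=="Released")]}{.metadata.name}{"\n"}{end}'); do
    echo "Removing finalizers from $pv"
    kubectl patch pv "$pv" -p "$PATCH"
  done
fi
```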

What you expected to happen:
PVs do not hang in the Released status and are removed.

How to reproduce it (as minimally and precisely as possible):
Deploy a StatefulSet with 3 replicas and 3 PVCs (via a StorageClass). Delete the StatefulSet and immediately delete the PVCs.
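As a sketch, the deletion step looks like the following. The StatefulSet name "nginx-topology" and the volumeClaimTemplate name "www" are hypothetical, inferred from the PVC name in the syncer log ("www-nginx-topology-0"); adjust both for your own deployment:

```shell
#!/bin/sh
# Sketch of the failing deletion sequence (names are assumptions taken
# from the syncer log; guarded so this is a no-op without kubectl).
STS=nginx-topology
PVC_TEMPLATE=www

if command -v kubectl >/dev/null 2>&1; then
  # Delete the StatefulSet and its PVCs back to back, with no pause --
  # the PVs then end up stuck in Released.
  kubectl delete statefulset "$STS"
  kubectl delete pvc "$PVC_TEMPLATE-$STS-0" "$PVC_TEMPLATE-$STS-1" "$PVC_TEMPLATE-$STS-2"

  # Waiting a few seconds between the two deletes (giving the volumes
  # time to detach from the nodes) avoids the problem.
fi
```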

Anything else we need to know?:
csi-attacher logs

I0505 12:15:09.079852 1 csi_handler.go:428] Saving detach error to "csi-50820c7e61c23397a1ef7a282e237cae50eeebc6f416cf22fb04a951d21398e2"
I0505 12:15:09.089540 1 csi_handler.go:439] Saved detach error to "csi-50820c7e61c23397a1ef7a282e237cae50eeebc6f416cf22fb04a951d21398e2"
I0505 12:15:09.089591 1 csi_handler.go:99] Error processing "csi-50820c7e61c23397a1ef7a282e237cae50eeebc6f416cf22fb04a951d21398e2": failed to detach: rpc error: code = Aborted desc = pending
I0505 12:15:09.089942 1 controller.go:141] Ignoring VolumeAttachment "csi-50820c7e61c23397a1ef7a282e237cae50eeebc6f416cf22fb04a951d21398e2" change
I0505 12:15:14.026312 1 csi_handler.go:428] Saving detach error to "csi-d2da7a09f4310e2889cdd422ac8e6a8e3cb30f1eb93c1a35d023872acdeb1a9c"
I0505 12:15:14.033809 1 controller.go:141] Ignoring VolumeAttachment "csi-d2da7a09f4310e2889cdd422ac8e6a8e3cb30f1eb93c1a35d023872acdeb1a9c" change
I0505 12:15:14.035294 1 csi_handler.go:439] Saved detach error to "csi-d2da7a09f4310e2889cdd422ac8e6a8e3cb30f1eb93c1a35d023872acdeb1a9c"
I0505 12:15:14.035351 1 csi_handler.go:99] Error processing "csi-d2da7a09f4310e2889cdd422ac8e6a8e3cb30f1eb93c1a35d023872acdeb1a9c": failed to detach: rpc error: code = Aborted desc = pending

csi-controller logs

{"level":"error","time":"2020-05-05T12:15:06.865649379Z","caller":"volume/manager.go:433","msg":"failed to delete volume: \"8290b052-db75-4e06-a181-5ebcf4bdea4c\", fault: \"(*types.LocalizedMethodFault)(0xc000b3e2c0)({\\n DynamicData: (types.DynamicData) {\\n },\\n Fault: (types.CnsFault) {\\n BaseMethodFault: (types.BaseMethodFault) ,\\n Reason: (string) (len=63) \\\"CNS: Failed to delete disk:Fault cause: vim.fault.InvalidState\\\\n\\\"\\n },\\n LocalizedMessage: (string) (len=79) \\\"CnsFault error: CNS: Failed to delete disk:Fault cause: vim.fault.InvalidState\\\\n\\\"\\n})\\n\", opID: \"076c157f\"","TraceId":"910d62cb-2091-4e5f-ab36-d9e40f6b2349","stacktrace":"sigs.k8s.io/vsphere-csi-driver/pkg/common/cns-lib/volume.(*defaultManager).DeleteVolume\n\t/build/pkg/common/cns-lib/volume/manager.go:433\nsigs.k8s.io/vsphere-csi-driver/pkg/csi/service/common.DeleteVolumeUtil\n\t/build/pkg/csi/service/common/vsphereutil.go:349\nsigs.k8s.io/vsphere-csi-driver/pkg/csi/service/vanilla.(*controller).DeleteVolume\n\t/build/pkg/csi/service/vanilla/controller.go:449\ngithub.com/container-storage-interface/spec/lib/go/csi._Controller_DeleteVolume_Handler.func1\n\t/go/pkg/mod/github.com/container-storage-interface/[email protected]/lib/go/csi/csi.pb.go:5164\ngithub.com/rexray/gocsi/middleware/serialvolume.(*interceptor).deleteVolume\n\t/go/pkg/mod/github.com/rexray/[email protected]/middleware/serialvolume/serial_volume_locker.go:183\ngithub.com/rexray/gocsi/middleware/serialvolume.(*interceptor).handle\n\t/go/pkg/mod/github.com/rexray/[email protected]/middleware/serialvolume/serial_volume_locker.go:92\ngithub.com/rexray/gocsi/utils.ChainUnaryServer.func2.1.1\n\t/go/pkg/mod/github.com/rexray/[email protected]/utils/utils_middleware.go:99\ngithub.com/rexray/gocsi/middleware/specvalidator.(*interceptor).handleServer.func1\n\t/go/pkg/mod/github.com/rexray/[email 
protected]/middleware/specvalidator/spec_validator.go:178\ngithub.com/rexray/gocsi/middleware/specvalidator.(*interceptor).handle\n\t/go/pkg/mod/github.com/
rexray/[email protected]/middleware/specvalidator/spec_validator.go:218\ngithub.com/rexray/gocsi/middleware/specvalidator.(*interceptor).handleServer\n\t/go/pkg/mod/github.com/rexray/[email protected]/middleware/specvalidator/spec_validator.go:177\ngithub.com/rexray/gocsi/utils.ChainUnaryServer.func2.1.1\n\t/go/pkg/mod/github.com/rexray/[email protected]/utils/utils_middleware.go:99\ngithub.com/rexray/gocsi.(*StoragePlugin).injectContext\n\t/go/pkg/mod/github.com/rexray/[email protected]/middleware.go:231\ngithub.com/rexray/gocsi/utils.ChainUnaryServer.func2.1.1\n\t/go/pkg/mod/github.com/rexray/[email protected]/utils/utils_middleware.go:99\ngithub.com/rexray/gocsi/utils.ChainUnaryServer.func2\n\t/go/pkg/mod/github.com/rexray/[email protected]/utils/utils_middleware.go:106\ngithub.com/container-storage-interface/spec/lib/go/csi._Controller_DeleteVolume_Handler\n\t/go/pkg/mod/github.com/container-storage-interface/[email protected]/lib/go/csi/csi.pb.go:5166\ngoogle.golang.org/grpc.(*Server).processUnaryRPC\n\t/go/pkg/mod/google.golang.org/[email protected]/server.go:1024\ngoogle.golang.org/grpc.(*Server).handleStream\n\t/go/pkg/mod/google.golang.org/[email protected]/server.go:1313\ngoogle.golang.org/grpc.(*Server).serveStreams.func1.1\n\t/go/pkg/mod/google.golang.org/[email protected]/server.go:722"}
{"level":"error","time":"2020-05-05T12:15:06.865869018Z","caller":"common/vsphereutil.go:351","msg":"failed to delete disk 8290b052-db75-4e06-a181-5ebcf4bdea4c with error failed to delete volume: \"8290b052-db75-4e06-a181-5ebcf4bdea4c\", fault: \"(*types.LocalizedMethodFault)(0xc000b3e2c0)({\\n DynamicData: (types.DynamicData) {\\n },\\n Fault: (types.CnsFault) {\\n BaseMethodFault: (types.BaseMethodFault) ,\\n Reason: (string) (len=63) \\\"CNS: Failed to delete disk:Fault cause: vim.fault.InvalidState\\\\n\\\"\\n },\\n LocalizedMessage: (string) (len=79) \\\"CnsFault error: CNS: Failed to delete disk:Fault cause: vim.fault.InvalidState\\\\n\\\"\\n})\\n\", opID: \"076c157f\"","TraceId":"910d62cb-2091-4e5f-ab36-d9e40f6b2349","stacktrace":"sigs.k8s.io/vsphere-csi-driver/pkg/csi/service/common.DeleteVolumeUtil\n\t/build/pkg/csi/service/common/vsphereutil.go:351\nsigs.k8s.io/vsphere-csi-driver/pkg/csi/service/vanilla.(*controller).DeleteVolume\n\t/build/pkg/csi/service/vanilla/controller.go:449\ngithub.com/container-storage-interface/spec/lib/go/csi._Controller_DeleteVolume_Handler.func1\n\t/go/pkg/mod/github.com/container-storage-interface/[email protected]/lib/go/csi/csi.pb.go:5164\ngithub.com/rexray/gocsi/middleware/serialvolume.(*interceptor).deleteVolume\n\t/go/pkg/mod/github.com/rexray/[email protected]/middleware/serialvolume/serial_volume_locker.go:183\ngithub.com/rexray/gocsi/middleware/serialvolume.(*interceptor).handle\n\t/go/pkg/mod/github.com/rexray/[email protected]/middleware/serialvolume/serial_volume_locker.go:92\ngithub.com/rexray/gocsi/utils.ChainUnaryServer.func2.1.1\n\t/go/pkg/mod/github.com/rexray/[email protected]/utils/utils_middleware.go:99\ngithub.com/rexray/gocsi/middleware/specvalidator.(*interceptor).handleServer.func1\n\t/go/pkg/mod/github.com/rexray/[email 
protected]/middleware/specvalidator/spec_validator.go:178\ngithub.com/rexray/gocsi/middleware/specvalidator.(*interceptor).handle\n\t/go/pkg/mod/github.com/rexray/[email protected]/middleware/specvalidator/spec_validator.go:218
\ngithub.com/rexray/gocsi/middleware/specvalidator.(*interceptor).handleServer\n\t/go/pkg/mod/github.com/rexray/[email protected]/middleware/specvalidator/spec_validator.go:177\ngithub.com/rexray/gocsi/utils.ChainUnaryServer.func2.1.1\n\t/go/pkg/mod/github.com/rexray/[email protected]/utils/utils_middleware.go:99\ngithub.com/rexray/gocsi.(*StoragePlugin).injectContext\n\t/go/pkg/mod/github.com/rexray/[email protected]/middleware.go:231\ngithub.com/rexray/gocsi/utils.ChainUnaryServer.func2.1.1\n\t/go/pkg/mod/github.com/rexray/[email protected]/utils/utils_middleware.go:99\ngithub.com/rexray/gocsi/utils.ChainUnaryServer.func2\n\t/go/pkg/mod/github.com/rexray/[email protected]/utils/utils_middleware.go:106\ngithub.com/container-storage-interface/spec/lib/go/csi._Controller_DeleteVolume_Handler\n\t/go/pkg/mod/github.com/container-storage-interface/[email protected]/lib/go/csi/csi.pb.go:5166\ngoogle.golang.org/grpc.(*Server).processUnaryRPC\n\t/go/pkg/mod/google.golang.org/[email protected]/server.go:1024\ngoogle.golang.org/grpc.(*Server).handleStream\n\t/go/pkg/mod/google.golang.org/[email protected]/server.go:1313\ngoogle.golang.org/grpc.(*Server).serveStreams.func1.1\n\t/go/pkg/mod/google.golang.org/[email protected]/server.go:722"}
{"level":"error","time":"2020-05-05T12:15:06.865947275Z","caller":"vanilla/controller.go:452","msg":"failed to delete volume: \"8290b052-db75-4e06-a181-5ebcf4bdea4c\". Error: failed to delete volume: \"8290b052-db75-4e06-a181-5ebcf4bdea4c\", fault: \"(*types.LocalizedMethodFault)(0xc000b3e2c0)({\\n DynamicData: (types.DynamicData) {\\n },\\n Fault: (types.CnsFault) {\\n BaseMethodFault: (types.BaseMethodFault) ,\\n Reason: (string) (len=63) \\\"CNS: Failed to delete disk:Fault cause: vim.fault.InvalidState\\\\n\\\"\\n },\\n LocalizedMessage: (string) (len=79) \\\"CnsFault error: CNS: Failed to delete disk:Fault cause: vim.fault.InvalidState\\\\n\\\"\\n})\\n\", opID: \"076c157f\"","TraceId":"910d62cb-2091-4e5f-ab36-d9e40f6b2349","stacktrace":"sigs.k8s.io/vsphere-csi-driver/pkg/csi/service/vanilla.(*controller).DeleteVolume\n\t/build/pkg/csi/service/vanilla/controller.go:452\ngithub.com/container-storage-interface/spec/lib/go/csi._Controller_DeleteVolume_Handler.func1\n\t/go/pkg/mod/github.com/container-storage-interface/[email protected]/lib/go/csi/csi.pb.go:5164\ngithub.com/rexray/gocsi/middleware/serialvolume.(*interceptor).deleteVolume\n\t/go/pkg/mod/github.com/rexray/[email protected]/middleware/serialvolume/serial_volume_locker.go:183\ngithub.com/rexray/gocsi/middleware/serialvolume.(*interceptor).handle\n\t/go/pkg/mod/github.com/rexray/[email protected]/middleware/serialvolume/serial_volume_locker.go:92\ngithub.com/rexray/gocsi/utils.ChainUnaryServer.func2.1.1\n\t/go/pkg/mod/github.com/rexray/[email protected]/utils/utils_middleware.go:99\ngithub.com/rexray/gocsi/middleware/specvalidator.(*interceptor).handleServer.func1\n\t/go/pkg/mod/github.com/rexray/[email protected]/middleware/specvalidator/spec_validator.go:178\ngithub.com/rexray/gocsi/middleware/specvalidator.(*interceptor).handle\n\t/go/pkg/mod/github.com/rexray/[email 
protected]/middleware/specvalidator/spec_validator.go:218\ngithub.com/rexray/gocsi/middleware/specvalidator.(*interceptor).handleServer\n\t/go/pkg/mod/github.com/rexray/gocsi@v1.
2.1/middleware/specvalidator/spec_validator.go:177\ngithub.com/rexray/gocsi/utils.ChainUnaryServer.func2.1.1\n\t/go/pkg/mod/github.com/rexray/[email protected]/utils/utils_middleware.go:99\ngithub.com/rexray/gocsi.(*StoragePlugin).injectContext\n\t/go/pkg/mod/github.com/rexray/[email protected]/middleware.go:231\ngithub.com/rexray/gocsi/utils.ChainUnaryServer.func2.1.1\n\t/go/pkg/mod/github.com/rexray/[email protected]/utils/utils_middleware.go:99\ngithub.com/rexray/gocsi/utils.ChainUnaryServer.func2\n\t/go/pkg/mod/github.com/rexray/[email protected]/utils/utils_middleware.go:106\ngithub.com/container-storage-interface/spec/lib/go/csi._Controller_DeleteVolume_Handler\n\t/go/pkg/mod/github.com/container-storage-interface/[email protected]/lib/go/csi/csi.pb.go:5166\ngoogle.golang.org/grpc.(*Server).processUnaryRPC\n\t/go/pkg/mod/google.golang.org/[email protected]/server.go:1024\ngoogle.golang.org/grpc.(*Server).handleStream\n\t/go/pkg/mod/google.golang.org/[email protected]/server.go:1313\ngoogle.golang.org/grpc.(*Server).serveStreams.func1.1\n\t/go/pkg/mod/google.golang.org/[email protected]/server.go:722"}
{"level":"error","time":"2020-05-05T12:15:07.529234683Z","caller":"volume/manager.go:433","msg":"failed to delete volume: \"38f57ff1-aa92-40dc-8e8a-371624fce197\", fault: \"(*types.LocalizedMethodFault)(0xc000b3fea0)({\\n DynamicData: (types.DynamicData) {\\n },\\n Fault: (types.CnsFault) {\\n BaseMethodFault: (types.BaseMethodFault) ,\\n Reason: (string) (len=63) \\\"CNS: Failed to delete disk:Fault cause: vim.fault.InvalidState\\\\n\\\"\\n },\\n LocalizedMessage: (string) (len=79) \\\"CnsFault error: CNS: Failed to delete disk:Fault cause: vim.fault.InvalidState\\\\n\\\"\\n})\\n\", opID: \"076c1580\"","TraceId":"22936af7-e73c-43c2-839c-823d26f844f8","stacktrace":"sigs.k8s.io/vsphere-csi-driver/pkg/common/cns-lib/volume.(*defaultManager).DeleteVolume\n\t/build/pkg/common/cns-lib/volume/manager.go:433\nsigs.k8s.io/vsphere-csi-driver/pkg/csi/service/common.DeleteVolumeUtil\n\t/build/pkg/csi/service/common/vsphereutil.go:349\nsigs.k8s.io/vsphere-csi-driver/pkg/csi/service/vanilla.(*controller).DeleteVolume\n\t/build/pkg/csi/service/vanilla/controller.go:449\ngithub.com/container-storage-interface/spec/lib/go/csi._Controller_DeleteVolume_Handler.func1\n\t/go/pkg/mod/github.com/container-storage-interface/[email protected]/lib/go/csi/csi.pb.go:5164\ngithub.com/rexray/gocsi/middleware/serialvolume.(*interceptor).deleteVolume\n\t/go/pkg/mod/github.com/rexray/[email protected]/middleware/serialvolume/serial_volume_locker.go:183\ngithub.com/rexray/gocsi/middleware/serialvolume.(*interceptor).handle\n\t/go/pkg/mod/github.com/rexray/[email protected]/middleware/serialvolume/serial_volume_locker.go:92\ngithub.com/rexray/gocsi/utils.ChainUnaryServer.func2.1.1\n\t/go/pkg/mod/github.com/rexray/[email protected]/utils/utils_middleware.go:99\ngithub.com/rexray/gocsi/middleware/specvalidator.(*interceptor).handleServer.func1\n\t/go/pkg/mod/github.com/rexray/[email 
protected]/middleware/specvalidator/spec_validator.go:178\ngithub.com/rexray/gocsi/middleware/specvalidator.(*interceptor).handle\n\t/go/pkg/mod/github.com/
rexray/[email protected]/middleware/specvalidator/spec_validator.go:218\ngithub.com/rexray/gocsi/middleware/specvalidator.(*interceptor).handleServer\n\t/go/pkg/mod/github.com/rexray/[email protected]/middleware/specvalidator/spec_validator.go:177\ngithub.com/rexray/gocsi/utils.ChainUnaryServer.func2.1.1\n\t/go/pkg/mod/github.com/rexray/[email protected]/utils/utils_middleware.go:99\ngithub.com/rexray/gocsi.(*StoragePlugin).injectContext\n\t/go/pkg/mod/github.com/rexray/[email protected]/middleware.go:231\ngithub.com/rexray/gocsi/utils.ChainUnaryServer.func2.1.1\n\t/go/pkg/mod/github.com/rexray/[email protected]/utils/utils_middleware.go:99\ngithub.com/rexray/gocsi/utils.ChainUnaryServer.func2\n\t/go/pkg/mod/github.com/rexray/[email protected]/utils/utils_middleware.go:106\ngithub.com/container-storage-interface/spec/lib/go/csi._Controller_DeleteVolume_Handler\n\t/go/pkg/mod/github.com/container-storage-interface/[email protected]/lib/go/csi/csi.pb.go:5166\ngoogle.golang.org/grpc.(*Server).processUnaryRPC\n\t/go/pkg/mod/google.golang.org/[email protected]/server.go:1024\ngoogle.golang.org/grpc.(*Server).handleStream\n\t/go/pkg/mod/google.golang.org/[email protected]/server.go:1313\ngoogle.golang.org/grpc.(*Server).serveStreams.func1.1\n\t/go/pkg/mod/google.golang.org/[email protected]/server.go:722"}
{"level":"error","time":"2020-05-05T12:15:07.529710938Z","caller":"common/vsphereutil.go:351","msg":"failed to delete disk 38f57ff1-aa92-40dc-8e8a-371624fce197 with error failed to delete volume: \"38f57ff1-aa92-40dc-8e8a-371624fce197\", fault: \"(*types.LocalizedMethodFault)(0xc000b3fea0)({\\n DynamicData: (types.DynamicData) {\\n },\\n Fault: (types.CnsFault) {\\n BaseMethodFault: (types.BaseMethodFault) ,\\n Reason: (string) (len=63) \\\"CNS: Failed to delete disk:Fault cause: vim.fault.InvalidState\\\\n\\\"\\n },\\n LocalizedMessage: (string) (len=79) \\\"CnsFault error: CNS: Failed to delete disk:Fault cause: vim.fault.InvalidState\\\\n\\\"\\n})\\n\", opID: \"076c1580\"","TraceId":"22936af7-e73c-43c2-839c-823d26f844f8","stacktrace":"sigs.k8s.io/vsphere-csi-driver/pkg/csi/service/common.DeleteVolumeUtil\n\t/build/pkg/csi/service/common/vsphereutil.go:351\nsigs.k8s.io/vsphere-csi-driver/pkg/csi/service/vanilla.(*controller).DeleteVolume\n\t/build/pkg/csi/service/vanilla/controller.go:449\ngithub.com/container-storage-interface/spec/lib/go/csi._Controller_DeleteVolume_Handler.func1\n\t/go/pkg/mod/github.com/container-storage-interface/[email protected]/lib/go/csi/csi.pb.go:5164\ngithub.com/rexray/gocsi/middleware/serialvolume.(*interceptor).deleteVolume\n\t/go/pkg/mod/github.com/rexray/[email protected]/middleware/serialvolume/serial_volume_locker.go:183\ngithub.com/rexray/gocsi/middleware/serialvolume.(*interceptor).handle\n\t/go/pkg/mod/github.com/rexray/[email protected]/middleware/serialvolume/serial_volume_locker.go:92\ngithub.com/rexray/gocsi/utils.ChainUnaryServer.func2.1.1\n\t/go/pkg/mod/github.com/rexray/[email protected]/utils/utils_middleware.go:99\ngithub.com/rexray/gocsi/middleware/specvalidator.(*interceptor).handleServer.func1\n\t/go/pkg/mod/github.com/rexray/[email 
protected]/middleware/specvalidator/spec_validator.go:178\ngithub.com/rexray/gocsi/middleware/specvalidator.(*interceptor).handle\n\t/go/pkg/mod/github.com/rexray/[email protected]/middleware/specvalidator/spec_validator.go:218
\ngithub.com/rexray/gocsi/middleware/specvalidator.(*interceptor).handleServer\n\t/go/pkg/mod/github.com/rexray/[email protected]/middleware/specvalidator/spec_validator.go:177\ngithub.com/rexray/gocsi/utils.ChainUnaryServer.func2.1.1\n\t/go/pkg/mod/github.com/rexray/[email protected]/utils/utils_middleware.go:99\ngithub.com/rexray/gocsi.(*StoragePlugin).injectContext\n\t/go/pkg/mod/github.com/rexray/[email protected]/middleware.go:231\ngithub.com/rexray/gocsi/utils.ChainUnaryServer.func2.1.1\n\t/go/pkg/mod/github.com/rexray/[email protected]/utils/utils_middleware.go:99\ngithub.com/rexray/gocsi/utils.ChainUnaryServer.func2\n\t/go/pkg/mod/github.com/rexray/[email protected]/utils/utils_middleware.go:106\ngithub.com/container-storage-interface/spec/lib/go/csi._Controller_DeleteVolume_Handler\n\t/go/pkg/mod/github.com/container-storage-interface/[email protected]/lib/go/csi/csi.pb.go:5166\ngoogle.golang.org/grpc.(*Server).processUnaryRPC\n\t/go/pkg/mod/google.golang.org/[email protected]/server.go:1024\ngoogle.golang.org/grpc.(*Server).handleStream\n\t/go/pkg/mod/google.golang.org/[email protected]/server.go:1313\ngoogle.golang.org/grpc.(*Server).serveStreams.func1.1\n\t/go/pkg/mod/google.golang.org/[email protected]/server.go:722"}
{"level":"error","time":"2020-05-05T12:15:07.529904148Z","caller":"vanilla/controller.go:452","msg":"failed to delete volume: \"38f57ff1-aa92-40dc-8e8a-371624fce197\". Error: failed to delete volume: \"38f57ff1-aa92-40dc-8e8a-371624fce197\", fault: \"(*types.LocalizedMethodFault)(0xc000b3fea0)({\\n DynamicData: (types.DynamicData) {\\n },\\n Fault: (types.CnsFault) {\\n BaseMethodFault: (types.BaseMethodFault) ,\\n Reason: (string) (len=63) \\\"CNS: Failed to delete disk:Fault cause: vim.fault.InvalidState\\\\n\\\"\\n },\\n LocalizedMessage: (string) (len=79) \\\"CnsFault error: CNS: Failed to delete disk:Fault cause: vim.fault.InvalidState\\\\n\\\"\\n})\\n\", opID: \"076c1580\"","TraceId":"22936af7-e73c-43c2-839c-823d26f844f8","stacktrace":"sigs.k8s.io/vsphere-csi-driver/pkg/csi/service/vanilla.(*controller).DeleteVolume\n\t/build/pkg/csi/service/vanilla/controller.go:452\ngithub.com/container-storage-interface/spec/lib/go/csi._Controller_DeleteVolume_Handler.func1\n\t/go/pkg/mod/github.com/container-storage-interface/[email protected]/lib/go/csi/csi.pb.go:5164\ngithub.com/rexray/gocsi/middleware/serialvolume.(*interceptor).deleteVolume\n\t/go/pkg/mod/github.com/rexray/[email protected]/middleware/serialvolume/serial_volume_locker.go:183\ngithub.com/rexray/gocsi/middleware/serialvolume.(*interceptor).handle\n\t/go/pkg/mod/github.com/rexray/[email protected]/middleware/serialvolume/serial_volume_locker.go:92\ngithub.com/rexray/gocsi/utils.ChainUnaryServer.func2.1.1\n\t/go/pkg/mod/github.com/rexray/[email protected]/utils/utils_middleware.go:99\ngithub.com/rexray/gocsi/middleware/specvalidator.(*interceptor).handleServer.func1\n\t/go/pkg/mod/github.com/rexray/[email protected]/middleware/specvalidator/spec_validator.go:178\ngithub.com/rexray/gocsi/middleware/specvalidator.(*interceptor).handle\n\t/go/pkg/mod/github.com/rexray/[email 
protected]/middleware/specvalidator/spec_validator.go:218\ngithub.com/rexray/gocsi/middleware/specvalidator.(*interceptor).handleServer\n\t/go/pkg/mod/github.com/rexray/gocsi@v1
.2.1/middleware/specvalidator/spec_validator.go:177\ngithub.com/rexray/gocsi/utils.ChainUnaryServer.func2.1.1\n\t/go/pkg/mod/github.com/rexray/[email protected]/utils/utils_middleware.go:99\ngithub.com/rexray/gocsi.(*StoragePlugin).injectContext\n\t/go/pkg/mod/github.com/rexray/[email protected]/middleware.go:231\ngithub.com/rexray/gocsi/utils.ChainUnaryServer.func2.1.1\n\t/go/pkg/mod/github.com/rexray/[email protected]/utils/utils_middleware.go:99\ngithub.com/rexray/gocsi/utils.ChainUnaryServer.func2\n\t/go/pkg/mod/github.com/rexray/[email protected]/utils/utils_middleware.go:106\ngithub.com/container-storage-interface/spec/lib/go/csi._Controller_DeleteVolume_Handler\n\t/go/pkg/mod/github.com/container-storage-interface/[email protected]/lib/go/csi/csi.pb.go:5166\ngoogle.golang.org/grpc.(*Server).processUnaryRPC\n\t/go/pkg/mod/google.golang.org/[email protected]/server.go:1024\ngoogle.golang.org/grpc.(*Server).handleStream\n\t/go/pkg/mod/google.golang.org/[email protected]/server.go:1313\ngoogle.golang.org/grpc.(*Server).serveStreams.func1.1\n\t/go/pkg/mod/google.golang.org/[email protected]/server.go:722"}

vsphere-syncer logs

{"level":"error","time":"2020-05-05T12:15:07.274001589Z","caller":"syncer/util.go:43","msg":"Error getting Persistent Volume Claim for volume www with err: persistentvolumeclaim \"www-nginx-topology-0\" not found","TraceId":"5afee8cc-c778-40ab-84fb-6701a1969341","stacktrace":"sigs.k8s.io/vsphere-csi-driver/pkg/syncer.IsValidVolume\n\t/build/pkg/syncer/util.go:43\nsigs.k8s.io/vsphere-csi-driver/pkg/syncer.csiUpdatePod\n\t/build/pkg/syncer/metadatasyncer.go:772\nsigs.k8s.io/vsphere-csi-driver/pkg/syncer.updatePodMetadata\n\t/build/pkg/syncer/metadatasyncer.go:511\nsigs.k8s.io/vsphere-csi-driver/pkg/syncer.podDeleted\n\t/build/pkg/syncer/metadatasyncer.go:503\nsigs.k8s.io/vsphere-csi-driver/pkg/syncer.InitMetadataSyncer.func7\n\t/build/pkg/syncer/metadatasyncer.go:191\nk8s.io/client-go/tools/cache.ResourceEventHandlerFuncs.OnDelete\n\t/go/pkg/mod/k8s.io/[email protected]+incompatible/tools/cache/controller.go:209\nk8s.io/client-go/tools/cache.(*processorListener).run.func1.1\n\t/go/pkg/mod/k8s.io/[email protected]+incompatible/tools/cache/shared_informer.go:556\nk8s.io/apimachinery/pkg/util/wait.ExponentialBackoff\n\t/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:265\nk8s.io/client-go/tools/cache.(*processorListener).run.func1\n\t/go/pkg/mod/k8s.io/[email protected]+incompatible/tools/cache/shared_informer.go:548\nk8s.io/apimachinery/pkg/util/wait.JitterUntil.func1\n\t/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:152\nk8s.io/apimachinery/pkg/util/wait.JitterUntil\n\t/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:153\nk8s.io/apimachinery/pkg/util/wait.Until\n\t/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:88\nk8s.io/client-go/tools/cache.(*processorListener).run\n\t/go/pkg/mod/k8s.io/[email protected]+
incompatible/tools/cache/shared_informer.go:546\nk8s.io/apimachinery/pkg/util/wait.(*Group).Start.func1\n\t/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:71"}

csi-provisioner logs

E0505 12:15:06.866375       1 controller.go:1334] delete "pvc-70aa8526-bca6-4f6d-835e-1073eaddb01f": volume deletion failed: rpc error: code = Internal desc = failed to delete volume: "8290b052-db75-4e06-a181-5ebcf4bdea4c". Error: failed to delete volume: "8290b052-db75-4e06-a181-5ebcf4bdea4c", fault: "(*types.LocalizedMethodFault)(0xc000b3e2c0)({\n DynamicData: (types.DynamicData) {\n },\n Fault: (types.CnsFault) {\n  BaseMethodFault: (types.BaseMethodFault) ,\n  Reason: (string) (len=63) \"CNS: Failed to delete disk:Fault cause: vim.fault.InvalidState\\n\"\n },\n LocalizedMessage: (string) (len=79) \"CnsFault error: CNS: Failed to delete disk:Fault cause: vim.fault.InvalidState\\n\"\n})\n", opID: "076c157f"
E0505 12:15:07.530972       1 controller.go:1334] delete "pvc-f2ed636e-3190-4368-8c10-d7256b5700aa": volume deletion failed: rpc error: code = Internal desc = failed to delete volume: "38f57ff1-aa92-40dc-8e8a-371624fce197". Error: failed to delete volume: "38f57ff1-aa92-40dc-8e8a-371624fce197", fault: "(*types.LocalizedMethodFault)(0xc000b3fea0)({\n DynamicData: (types.DynamicData) {\n },\n Fault: (types.CnsFault) {\n  BaseMethodFault: (types.BaseMethodFault) ,\n  Reason: (string) (len=63) \"CNS: Failed to delete disk:Fault cause: vim.fault.InvalidState\\n\"\n },\n LocalizedMessage: (string) (len=79) \"CnsFault error: CNS: Failed to delete disk:Fault cause: vim.fault.InvalidState\\n\"\n})\n", opID: "076c1580"
W0505 12:15:07.531196       1 controller.go:936] Retrying syncing volume "pvc-f2ed636e-3190-4368-8c10-d7256b5700aa", failure 0
E0505 12:15:07.531316       1 controller.go:954] error syncing volume "pvc-f2ed636e-3190-4368-8c10-d7256b5700aa": rpc error: code = Internal desc = failed to delete volume: "38f57ff1-aa92-40dc-8e8a-371624fce197". Error: failed to delete volume: "38f57ff1-aa92-40dc-8e8a-371624fce197", fault: "(*types.LocalizedMethodFault)(0xc000b3fea0)({\n DynamicData: (types.DynamicData) {\n },\n Fault: (types.CnsFault) {\n  BaseMethodFault: (types.BaseMethodFault) ,\n  Reason: (string) (len=63) \"CNS: Failed to delete disk:Fault cause: vim.fault.InvalidState\\n\"\n },\n LocalizedMessage: (string) (len=79) \"CnsFault error: CNS: Failed to delete disk:Fault cause: vim.fault.InvalidState\\n\"\n})\n", opID: "076c1580"
I0505 12:15:07.531433       1 event.go:255] Event(v1.ObjectReference{Kind:"PersistentVolume", Namespace:"", Name:"pvc-f2ed636e-3190-4368-8c10-d7256b5700aa", UID:"73287ac1-5db7-4551-a3a4-37fdcb6498b1", APIVersion:"v1", ResourceVersion:"61687007", FieldPath:""}): type: 'Warning' reason: 'VolumeFailedDelete' rpc error: code = Internal desc = failed to delete volume: "38f57ff1-aa92-40dc-8e8a-371624fce197". Error: failed to delete volume: "38f57ff1-aa92-40dc-8e8a-371624fce197", fault: "(*types.LocalizedMethodFault)(0xc000b3fea0)({\n DynamicData: (types.DynamicData) {\n },\n Fault: (types.CnsFault) {\n  BaseMethodFault: (types.BaseMethodFault) ,\n  Reason: (string) (len=63) \"CNS: Failed to delete disk:Fault cause: vim.fault.InvalidState\\n\"\n },\n LocalizedMessage: (string) (len=79) \"CnsFault error: CNS: Failed to delete disk:Fault cause: vim.fault.InvalidState\\n\"\n})\n", opID: "076c1580"

Environment:

@k8s-ci-robot k8s-ci-robot added the kind/bug Categorizes issue or PR as related to a bug. label May 6, 2020
@AcidAngel21 AcidAngel21 changed the title PV is hanging in released state PV stays hanging in released state May 6, 2020
@brentwolfram

I am having a similar issue after moving from version 1.0.0 of the CSI driver to version 2.0.0. I can create PVs, but most of the time I cannot delete them (it works about 20% of the time). They stay in the Released state.

Logs:

csi-attacher:

I0506 12:16:26.400501 1 controller.go:175] Started VA processing "csi-7d8f5cbf2620398933db4179f14efa4bdbcd923ee15a1f41aae0e0f34bacc96e"
I0506 12:16:26.400557 1 csi_handler.go:89] CSIHandler: processing VA "csi-7d8f5cbf2620398933db4179f14efa4bdbcd923ee15a1f41aae0e0f34bacc96e"
I0506 12:16:26.400572 1 csi_handler.go:140] Starting detach operation for "csi-7d8f5cbf2620398933db4179f14efa4bdbcd923ee15a1f41aae0e0f34bacc96e"
I0506 12:16:26.400669 1 csi_handler.go:147] Detaching "csi-7d8f5cbf2620398933db4179f14efa4bdbcd923ee15a1f41aae0e0f34bacc96e"
I0506 12:16:26.400704 1 csi_handler.go:542] Found NodeID wuatk8sworker0 in CSINode wuatk8sworker0
I0506 12:16:26.470613 1 csi_handler.go:428] Saving detach error to "csi-7d8f5cbf2620398933db4179f14efa4bdbcd923ee15a1f41aae0e0f34bacc96e"
I0506 12:16:26.479926 1 controller.go:141] Ignoring VolumeAttachment "csi-7d8f5cbf2620398933db4179f14efa4bdbcd923ee15a1f41aae0e0f34bacc96e" change
I0506 12:16:26.480359 1 csi_handler.go:439] Saved detach error to "csi-7d8f5cbf2620398933db4179f14efa4bdbcd923ee15a1f41aae0e0f34bacc96e"
I0506 12:16:26.480399 1 csi_handler.go:99] Error processing "csi-7d8f5cbf2620398933db4179f14efa4bdbcd923ee15a1f41aae0e0f34bacc96e": failed to detach: rpc error: code = Internal desc = volumeID "276ae09e-96a0-4236-a053-7dbea3997318" not found in QueryVolume

csi-controller:

{"level":"error","time":"2020-05-06T12:16:32.724480563Z","caller":"common/vsphereutil.go:351","msg":"failed to delete disk 276ae09e-96a0-4236-a053-7dbea3997318 with error failed to delete volume: "276ae09e-96a0-4236-a053-7dbea3997318", fault: "(*types.LocalizedMethodFault)(0xc000614a80)({\n DynamicData: (types.DynamicData) {\n },\n Fault: (types.CnsFault) {\n BaseMethodFault: (types.BaseMethodFault) ,\n Reason: (string) (len=63) \"CNS: Failed to delete disk:Fault cause: vim.fault.InvalidState\\n\"\n },\n LocalizedMessage: (string) (len=79) \"CnsFault error: CNS: Failed to delete disk:Fault cause: vim.fault.InvalidState\\n\"\n})\n", opID: "0655df75"","TraceId":"e559b2ef-d09f-4e42-a0da-075412f4233d","stacktrace":"sigs.k8s.io/vsphere-csi-driver/pkg/csi/service/common.DeleteVolumeUtil\n\t/build/pkg/csi/service/common/vsphereutil.go:351\nsigs.k8s.io/vsphere-csi-driver/pkg/csi/service/vanilla.(*controller).DeleteVolume\n\t/build/pkg/csi/service/vanilla/controller.go:449\ngithub.com/container-storage-interface/spec/lib/go/csi._Controller_DeleteVolume_Handler.func1\n\t/go/pkg/mod/github.com/container-storage-interface/[email protected]/lib/go/csi/csi.pb.go:5164\ngithub.com/rexray/gocsi/middleware/serialvolume.(*interceptor).deleteVolume\n\t/go/pkg/mod/github.com/rexray/[email protected]/middleware/serialvolume/serial_volume_locker.go:183\ngithub.com/rexray/gocsi/middleware/serialvolume.(*interceptor).handle\n\t/go/pkg/mod/github.com/rexray/[email protected]/middleware/serialvolume/serial_volume_locker.go:92\ngithub.com/rexray/gocsi/utils.ChainUnaryServer.func2.1.1\n\t/go/pkg/mod/github.com/rexray/[email protected]/utils/utils_middleware.go:99\ngithub.com/rexray/gocsi/middleware/specvalidator.(*interceptor).handleServer.func1\n\t/go/pkg/mod/github.com/rexray/[email protected]/middleware/specvalidator/spec_validator.go:178\ngithub.com/rexray/gocsi/middleware/specvalidator.(*interceptor).handle\n\t/go/pkg/mod/github.com/rexray/[email 
protected]/middleware/specvalidator/spec_validator.go:218
\ngithub.com/rexray/gocsi/middleware/specvalidator.(*interceptor).handleServer\n\t/go/pkg/mod/github.com/rexray/[email protected]/middleware/specvalidator/spec_validator.go:177\ngithub.com/rexray/gocsi/utils.ChainUnaryServer.func2.1.1\n\t/go/pkg/mod/github.com/rexray/[email protected]/utils/utils_middleware.go:99\ngithub.com/rexray/gocsi.(*StoragePlugin).injectContext\n\t/go/pkg/mod/github.com/rexray/[email protected]/middleware.go:231\ngithub.com/rexray/gocsi/utils.ChainUnaryServer.func2.1.1\n\t/go/pkg/mod/github.com/rexray/[email protected]/utils/utils_middleware.go:99\ngithub.com/rexray/gocsi/utils.ChainUnaryServer.func2\n\t/go/pkg/mod/github.com/rexray/[email protected]/utils/utils_middleware.go:106\ngithub.com/container-storage-interface/spec/lib/go/csi._Controller_DeleteVolume_Handler\n\t/go/pkg/mod/github.com/container-storage-interface/[email protected]/lib/go/csi/csi.pb.go:5166\ngoogle.golang.org/grpc.(*Server).processUnaryRPC\n\t/go/pkg/mod/google.golang.org/[email protected]/server.go:1024\ngoogle.golang.org/grpc.(*Server).handleStream\n\t/go/pkg/mod/google.golang.org/[email protected]/server.go:1313\ngoogle.golang.org/grpc.(*Server).serveStreams.func1.1\n\t/go/pkg/mod/google.golang.org/[email protected]/server.go:722"}

{"level":"error","time":"2020-05-06T12:16:32.724652761Z","caller":"vanilla/controller.go:452","msg":"failed to delete volume: "276ae09e-96a0-4236-a053-7dbea3997318". Error: failed to delete volume: "276ae09e-96a0-4236-a053-7dbea3997318", fault: "(*types.LocalizedMethodFault)(0xc000614a80)({\n DynamicData: (types.DynamicData) {\n },\n Fault: (types.CnsFault) {\n BaseMethodFault: (types.BaseMethodFault) ,\n Reason: (string) (len=63) \"CNS: Failed to delete disk:Fault cause: vim.fault.InvalidState\\n\"\n },\n LocalizedMessage: (string) (len=79) \"CnsFault error: CNS: Failed to delete disk:Fault cause: vim.fault.InvalidState\\n\"\n})\n", opID: "0655df75"","TraceId":"e559b2ef-d09f-4e42-a0da-075412f4233d","stacktrace":"sigs.k8s.io/vsphere-csi-driver/pkg/csi/service/vanilla.(*controller).DeleteVolume\n\t/build/pkg/csi/service/vanilla/controller.go:452\ngithub.com/container-storage-interface/spec/lib/go/csi._Controller_DeleteVolume_Handler.func1\n\t/go/pkg/mod/github.com/container-storage-interface/[email protected]/lib/go/csi/csi.pb.go:5164\ngithub.com/rexray/gocsi/middleware/serialvolume.(*interceptor).deleteVolume\n\t/go/pkg/mod/github.com/rexray/[email protected]/middleware/serialvolume/serial_volume_locker.go:183\ngithub.com/rexray/gocsi/middleware/serialvolume.(*interceptor).handle\n\t/go/pkg/mod/github.com/rexray/[email protected]/middleware/serialvolume/serial_volume_locker.go:92\ngithub.com/rexray/gocsi/utils.ChainUnaryServer.func2.1.1\n\t/go/pkg/mod/github.com/rexray/[email protected]/utils/utils_middleware.go:99\ngithub.com/rexray/gocsi/middleware/specvalidator.(*interceptor).handleServer.func1\n\t/go/pkg/mod/github.com/rexray/[email protected]/middleware/specvalidator/spec_validator.go:178\ngithub.com/rexray/gocsi/middleware/specvalidator.(*interceptor).handle\n\t/go/pkg/mod/github.com/rexray/[email 
protected]/middleware/specvalidator/spec_validator.go:218\ngithub.com/rexray/gocsi/middleware/specvalidator.(*interceptor).handleServer\n\t/go/pkg/mod/github.com/rexray/gocsi@v1
.2.1/middleware/specvalidator/spec_validator.go:177\ngithub.com/rexray/gocsi/utils.ChainUnaryServer.func2.1.1\n\t/go/pkg/mod/github.com/rexray/[email protected]/utils/utils_middleware.go:99\ngithub.com/rexray/gocsi.(*StoragePlugin).injectContext\n\t/go/pkg/mod/github.com/rexray/[email protected]/middleware.go:231\ngithub.com/rexray/gocsi/utils.ChainUnaryServer.func2.1.1\n\t/go/pkg/mod/github.com/rexray/[email protected]/utils/utils_middleware.go:99\ngithub.com/rexray/gocsi/utils.ChainUnaryServer.func2\n\t/go/pkg/mod/github.com/rexray/[email protected]/utils/utils_middleware.go:106\ngithub.com/container-storage-interface/spec/lib/go/csi._Controller_DeleteVolume_Handler\n\t/go/pkg/mod/github.com/container-storage-interface/[email protected]/lib/go/csi/csi.pb.go:5166\ngoogle.golang.org/grpc.(*Server).processUnaryRPC\n\t/go/pkg/mod/google.golang.org/[email protected]/server.go:1024\ngoogle.golang.org/grpc.(*Server).handleStream\n\t/go/pkg/mod/google.golang.org/[email protected]/server.go:1313\ngoogle.golang.org/grpc.(*Server).serveStreams.func1.1\n\t/go/pkg/mod/google.golang.org/[email protected]/server.go:722"}

In vCenter I get these two events repeating after I try to delete the volume:

Delete container volume (Completed)
Delete a virtual storage object (Failed - The operation is not allowed in the current state)

(Even with version 1.0.2 of the driver I sometimes see the above message, but the PV is eventually removed and the datastore is cleaned up.)

To rule out a permissions issue, I tried using credentials with global admin rights, but the same error occurs.

After reverting to version 1.0.0 or 1.0.2 of the driver (with the proper restrictive permissions), I can add and remove volumes consistently.

Environment:
csi-vsphere version: 2.0.0
vsphere-cloud-controller-manager version: gcr.io/cloud-provider-vsphere/cpi/release/manager:latest
Kubernetes version: v1.15.6
vSphere version: 6.7U3
OS (e.g. from /etc/os-release): Ubuntu 18.04.4 LTS (Bionic Beaver)
Kernel (e.g. uname -a): 4.15.0-99-generic
Install tools: terraform/rancher2 provider

@larhauga (Contributor) commented May 6, 2020

I can confirm that we are experiencing the same issue.
We managed to reproduce the issue by creating a PVC and a pod that uses the claim.
By deleting the PVC first, and then the pod, the PV often gets stuck in the Released state.
Both the PV and the volumeattachment are still there, waiting for finalizers.
The disks are deleted in vSphere, and the volume is detached from the node.

yaml to reproduce:
pvc:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: vsphere-pvc
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi

pod:

apiVersion: v1
kind: Pod
metadata:
  name: pod
spec:
  volumes:
    - name: task-pv-storage
      persistentVolumeClaim:
        claimName: vsphere-pvc
  containers:
    - name: task-pv-container
      image: nginx
      ports:
        - containerPort: 80
          name: "http-server"
      volumeMounts:
        - mountPath: "/usr/share/nginx/html"
          name: task-pv-storage

kubectl delete pvc vsphere-pvc
kubectl delete pod pod

Environment
csi-vsphere version: v2.0.0-rc1
vsphere-cloud-controller-manager version: 1.1.0
Kubernetes version: 1.16.2
vSphere version: 6.7u3
OS (e.g. from /etc/os-release): Red Hat Enterprise Linux CoreOS 43.81.202003310153.0 (Ootpa)
Kernel (e.g. uname -a): 4.18.0-147.5.1.el8_1.x86_64
Install tools:
Others:

csi deployment images:
quay.io/k8scsi/csi-attacher:v2.0.0
gcr.io/cloud-provider-vsphere/csi/release/driver:v2.0.0-rc.1
quay.io/k8scsi/livenessprobe:v1.1.0
gcr.io/cloud-provider-vsphere/csi/release/syncer:v2.0.0-rc.1
quay.io/k8scsi/csi-provisioner:v1.6.0

node daemonset images:
quay.io/k8scsi/csi-node-driver-registrar:v1.2.0
gcr.io/cloud-provider-vsphere/csi/release/driver:v2.0.0-rc.1
quay.io/k8scsi/livenessprobe:v1.1.0

@divyenpatel (Member)

Are you hitting issue 5 mentioned in the documentation?
https://vsphere-csi-driver.sigs.k8s.io/known_issues.html#issue_5

@AcidAngel21 (Author)

It sounds like it. But why does it work with CSI driver 1.0.2?

@AcidAngel21 (Author) commented May 7, 2020

I observed the following behaviour in vCenter.
CSI driver 1.0.2: deletion fails repeatedly while the volume is still attached, and once the volume is detached the deletion succeeds.
CSI driver 2.0.0: deletion fails repeatedly while the volume is still attached, but there is no further attempt to detach the volume once this happens.

@alex1989hu (Contributor)

Our cluster is also affected with v2.0.0 😞

@divyenpatel (Member)

Delete Volume is called before the Detach Volume operation. Delete Volume un-tags the volume as a Container Volume, later observes that the volume is still attached to the node VM, and does not tag it back as a Container Volume. When Detach Volume then runs, it queries the volume to determine whether it is file or block, and since the volume is no longer tagged as a container volume, it does not attempt to detach it from the node VM.

In v1.0.2 you observe detach attempts happening because that version does not query the volume to determine whether it is block or file.
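
[Editor's note] The interleaving described above can be modeled as a tiny state machine. This is an illustrative sketch only, not driver code: the `volume` struct and the `deleteVolume`/`detachVolume` functions are invented names that mimic the described behavior.

```go
package main

import "fmt"

// volume models the two pieces of CNS state relevant to the race:
// whether the disk is tagged as a Container Volume, and whether it
// is attached to a node VM.
type volume struct {
	containerVolume bool
	attached        bool
}

// deleteVolume mimics the described v2.0.0 behavior: it un-tags the
// volume, then notices the disk is still attached and gives up
// without restoring the Container Volume tag.
func deleteVolume(v *volume) string {
	v.containerVolume = false
	if v.attached {
		return "InvalidState: disk still attached"
	}
	return "deleted"
}

// detachVolume mimics the described v2.0.0 detach path: it queries
// CNS to classify the volume as block or file and silently skips
// anything no longer tagged as a Container Volume.
func detachVolume(v *volume) string {
	if !v.containerVolume {
		return "skipped: not a container volume"
	}
	v.attached = false
	return "detached"
}

func main() {
	v := &volume{containerVolume: true, attached: true}
	// DeleteVolume races ahead of the detach:
	fmt.Println(deleteVolume(v)) // InvalidState: disk still attached
	fmt.Println(detachVolume(v)) // skipped: not a container volume
	// From here on every delete retry fails the same way,
	// so the PV hangs in Released.
	fmt.Println(deleteVolume(v)) // InvalidState: disk still attached
}
```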

This issue is fixed in vSphere 7.0u1.

@RaunakShah is also helping to mitigate this issue by providing the fix for kubernetes/kubernetes#84226 in the external provisioner.

@msau42 commented May 12, 2020

Is it possible for the driver/vSphere to check whether the volume is attached and fail the deletion? This is how other cloud providers behave.
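
[Editor's note] A minimal sketch of the guard suggested above: refuse DeleteVolume while any attachment exists, leaving CNS state untouched so a later retry (after detach) succeeds. This is not the actual driver change; `safeDeleteVolume` and its error string are invented for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

var errVolumeAttached = errors.New("FailedPrecondition: volume is still attached to a node")

// safeDeleteVolume fails fast while the volume has attachments,
// without mutating any backing state, so the external-provisioner's
// retry succeeds once detach has completed.
func safeDeleteVolume(attachedNodes []string) error {
	if len(attachedNodes) > 0 {
		return errVolumeAttached
	}
	// ... actually delete the backing disk here ...
	return nil
}

func main() {
	fmt.Println(safeDeleteVolume([]string{"node-1"})) // fails; CO retries later
	fmt.Println(safeDeleteVolume(nil))                // succeeds once detached
}
```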

@nickvth commented May 13, 2020

Same issue here with v2.0.0... it is no fun to manually detach volumes from 1 of 20 nodes and delete them in FCD.

@AcidAngel21 (Author)

@divyenpatel "This issue is fixed in vSphere 7.0u1"
Do you really mean 7.0u1? This version isn't released yet.

@divyenpatel (Member)

Do you really mean 7.0u1? This version isn't released yet.

Yes, it is not released yet.

But @RaunakShah has already fixed the race by making a change in the external-provisioner: kubernetes-csi/external-provisioner#438

@RaunakShah (Contributor)

@AcidAngel21 The fix from external-provisioner is expected to be part of the next release - https://github.com/kubernetes-csi/external-provisioner/commits/v2.0.0-rc2
Once external-provisioner has released this image, we will validate it with our latest CSI driver and will update the YAMLs with the latest images.

@Anil-YadavK8s

@RaunakShah Can we use v2.0.0-rc2 to get rid of the above issue?

@longwa commented Aug 18, 2020

Will this fix be available in 6.7U3 with the 1.0.x version of the driver? We have no plans to upgrade to 7.0 in the near future, and not being able to delete PVs will be a problem.

@Quantas commented Sep 6, 2020

I am on vSphere 7.0 and was able to test quay.io/k8scsi/csi-provisioner:v2.0.0. I can confirm that I no longer get stuck PVs after deletion.

@xander-sh

@RaunakShah csi-provisioner has already released a new image version (v2.0.1):
https://quay.io/repository/k8scsi/csi-provisioner?tag=latest&tab=tags
Can you please validate this image and update the deployment YAMLs?

@RaunakShah (Contributor)

@xander-sh we've validated the latest versions of the sidecars and updated the YAMLs in the latest folder. I'll get back to you on whether we're doing that for existing releases as well.

@longwa commented Sep 16, 2020

We are using the version of the CSI driver that installs by default with TKG on 6.7u3. I'm not sure we can upgrade on this platform, so I believe we are stuck with the bug. Hopefully TKG 1.2 will come out soon and move to the 2.x CSI driver on the 6.7u3 platform, but I'm not holding my breath.

@xander-sh commented Sep 16, 2020

@xander-sh we've validated the latest versions of sidecars and updated the YAMLs in the latest folder. I'll get back to you on whether we're doing that for existing releases as well...

Thanks, we are really looking forward to a csi-provisioner fix for vSphere 6.7u3.

@namgizlat

Hi,

is there an update on the fix for vSphere 6.7u3?

@RaunakShah (Contributor)

vSphere CSI v2.0.1 release is now available - https://github.com/kubernetes-sigs/vsphere-csi-driver/releases/tag/v2.0.1

You will find updated manifests for vSphere 6.7u3 and 7.0 over here - https://github.com/kubernetes-sigs/vsphere-csi-driver/tree/master/manifests/v2.0.1

@RaunakShah (Contributor)

/close

@k8s-ci-robot (Contributor)

@RaunakShah: Closing this issue.

In response to this:

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
