Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove the EnableCSI feature flag #6694

Closed
reasonerjt opened this issue Aug 23, 2023 · 22 comments
Closed

Remove the EnableCSI feature flag #6694

reasonerjt opened this issue Aug 23, 2023 · 22 comments

Comments

@reasonerjt
Copy link
Contributor

Since v1.12 velero provides the default datamover and a framework for customized datamover, therefore the EnableCSI may not be needed anymore, we should remove this feathre flag, update the code and doc.

After that, as long as CSI plugin is installed velero should be able to snapshot and restore pv provisioned by CSI drivers.

@reasonerjt reasonerjt added the 1.13-candidate issue/pr that should be considered to target v1.13 minor release label Aug 23, 2023
@shawn-hurley
Copy link
Contributor

Does this have to follow some deprecation policy?

@blackpiglet
Copy link
Contributor

There is another issue opened to address the similar scenario:
#6585

@sseago
Copy link
Collaborator

sseago commented Aug 29, 2023

@shawn-hurley First thing we need to do is actually have a deprecation policy.

The good news is we have one proposed. Now might be a good time to revisit the PR and merge it or fix it (if it still needs work):
#5532

@sseago
Copy link
Collaborator

sseago commented Aug 29, 2023

Although -- if a feature flag is needed to enable (not-yet-GA) functionality, how does removing a feature flag fit in with deprecation policies? I'm imagining that once the feature is GA, the first thing to do is to enable it always and the feature flag becomes a no-op. At the same time we should probably deprecate the feature flag. Once deprecated, it would then follow the deprecation policy for when to remove it.

@shubham-pampattiwar should the deprecation policy PR explicitly address removing feature flags? Removing it because behavior has been deprecated (and eventually removed) is one use case, but removing it because the flag is no longer needed (because it's always enabled) is another one. I don't know whether those details are needed in the policy, or whether the standard "deprecate it, then remove in N releases" general policy is sufficient here.

@reasonerjt reasonerjt self-assigned this Aug 30, 2023
@reasonerjt
Copy link
Contributor Author

I used to think the feature flag is a temporary switch due to lack of confidence about the feature.

removing it because the flag is no longer needed (because it's always enabled) is another one. I don't know whether those details are needed in the policy, or whether the standard "deprecate it, then remove in N releases" general policy is sufficient here.

If we agree with this, we probably don't need to follow deprecation policy for feature flags?

@reasonerjt reasonerjt added the Needs triage We need discussion to understand problem and decide the priority label Aug 30, 2023
@reasonerjt
Copy link
Contributor Author

If user wants to disable CSI snapshot of velero, he/she can just skip installing the CSI plugin when velero is installed.
I don't think of a good use case where the CSI plugin installed but not turning on EnableCSI flag.

@shubham-pampattiwar
Copy link
Collaborator

@sseago @reasonerjt I can add a note about feature flags in the deprecation policy PR.

@sseago
Copy link
Collaborator

sseago commented Aug 31, 2023

I think the right process is as I described earlier. First, always enable CSI. Feature flag does nothing, and remove all of the "if enableCSI" checks in the code. At the same time, deprecate the flag. If a user specifies it, nothing breaks, but it's no longer required to enable CSI -- as you suggested, installing the CSI plugin is all that's needed to enable CSI. Then, 2 releases later (according to the deprecation policy) we remove the flag entirely.

The reason we still need to deprecate it is that when we eventually remove it, users who are specifying the feature flag will get an error message saying that the feature they enabled is invalid. If they have 2 releases of deprecation, they'll know to stop using that flag before it's actually removed.

Essentially, any user-visible feature (via flag, CRD, etc.) should never be removed without notice. It must always be deprecated first, and then removed in a future version.

@shawn-hurley
Copy link
Contributor

Another thing to add/consider, you may want to consider if someone turns it off, but has the plugin installed, exposing some warning that this flag has no semantic meaning.

@sseago
Copy link
Collaborator

sseago commented Aug 31, 2023

@shawn-hurley Yes. I imagine that should be part of the deprecation warning. If a user enables that flag in the installer, we'd want to notify the user that this flag is now deprecated, the feature is always enabled, so the flag does not change any functionality, and that if they do not want to use CSI, they simply need to not add the CSI plugin. I guess a similar warning could be logged (or created as an event) in the velero pod on startup if this flag is present.

@draghuram
Copy link
Contributor

I think the right process is as I described earlier. First, always enable CSI. Feature flag does nothing, and remove all of the "if enableCSI" checks in the code. At the same time, deprecate the flag. If a user

@sseago, when you say "remove "enableCSI" code, do you mean it should be replaced by a check for the CSI plug-in? If not, code to skip such as this will not work?

https://github.com/vmware-tanzu/velero/blob/main/pkg/backup/item_backupper.go#L518

@blackpiglet
Copy link
Contributor

I think the code will be equal to this, after the change.
CSI is assumed to turn on by default.

	if pv.Spec.CSI != nil {
		log.Infof("Skipping snapshot of persistent volume %s, because it's handled by CSI plugin.", pv.Name)
		return nil
	}

@draghuram
Copy link
Contributor

But what if CSI plugin is not installed? The above code will skip the volume and it will not be processed further.

@sseago
Copy link
Collaborator

sseago commented Sep 1, 2023

@draghuram hmm. What happens if you add CSI feature flag today without CSI plugin? I don't remember whether we error our there or not. But yes, if there is no feature flag, we may need some sort of check to see whether the CSI plugin is installed, because the decision between "use CSI snapshots" vs. "use native snapshots" would be based on whether or not the CSI plugin is installed in the absence of a feature flag.

@draghuram
Copy link
Contributor

draghuram commented Sep 5, 2023

Based on discussion here and also on extensive discussion in today's community call, I would like to propose the following:

  • Keep the flag as is
  • Package CSI plug-in with Velero obviating the need to install it separately.

I think this combination takes care of multiple issues such as:

  • Enabling the flag but not installing the plug-in and vice versa.
  • Installing wrong version of CSI plug-in (as we have repeatedly seen in Slack).

I can't think of any downside to packaging CSI plug-in as long as the feature flag is there. OTOH, there is a serious problem when the feature is enabled but plug-in is not installed where Velero proper will skip some PVCs, assuming they will be handled by CSI plug-in and that will not happen if the plug-in is not present.

if the CSI plug-in is not going to be packaged, backup logic should validate that CSI plugin is installed if EnableCSI flag is specified.

@shawn-hurley
Copy link
Contributor

I was not on the community call, but I think that makes sense, just wanted to put my two cents here :)

@reasonerjt
Copy link
Contributor Author

reasonerjt commented Sep 13, 2023

Package CSI plug-in with Velero obviating the need to install it separately.

We probably don't wanna do it in velero's binary installer, b/c it will couple the velero binary and plugin release. Currently it's possible to release a CSI plugin without bumping up velero's binary.

@reasonerjt reasonerjt added backlog and removed 1.13-candidate issue/pr that should be considered to target v1.13 minor release Needs triage We need discussion to understand problem and decide the priority labels Sep 27, 2023
Copy link

This issue is stale because it has been open 60 days with no activity. Remove stale label or comment or this will be closed in 14 days. If a Velero team member has requested log or more information, please provide the output of the shared commands.

@blackpiglet
Copy link
Contributor

Not staled

Copy link

This issue is stale because it has been open 60 days with no activity. Remove stale label or comment or this will be closed in 14 days. If a Velero team member has requested log or more information, please provide the output of the shared commands.

Copy link

This issue was closed because it has been stalled for 14 days with no activity.

@github-actions github-actions bot closed this as not planned Won't fix, can't repro, duplicate, stale Feb 11, 2024
@GeorgiIvanovITGix
Copy link

Hi everyone,

I wanted to share our recent experience with Velero and the EnableCSI feature flag, which might be relevant to this discussion. We encountered an issue where Velero was skipping volume backups due to the presence of the --features=EnableCSI argument in the Velero deployment. This was unexpected as we hadn't installed the CSI plugin separately.

After investigating, we found that removing the EnableCSI flag from the deployment arguments and conducting a manual backup of a namespace allowed Velero to perform native snapshots of each volume successfully. We verified the snapshots were both present in Velero and visible in the Azure console.

This issue highlights the importance of ensuring that if EnableCSI is enabled, the corresponding CSI plugin must be installed and correctly configured. In our case, the presence of the flag without the plugin led to Velero skipping native snapshot backups, relying instead on an unavailable CSI mechanism.

I believe this experience underscores the need for clarity around the use of feature flags and their dependencies on plugins, as discussed here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

7 participants