-
Notifications
You must be signed in to change notification settings - Fork 1.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
pkg/helm: migrate to secret storage backend in namespace of CR #1102
pkg/helm: migrate to secret storage backend in namespace of CR #1102
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Few nits/suggestions left. Overall looks good to me. My only concern is that this is a tempory/transition state, what is the reason behind this? Can't we have a check for this, to create the Secret. Not too familiar with the problem, so just curious to learn more.
The background is that for Helm (specifically the Tiller components of Helm) to function properly, there needs to be a storage backend that stores and manages the release state for every release. Helm has a storage interface for managing this. When starting the helm-operator for the first time, the storage backend is empty. As CRs are added and updated and releases are installed and upgraded, respectively, the storage backend is populated and updated with the latest release state. We're currently using the in-memory storage backend, which means that all of this state disappears when the operator is restarted, so there needs to be some way to repopulate the storage backend when it comes back up. To do that, we save the release state to the CR status and use the This PR swaps out the in-memory storage backend for a secret backend, which means that release state will be stored via the storage backend with secrets, rather than in memory. When the operator restarts, it will no longer need to sync release state to its backend because it's all already there, stored in the cluster as secrets.
The catch with the above is that we need to gracefully handle transitioning release state from the CR status to the secret-based backend. As a follow-on to this PR, we plan to store only the most relevant bits of information about the release (e.g. release name and version) in the CR status and remove the rest of the full release state. With that in mind, we could plan on leaving this Does that make sense? WDYT? |
Thanks for the detailed explanation! It makes sense, maybe we should document this for the user, that we create the Secret in the cluster that stores the information about the release state.
I agree, we should leave it, if at all possible, mainly since users tend to skip releases when upgrading.
Can we just always have it run whenever the secrets are not present in the cluster. For example, what if it happens that the secrets get deleted by accident but the CR status doesn't contain the full release state? Ah, nevermind, the secret creation won't work then as there is no CR status to write to the secret 🤦♀️ . But still curious about the question, can the user in a way create the Secret with the release information "manually"? |
If the secret (but not the release resources) associated with the deployed release gets deleted and there's no release state to sync from the CR, the operator will think that the CR represents a new release that needs to be installed. It will:
This is essentially the same thing that would happen if we didn't have this transition code to sync the CR status into the new secret backend.
It would be difficult to do it by hand, since the Helm package responsible for creating the release secrets stores the state as a gzipped protobuf. Theoretically, they could be restored from a backup though. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for clarifying!
lgtm 🎉
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
I would consider moving the code from the internal function into the external one, just removes the extra layer.
I would also consider renaming the unstructured arg r
to something like un
or cr
or resource
. Just didn't know what it was as I was reading the code and had to read back to figure it out.
Description of the change:
Motivation for the change:
This PR is the first step towards completing #1100. We'll also be able to close #917, whose goal is to migrate to a persistent storage backend.