-
Notifications
You must be signed in to change notification settings - Fork 151
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
New kustomization for pd driver #520
Conversation
deploy/kubernetes/deploy-driver.sh
Outdated
|
||
if [[ ${DEPLOY_OS} = "mixed" ]]; then |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would suggest to simplify things and just always include both linux and windows node specs. I guess we may have a problem in the beginning where the windows spec is not available for staging/stable.
What if the kustomization.yaml in image_setup/{dev/staging/stable} apppropriately pulls in node_base/windows as needed?
Then we don't actually need to auto-generate a top level kustomization.yaml and can continue to use the appropriate image_setup/X/kustomization.yaml
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At first, I thought with multiple bases, it is easy to have arbitrary combination, e.g. linux + stable_image, windows + alpha_image. But the image part does not work as expected in multiple bases. I will simply this with one combined driver instead of having a complicated script here.
@@ -0,0 +1,6 @@ | |||
apiVersion: kustomize.config.k8s.io/v1beta1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
naming nit: we have more than just image version patches in these directories. We may enable new features as well.
Maybe I would still keep the directory name as "overlays". So the main change is that the node deployment is separated from controller deployment, but we still use the kustomization.yaml from the "overlays" directory to determine which version of the driver to deploy.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
updated.
namespace: | ||
gce-pd-csi-driver | ||
bases: | ||
- base_setup |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe call this one controller_setup
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
updated
@@ -82,5 +84,10 @@ spec: | |||
- name: cloud-sa-volume | |||
secret: | |||
secretName: cloud-sa | |||
# https://kubernetes.io/docs/concepts/configuration/taint-and-toleration/ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From the other PR, I think we should not include this otherwise the controller won't restart if the node becomes unhealthy. It's only really meant for DaemonSets.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
right, forgot to update it.
spec: | ||
selector: | ||
matchLabels: | ||
app: gcp-compute-persistent-disk-csi-driver |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We may need linux and windows DaemonSets to use different labels. Otherwise the linux DaemonSet may start managing the windows Pods, and vice versa.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
updated.
images: | ||
- name: gke.gcr.io/gcp-compute-persistent-disk-csi-driver | ||
# Don't change stable image without changing pdImagePlaceholder in |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Keep this comment with gcp-compute-persistent-disk-csi-driver
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
gce-pd-csi-driver | ||
resources: | ||
- linux/node.yaml | ||
- windows/node.yaml |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This means windows deployment automatically gets added to stable. Is that what we want? Or should we first add windows to the alpha overlay, and eventually when it becomes stable, we add it to the base?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another alternative is to have the overlays directly specify the os/node.yaml resources that they want.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks to @Liujingfang1, we can make image update lists reusable for different versions. I think with this change, it can customize every possible way easily.
f90a43d
to
58da354
Compare
/test pull-gcp-compute-persistent-disk-csi-driver-kubernetes-integration |
1 similar comment
/test pull-gcp-compute-persistent-disk-csi-driver-kubernetes-integration |
463c660
to
27e7baf
Compare
@@ -0,0 +1,55 @@ | |||
apiVersion: builtin | |||
kind: ImageTagTransformer |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the difference between using this and the images
field that's in the kustomization?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Before, we have append these images with some base. For example, in previous structure, we list most of images in stable version. And stable version needs a base, which point to the base that has both linux and windows setup. Now if all other versions e.g., alpha, dev need to base on stable, all of them must share the same base setup as stable (which contains both linux and windows.). If for alpha, we only want to have linux as base instead of using stable as a base, we have to copy the same image list to the alpha version.
Now with this ImageTagTransformer, we can do any combination easily. The image is just like a resource to use. We can allow alpha to use mixed setup as base, plus images, while stable to use only linux setup, plus images.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And stable version needs a base, which point to the base that has both linux and windows setup.
Isn't it the opposite? Stable has just linux, and alpha has linux + windows? So everything can still continue to be off of stable, and then override the images. Unless the issue is you can't have 1 kustomization.yaml with images
that uses another kustomization.yaml that also has images
as its base?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I used the wrong example. My main point is this way we can have arbitrary combination of os + image as needed. This will give the flexibility in case we do want e.g., windows+image for security setup reason (we don't want to have linux security set up mix with it. We might have unnecessary security compromise from linux security policy which we don't actually need in windows cluster). And putting all common things in the same place seems clean.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The tags in this file are only for the "stable" overlay. If we want to follow this model, should we have a file for the images in every overlay?
kubernetes.io/os: windows | ||
containers: | ||
- name: csi-driver-registrar | ||
image: gcr.io/k8s-staging-csi/csi-node-driver-registrar:amd64-windows-v20200428-v1.3.0-26-g510710d5 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These images should be substituted in the alpha overlay
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we don't have another image yet, I think. Currently I see this is in staging. We need to check when it is able to release.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, but I think we should follow the pattern of the linux node.yaml and not specify a tag in the base, and instead override the tag in the kustomization.yaml
/approve |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: jingxu97, msau42 The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
finally, the test passed. PTAL @msau42 |
@verult could you please review this again since lots of changes from last time. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have some difficulties picking out the subtleties of why this structure is beneficial. If it's not too much effort, if you could link to previous PRs and/or write a high-level compare and contrast the different approaches you tried, it will not only help me understand more easily but also acts as a justification for future devs.
gce-pd-csi-driver | ||
resources: | ||
- image.yaml | ||
- ../alpha/ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we ever need to change the tag for node-driver-registrar in the dev overlay, should it use the image name in base or alpha?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if it is only for dev, then change the image in the image.yaml file under /dev
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In this dev/image.yaml
, do we set imageTag.name
to the one in base (gke.gcr.io/gcp-compute-persistent-disk-csi-driver-win
) or alpha (gke.gcr.io/k8s-staging-csi/csi-node-driver-registrar-win
)? And does the answer depend on other factors, like maybe the ordering of items in kustomization.yaml
?
I'm not sure if we'll ever need to update node registrar tag name, so we can potentially defer this question until that happens.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
there is no image name gke.gcr.io/gcp-compute-persistent-disk-csi-driver-win (this is for windows) in base (stable).
Generally, the following can happen
in stable, change image name --> nameA
in alpha, change image name --> nameB
if alpha is also based on stable, I think the final image name will be nameA.
Or,
in stable, change image name --> nameA
in alpha, change image nameA --> nameB
if alpha is also based on stable, I think the final image name will be nameB.
This is using ImageTagTransformer. I also tested using both ImageTagTransformer and images section directly in kustomization.yaml. The latter (images section) seems taking the final effect.
There are so many ways to construct this, we just have to try it out.
Previously, we have the following structure. Compared to the current one, the main problem here is linux and windows cannot share the image list (stable). With current structure, image list can be shared. Basically once it is setup, future update do not need to touch overlays. For changes that are related to controller or cluster_setup, it does to base/controller_setup. Any change related to node yaml files go to base/node_setup (linux or windows). Any changes related to images go to images (stable, alpha, dev etc.). In overlay, we could set up a overlay that is a combination of base + node_setup + image. e.g,
|
@@ -147,33 +147,6 @@ roleRef: | |||
apiGroup: rbac.authorization.k8s.io | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would be more clear to name the subdirectories base/controller and base/node. cluster_setup.yaml seems truly setup config, as it's only applied once, whereas controller.yaml is can be done multiple times if you take the deployment up and down. But then why is it in a controller_setup directory?
So I think it would be better to just have base/controller and base/node.
I'm actually not even sure it's worth it to split the controller and node into separate directories as there aren't that many files and it makes it harder to get a comprehensive view of what's actually going on as you have to drill down. Having node_linux and node_windows is probably sufficient for organization.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the reason we have separate dir between controller and node is that we want to be able to pick a combination of base + os + image as explained in comment here #520 (comment)
I think with one flat structure, we cannot do this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see. I'm not very familiar with kustomize. Is the issue that you can't do something like the following?
resources:
- ../../base/*.yaml
- ../../base/node_linux
In that case I agree the separate controller directory seems necessary. It would still be possible to simplify and have base/{controller,node_linux,node_windows} which will improve the browsing experience.
I also still think that removing the _setup suffix from the directory names would be a nice improvement.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sure, I updated the name.
@@ -0,0 +1,27 @@ | |||
apiVersion: policy/v1beta1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm a little uncomfortable that the PSPs are apparently identical, but repeated. Can we hoist the common linux & windows config into base/node/?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Linux and Windows have different psp names. That's why I cannot make the common into base
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the paths are different. One possibility is to put the shared parts into base and the separate linux & windows parts into separate overlays, but the current form might be cleaner imo
Oops sorry I was looking at the latest push and GitHub wouldn't show me the previous version of the file. Not sure if this still applies
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After digging more, I think it is still possible to share psp, we could use namesuffix. But it also means we need to add a separate dir (and also kustomization file) for psp in order to do that.
/node/common/psp.yaml
/node/common/kustomization.yaml
/node/linux/kustomization.yaml
node/windows/kustomization.yaml
If it is preferable, then I can update it this way.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My vote is keep it the way it is because more layers = more complexity
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mmm, yeah, I agree, we don't need another directory.
Too bad, I don't look forward to keeping the linux & windows PSP specs in sync. But I agree what you currently have seems like the least bad option.
name: imagetag-gcepd-driver | ||
imageTag: | ||
name: gke.gcr.io/gcp-compute-persistent-disk-csi-driver | ||
# Don't change stable image without changing pdImagePlaceholder in |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't this comment go above name:? This doesn't have anything to do with the tag.
And really probably the test can now be changed to create the appropriate ImageTagTransformer rather than the cruel and dirty hack it currently does. If you agree could you create an issue for that? (might be a good intern task)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you point me the trick part you mean that test is currently using?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, the trick is where we rely on people reading a comment to not break all the tests ;)
test/k8s-integration/driver.go in installDriver is where we rely on the image name being what's in the stable repo. Wouldn't it be better to have the test make an ImageTagTransformer and deploy its own overlay so that it works the same as the rest of the kustomize, rather than depend on stable being set to a certain value? (if that's indeed what's going on; the fact that it's not obvious what is going on is another problem).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, the test infra is modifying the specified kustomization and replacing the image tag. I agree we can improve the test to instead create a test overlay with an imagetagtransformer. But this also goes back to @verult's question about order of operations and when does the name get changed because dev/staging/stable can all use different image registries.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let me think out loud briefly. I haven't used kustomize much so I could be missing something.
Do we want the yaml in base/ to be meaningful, or is it only supported by deploying through customize? In that case we could replace the images in base/ with placeholders like ATTACHER_IMAGE and have a newName in all of the image transformers. Then it's clear what is being changed.
With respect to the ordering of multiple changes, though, it looks like there's a bug in kustomize. If one applies two patches that refer to the same field, kustomize errors. But if one applies multiple image transformations, the last one wins (as defined by the list in the kustomization.yaml in the transformer directory).
So that's unfortunate... I can ask in the kustomize repo to see if that really is a bug or for some reason WAI. From our perspective, we just have to be careful I guess, and trace the order of things from kustomization files starting in the overlay.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think using placeholders like ATTACHER_IMAGE sounds good to me. All the image name placeholder used in /base should not be touched.
All the image replacement will be done in /images dir. For testing, we only need to touch/replace /images/dev/image.yaml file to create ImageTagTransformer when necessary.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also I suggest to modify placeholder in a different PR. This PR has already gone through lots of changes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SGTM for doing that in a different PR
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've filed kubernetes-sigs/kustomize#2590 about conflict with image transformers.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! I cc @Liujingfang1 who is the expert in this area.
name: imagetag-gcepd-driver-alpha-win | ||
imageTag: | ||
name: gke.gcr.io/gcp-compute-persistent-disk-csi-driver-win | ||
# Don't change stable image without changing pdImagePlaceholder in |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This comment doesn't belong here in the alpha yaml?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
removed it
name: imagetag-node-registrar-win | ||
imageTag: | ||
name: gke.gcr.io/k8s-staging-csi/csi-node-driver-registrar-win | ||
# Don't change stable image without changing pdImagePlaceholder in |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
and this one
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
removed
fi | ||
curl -s "https://raw.githubusercontent.com/kubernetes-sigs/kustomize/master/hack/install_kustomize.sh" | bash | ||
mv kustomize ${INSTALL_DIR} | ||
echo "install kustomizer" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
kustomizer -> Kustomize
@@ -0,0 +1,27 @@ | |||
apiVersion: policy/v1beta1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My vote is keep it the way it is because more layers = more complexity
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, the benefits make sense. Thanks for clarifying!
apiVersion: builtin | ||
kind: ImageTagTransformer | ||
metadata: | ||
name: imagetag-csi-rpovisioner |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typo: rpovisioner
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
58bf644
to
d12d061
Compare
/retest |
This can work for both linux and windows Also update to the latest version of kustomize bases is deprecated, use resources instead.
/lgtm If there's agreement on putting the image placeholders in a follow-on CL I'm happy enough with this now. |
@mattcary: changing LGTM is restricted to collaborators In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
/lgtm |
This can work for both linux and windows.
The current structure for kustomization is as follows
This PR also updates kustomize latest versino. "bases" is deprecated, use "resources" instead.
What type of PR is this?
What this PR does / why we need it:
Which issue(s) this PR fixes:
Fixes #
Special notes for your reviewer:
Does this PR introduce a user-facing change?: