[Bucket API] update KEP to reflect current API and rpc spec #2349
Conversation
Thanks for your pull request. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA). 📝 Please follow instructions at https://git.k8s.io/community/CLA.md#the-contributor-license-agreement to sign the CLA. It may take a couple minutes for the CLA signature to be fully registered; after that, please reply here with a new comment and we'll verify. Thanks.
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.
/assign thockin
/api-review
Is the request here for a top-to-bottom API review of this proposal?
Yes, essentially. We are proposing to bring this API into Kubernetes as a part of the COSI effort.
Do you want me to review the API as part of this KEP or through those other links? Or do you want to do a video call and show it to me so I can ask questions in real-time? NOTE: I have not looked at it yet, so I don't know what sort of questions I will have :) Aside: Hi @wlan0 ! Nice to see you!
Hi @thockin, It's great to see you again! I'll follow up with you over email and set up a time for a video call. That'll be the easiest way I think. Looking forward to talking soon!
I didn't get very far yet, but I wanted to send some quick thoughts. I will keep thinking about the "ownership" problem and the ability for users to change things like anon access after creation. To me, that represents a pretty consequential abstraction problem.
More later
Regarding the overall lack of self-service beyond day 0:
Should Bucket be namespaced? Imagine that, instead of being global, I could
create buckets in my own namespace, and then I could manage pretty much every
aspect of them. So if I need to change anonymous access mode I can just do
that. Great for greenfield.
For brownfield it's entirely different, though. It's not clear to me whether
you envision brownfield as "I am taking ownership of a bucket" or "I am
accessing a bucket that is owned by someone else". If the former, the
greenfield case can still kind of play out - it's mine now. If the latter,
maybe there's a distinction between Bucket and BucketReference? This is suggested in large part by the distinct sets of fields in BucketRequest.
The problem with self-service is policy. Am I even allowed to change the
anonymous access mode? So not having self-service wins the "how little can I
do" test, but I am not convinced you won't eventually need some self-service,
and so I think we need to think about it.
What we should NOT do is invent a whole new policy and auth construct.
I'll do some API review, but I have some pretty deep questions, obviously. I am not sure if I should just give you the go-ahead to alpha and try this (since it is all out of k/k) or if I should push for deeper discussion (which I know is not the first round of discussion)
I can't inline comment on the interesting parts (yay github) since they are not
under edit in this PR, so I will do it here.
~L172: Since there is a 1:1 mapping between a BR and a `Bucket`, when multiple BRs request access to the same backend bucket, each separate `Bucket` points to the same backend bucket.
If I interpret this right, it indicates the "I am taking ownership of a bucket"
model above, right?
~L228: `bucketRequest`: (set by COSI for greenfield, and ignored for brownfield)
If it is ignored for brownfield, and you assume a 1:1 mapping (see above) how
do you enforce that a user does not create N BRs that all point to the same
Bucket?
~L229: `allowedNamespaces`: a list of namespaces that are permitted to either create new buckets or to access existing buckets.
How does a field on a bucket govern creating new buckets?
~L245: NOTE: the `BucketClass` object is immutable except for the field `isDefaultBucketClass`
Reiterate suggestion to extract this from the type and make it an annotation,
like the others
~L311: `bucketAccessName`: name of the bound cluster-scoped `BucketAccess` instance. Set by COSI.
What stops a user from setting this themselves? Is that a security risk?
~L348: Contents of the ConfigMap that the `policyActionsConfigMap` field in the `BucketAccessClass` refers to. A sample value of this field could look like:
You are showing a list example, not a map - what are the keys? Is this a map type, a list type, or a single giant string?
~L378: `policyActionsConfigMap`: (required) a reference to a ConfigMap
This is starting to feel very baroque. Why do I need a ConfigMap? Why can't I
embed the string-string map directly into the class?
The config map itself is mutable - if I change that, will controllers rectify
and apply the changes to all existing BAs? I think the complexity here is a
bit overwhelming - I'm not sure I could draw you the constellation of resources
needed to get access to a bucket. I'd suggest we see what we can do to reduce
moving parts - is there any denormalization that can be done? You describe a
lot of 1:1 relationships...
As written (apiVersion: cosi.io/v1alpha1) you don't owe me an API review. If you intend to use *.k8s.io, then I am very wary of approving a KEP that needs (or may need) such significant change. From your email @wlan0, it sounds like you have some answers but they are not in here yet.
As such, I can't sign off on it as k8s.io. I suggest you proceed with it as cosi.io/v1alpha1 for now and when we have confidence in it, we can discuss it as a k8s.io API.
Although not explicitly stated, I am assuming the latter. There could be a use case for the former but I don't think that was given serious consideration.
I agree. Even just more closely considering ...
Agree!
Well, maybe? We haven't defined the concept of a main B vs. cloned B. A brownfield BR, meaning it references an existing B, does not imply ownership IMO. What is your reasoning here?
I'll remove the phrase in parenthesis since it is misleading. What I was trying to describe is that COSI sets B.spec.bucketRequest when it creates a new B on behalf of a greenfield BR. But the admin does not need to populate B.spec.bucketRequest when she creates a brownfield B.
It restricts the namespaces a BR can be created in or accessed from. It is a property that should be mutable. The idea was to make it easier and more flexible for the admin to control which projects (namespaces) can create new buckets. But maybe RBAC can do this and we should not define another policy-related field.
Agree. I do not like the idea of a mutable field not inside a ...
We expect that users are denied access to cluster-scoped resources, like the B and BA. If a user learned the name of an existing BA could they "bind" it in their BAR? If we have a BAR admission controller we can fail the create or empty the field. If it is expected to only be set by COSI we should move it to the status.
It is a JSON blob whose format is known to the driver.
Our thinking is that the values may be structured strings, like JSON, which do not fit well in a string-string map. A ConfigMap eliminates the need to worry about values and possibly is shareable among multiple BACs.
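To illustrate the rationale above, here is a minimal sketch of the kind of ConfigMap being described; the name, key, and JSON payload below are assumptions for illustration, not the KEP's defined schema.

```yaml
# Hypothetical example only: the key name and JSON payload format are assumptions;
# in practice the format is defined by the provisioner/driver.
apiVersion: v1
kind: ConfigMap
metadata:
  name: s3-policy-actions          # assumed name, referenced by a BucketAccessClass
  namespace: objectstorage-system  # assumed provisioner namespace
data:
  # The value is an opaque, structured JSON blob understood by the driver.
  allowedActions: |
    {"version": "2012-10-17", "actions": ["s3:GetObject", "s3:PutObject"]}
```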
Excellent point.
I'm going to say k8s-bucket to mean the k8s resource and s3-bucket ...
If 1 BucketRequest maps to 1 k8s-Bucket, then it feels like that BR "owns" that ...
This feels like it's not going to work, or it's going to end in fireworks (not ...
If there's going to be ANY sort of access-policy in the API, the data model ...
This is still my primary concern. There's not a choke-point for ownership and this ...
E.g. Make k8s-bucket a 1:1 relationship with s3-bucket. The k8s-bucket ...
This feels, with only 10 minutes of thought, cleaner and less ambiguous? Don't ...
I'm still confused. You assert elsewhere it's 1:1 BR:B but AFAICT nothing ...
I'm still confused. This is ...
I don't see how "namespaces that are permitted to ... create new buckets" is ...
This field is ...
What I mean is that a ConfigMap is (duh) a map, but the example shown here is ...
@thockin
I like the idea of one k8s Bucket abstracting one backend bucket (in fact we have proposed this in way past drafts). Then altering bucket properties boils down to editing the matching k8s bucket instance. No clones. No propagating changes. Single source of truth (within k8s). A user needs to be able to trigger the creation of a k8s Bucket and the ...
Will add more tomorrow...
Does cosi detect that a given k8s B is not referenced by more than one BR? AFAIK, today the answer is no, and this question highlights yet another challenge with our 1:1 mapping.
OK I understand now.
I agree with this assumption. It seems clear that if a binding resource name is for info only then it should reside in status. If this name can be set by the user to cause bindings then it should live in the spec.
I believe the blob is the value of a key not shown in the example, so your k-v pair above is more accurate. But again I need to defer to @wlan0 for his thoughts on this.
/lgtm
@xing-yang @krishchow I've updated the KEP based on your latest comments. PTAL
/lgtm
/assign
for PRR
The questionnaire hasn't been filled out so I cannot review yet.
Left a few comments. From my reading, this feature sounds like it is 100% out of tree. If that's the case, PRR is optional. Please confirm.
### Upgrade / Downgrade Strategy

No changes are required on upgrade to maintain previous behaviour.
Upgrade/downgrade in this case I believe would refer to the addition or removal of the CRDs to the cluster so the API will support them. Can you talk a little about how that is intended to work?
### Version Skew Strategy

COSI is out-of-tree, so version skew strategy is N/A.
There appears to be a node adapter component, does that have any dependencies on a given kubelet version?
Yes. Delete the resources created when installing COSI.

###### What happens if we reenable the feature if it was previously rolled back?
This question should be answered.
N/A since we are only targeting alpha for this Kubernetes release |
This is referring to a test of your rollout/rollback strategy (i.e. create the resources, delete the resources). This testing is applicable.
No

### Scalability
This section is encouraged for alpha.
@ehashman Yes, this is 100% out-of-tree
/approve
for PRR
PRR optional, feature is out of tree.
[APPROVALNOTIFIER] This PR is NOT APPROVED. This pull-request has been approved by: bswartz, ehashman, jeffvance. The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing `/approve` in a comment.
/lgtm
First: this was not a super-complete API review. If we proceed with this, we should do it in more depth.
The user stories are pretty vague, especially the admin one. This is a very expansive API, which largely puts the cluster admin in the middle. There has to be more detail about which admin operations are actually required. I can derive requirements from the API, but I can't tell if they are real or not.
What this is really missing most is a handful of complete examples that speak to the user stories. As an admin, show me EXACTLY the yaml I would write to enable GCS or S3 as a class. As an end-user, show me EXACTLY the yaml I would write to request a greenfield bucket and to attach to a brownfield bucket, etc.
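For illustration only, a hedged sketch of the kind of end-user YAML being asked for here; the field names (e.g. `bucketClassName`) are assumptions drawn from the resource names in this discussion, not the KEP's final API.

```yaml
# Hypothetical greenfield request: a user in namespace "my-app" asks for a new
# bucket backed by an admin-defined BucketClass. Field names are illustrative.
apiVersion: objectstorage.k8s.io/v1alpha1
kind: BucketRequest
metadata:
  name: my-app-bucket
  namespace: my-app
spec:
  bucketClassName: s3-standard   # assumed field selecting the admin's BucketClass
```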
I have to admit, this model still feels very weird to me. The frequent "does not apply to brownfield" sorts of statements and bucket-or-bucketRequest semantics feel like we're jamming two ideas into one API.
I'm still confused about the duality of green/brown. This seems hostile to things like git-ops. I should be able to check my config into git, and then re-apply it should something go wrong. In this world, if I want to move between clusters, my YAML has to change?
I don't really know how to convince myself this is "safe". How do I not end up with users stealing access to buckets that are not theirs? Examples showing those checks and blocks would be super helpful.
I know this is a lot of comments - some small and some not. I'll try to catch up with you all tomorrow.
@@ -110,21 +123,20 @@ When a workload (app pod, deployment, configs) is moved to another cluster, as l
+ _backend bucket_ - any bucket provided by the object store system, completely separate from Kubernetes.
+ _brownfield bucket_ - an existing backend bucket.
+ _Bucket (Bucket Instance)_ - a cluster-scoped custom resource referenced by a `BucketRequest` and containing connection information and metadata for a backend bucket.
+ _Bucket (Bucket instance)_ - a cluster-scoped custom resource referenced by a `BucketRequest` and containing connection information and metadata for a backend bucket.
+ _BucketAccess (BA)_ - a cluster-scoped custom resource for granting bucket access.
Above says "Bucket access lifecycle management is not within the scope of this KEP" ?
That is a mistake. We'll fix it
I don't see that pushed - am I just impatient?
+ _BucketAccess (BA)_ - a cluster-scoped custom resource for granting bucket access.
+ _BucketAccessClass (BAC)_ - a cluster-scoped custom resource containing fields defining the provisioner and a ConfigMap reference where policy is defined.
+ _BucketAccessRequest (BAR)_ - a user-namespaced custom resource representing a request for access to an existing bucket.
+ _BucketClass (BC)_ - a cluster-scoped custom resource containing fields defining the provisioner and an immutable parameter set for creating new buckets.
+ _BucketRequest (BR)_ - a user-namespaced custom resource representing a request for a new backend bucket, or access to an existing bucket.
+ _BucketRequest (BR)_ - a user-namespaced custom resource representing a request for a new backend bucket.
+ _COSI_ - Container _Object_ Store Interface, modeled after CSI.
+ _cosi-node-adapter_ - a pod per node which receives Kubelet gRPC _NodeStageVolume_ and _NodeUnstageVolume_ requests, ensures the target bucket has been provisioned, and notifies the kubelet when the pod can be run.
This feels like overkill. As I read this I can't help wondering why the pod can't just retry. It's not like it needs to be mounted across mount namespaces. Failure is obvious.
We need a mechanism to delay the pod from starting until the bucket is available:
- it improves the overall user experience
- it puts all the information in one place for the pod to utilize, instead of expecting the pod to figure out how to collect credentials and bucket information for a COSI-provisioned bucket
I think the "delay" semantic is dubious - apps already have to handle failures, so this gives them a false sense of stability.
The accumulation of coordinates and credentials is a reasonable answer, but this feels like a LOT of machinery for that. If I were trying to reduce scope I'd maybe suggest that the BucketAccess contain the name of the secret which COSI will populate with well-known keys, and tell the app to use the existing secrets mechanisms. Am I still over-reducing?
As discussed on slack: I think all the mechanics you need exist in secrets already.
I create a BAR and set secretName: foo. I run my pod with a secret volume for foo. Kubelet sees that foo doesn't exist and waits until it does. COSI waits for the bucket, then creates foo. Kubelet can now run the pod. My pod gets the bucket info.
You can rip out 50% of this KEP
and if not, you can open a followup KEP to add CSI integration :)
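A minimal sketch of the Secret-based flow suggested above, assuming a hypothetical `secretName` field on the BucketAccessRequest and a Secret named `foo` that COSI would populate once the bucket is ready:

```yaml
# Sketch of the suggestion above, not the KEP's defined API.
# The user names the Secret in the BAR; COSI creates it once the bucket is ready;
# the kubelet waits for the Secret volume before starting the pod.
apiVersion: objectstorage.k8s.io/v1alpha1
kind: BucketAccessRequest
metadata:
  name: my-bar
  namespace: my-app
spec:
  secretName: foo                    # assumed field: where COSI writes bucket coordinates/credentials
---
apiVersion: v1
kind: Pod
metadata:
  name: my-app
  namespace: my-app
spec:
  containers:
  - name: app
    image: example.com/my-app:latest # placeholder image
    volumeMounts:
    - name: bucket-info
      mountPath: /var/run/bucket
  volumes:
  - name: bucket-info
    secret:
      secretName: foo                # kubelet will not start the pod until this Secret exists
```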
```yaml
apiVersion: cosi.io/v1alpha1
apiVersion: objectstorage.k8s.io/v1alpha1
```
object.storage.k8s.io ?
just a suggestion - up to SIG to govern the sub-space
status:
  bucketAvailable: [8]
  bucketName: [5]
  bucketAvailable: [6]
This should probably be an enum-string of status, including Available and Provisioning and "" (or something).
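For instance, something along these lines (a sketch of the suggestion, not the KEP's current schema):

```yaml
# Hypothetical enum-string status field replacing the boolean, per the comment above.
status:
  bucketStatus: Provisioning   # one of "", Provisioning, Available
```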
If this is all the sorts of nits I have, I won't block the KEP for it :)
I don't see this fixed?
status:
  bucketAvailable: [8]
  bucketName: [5]
  bucketAvailable: [6]
Why is this in spec? Anything in spec can be set by the user.
Nor this?
- cosi.io/finalizer [3]
objectstorage.k8s.io/provisioner: [2]
finalizers: [3]
- objectstorage.k8s.io/bar-protect
same comment as above about this pattern
1. `name`: when created by COSI, the `BucketAccess` name is generated in this format: _"ba-"+uuid_. If an admin creates a `BucketAccess`, as is necessary for driverless access, they can use any name. The uuid is unique within a cluster.
1. `name`: generated in this format: _"ba-"+uuid_. The uuid is unique within a cluster.
Why are we speccing this so precisely? Just say "is unique" - we have generateName for a reason.
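For reference, `metadata.generateName` already gives unique names without mandating a format; a sketch:

```yaml
# Sketch: let the API server generate the unique suffix instead of
# speccing a "ba-"+uuid format.
apiVersion: objectstorage.k8s.io/v1alpha1
kind: BucketAccess
metadata:
  generateName: ba-   # the server appends a random suffix on create
```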
1. `provisioner`: (optional) the name of the driver that `BucketAccess` instances should be managed by. If empty then there is no driver/sidecar, thus the admin had to create the `Bucket` and `BucketAccess` instances.
1. `policyActionsConfigMap`: (required) a reference to a ConfigMap that contains a set of provisioner/platform-defined policy actions for bucket access. It is typical for this ConfigMap's namespace to be the same as the provisioner's - in other words, a namespace with locked-down RBAC rules that prevent modification of this ConfigMap.
I don't understand why we are out-of-lining this content to a config map?
The user experience with ConfigMaps is better because the access specification formats for different protocols differ, and ConfigMap data is much more readable compared to, say, an escaped string or a base64-encoded field.
Why is it better to pop out to a ConfigMap than to just include a list of values here? They are both YAML in the end, but now the user has to create ANOTHER object (and clean it up one day), and the poor slob who has to figure out how your cluster works after you quit gets to bounce around between types.
Also, a ConfigMap is a map - this doesn't seem to need the keys at all? []string seems more appropriate to me?
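A hedged sketch of the inlined alternative being suggested here; the `policyActions` field name and the action strings are assumptions, not part of the KEP:

```yaml
# Hypothetical: inline the policy actions in the BucketAccessClass instead of
# referencing a separate ConfigMap. Field name and values are illustrative only.
apiVersion: objectstorage.k8s.io/v1alpha1
kind: BucketAccessClass
metadata:
  name: s3-read-write
provisioner: s3.objectstorage.k8s.io   # assumed driver name
policyActions:                         # assumed []string field replacing policyActionsConfigMap
- s3:GetObject
- s3:PutObject
- s3:ListBucket
```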
Here are the user stories/workflows that address the above two issues: https://github.com/kubernetes/enhancements/blob/master/keps/sig-storage/1979-object-storage-support/README.md#workflows
The brownfield concept exists only to allow buckets created out of band to be made available via COSI wiring. In the long run, we expect all buckets to be Greenfield.
The brownfield concept is not primarily an admin-created bucket. It is a representation of a pre-existing bucket.
The YAML will remain exactly the same even when you move across clusters.
Buckets are as safe as RBAC allows us to be. We designed COSI to give workloads access to all buckets in the same namespace.
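As an illustration of that RBAC point, a sketch of an ordinary Role an admin might use to control who can create the namespaced request types (the apiGroup and resource names assume the CRDs discussed in this KEP):

```yaml
# Sketch: plain RBAC limiting who may create bucket requests in a namespace.
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: bucket-requester
  namespace: my-app
rules:
- apiGroups: ["objectstorage.k8s.io"]   # assumed API group for the COSI CRDs
  resources: ["bucketrequests", "bucketaccessrequests"]
  verbs: ["get", "list", "create", "delete"]
```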
those are excruciatingly detailed in the technical dimension and nearly devoid ...
Well that's not the answer I expected. The git-ops thing is front-of-mind for ...
Specifically, this pattern has been done before and it's a persistent sore ...
You seem to have the same pattern with the bucket-instance. It's a pet. But ...
Understood. But I don't see how the pieces fit - read my next comment below.
Explain to me how a user creates a git repo of an app and a bucket, checks that ...
I mean that the doc doesn't show me how I can't just BAR my way into your ...
privateKeyName:
projectId:
serviceAccount:
parameters: [11]
status:
  bucketAvailable: [12]
  bucketID: [13]
I don't see this documented below? Did I miss it?
Thanks to all for your reviews. I am closing this PR for now. Likely a new PR will draw on the good suggestions from this attempt.
Updates to API and gRPC spec to keep KEP current with the code.
cc @wlan0 @brahmaroutu @krishchow