
[Bucket API] update KEP to reflect current API and rpc spec #2349

Closed
wants to merge 2 commits

Conversation

jeffvance
Contributor

@jeffvance jeffvance commented Jan 27, 2021

Updates to API and gRPC spec to keep KEP current with the code.

cc @wlan0 @brahmaroutu @krishchow

@k8s-ci-robot
Contributor

Thanks for your pull request. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

📝 Please follow instructions at https://git.k8s.io/community/CLA.md#the-contributor-license-agreement to sign the CLA.

It may take a couple minutes for the CLA signature to be fully registered; after that, please reply here with a new comment and we'll verify. Thanks.


  • If you've already signed a CLA, it's possible we don't have your GitHub username or you're using a different email address. Check your existing CLA data and verify that your email is set on your git commits.
  • If you signed the CLA as a corporation, please sign in with your organization's credentials at https://identity.linuxfoundation.org/projects/cncf to be authorized.
  • If you have done the above and are still having issues with the CLA being reported as unsigned, please log a ticket with the Linux Foundation Helpdesk: https://support.linuxfoundation.org/
  • Should you encounter any issues with the Linux Foundation Helpdesk, send a message to the backup e-mail support address at: [email protected]

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@k8s-ci-robot k8s-ci-robot added cncf-cla: no Indicates the PR's author has not signed the CNCF CLA. size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Jan 27, 2021
@k8s-ci-robot k8s-ci-robot added the sig/storage Categorizes an issue or PR as relevant to SIG Storage. label Jan 27, 2021
@wlan0
Member

wlan0 commented Jan 28, 2021

/assign thockin

@wlan0
Member

wlan0 commented Jan 28, 2021

/api-review

@xing-yang xing-yang added the api-review Categorizes an issue or PR as actively needing an API review. label Jan 28, 2021
@thockin
Member

thockin commented Jan 28, 2021

Is the request here for a top-to-bottom API review of this proposal?

@wlan0
Member

wlan0 commented Jan 28, 2021

Is the request here for a top-to-bottom API review of this proposal?

Yes, essentially. We are proposing to bring this API into Kubernetes as a part of the COSI effort.

@xing-yang xing-yang added this to the v1.21 milestone Jan 28, 2021
@jeffvance jeffvance force-pushed the 2021-kep-update1 branch 2 times, most recently from 9055f4b to 158c47e Compare January 28, 2021 22:02
@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. and removed cncf-cla: no Indicates the PR's author has not signed the CNCF CLA. labels Jan 28, 2021
@thockin
Member

thockin commented Jan 29, 2021

Do you want me to review the API as part of this KEP or through those other links? Or do you want to do a video call and show it to me so I can ask questions in real-time? NOTE: I have not looked at it yet, so I don't know what sort of questions I will have :)

Aside: Hi @wlan0 ! Nice to see you!

@wlan0
Member

wlan0 commented Jan 29, 2021

Hi @thockin, It's great to see you again!

I'll follow up with you over email and set up a time for a video call. That'll be the easiest way, I think. Looking forward to talking soon!

@wlan0 wlan0 force-pushed the 2021-kep-update1 branch 2 times, most recently from 3ff4196 to bb562df Compare February 3, 2021 06:55
@jeffvance jeffvance changed the title update KEP to reflect current API and rpc spec [Bucket API] update KEP to reflect current API and rpc spec Feb 3, 2021
@thockin thockin left a comment

I didn't get very far yet, but I wanted to send some quick thoughts. I will keep thinking about the "ownership" problem and the ability for users to change things like anon access after creation. To me, that represents a pretty consequential abstraction problem.

More later

@thockin thockin left a comment

Regarding the overall lack of self-service beyond day 0:

Should Bucket be namespaced? Imagine that, instead of being global, I could
create buckets in my own namespace, and then I could manage pretty much every
aspect of them. So if I need to change anonymous access mode I can just do
that. Great for greenfield.

For brownfield it's entirely different, though. It's not clear to me whether
you envision brownfield as "I am taking ownership of a bucket" or "I am
accessing a bucket that is owned by someone else". If the former, the
greenfield case can still kind of play out - it's mine now. If the latter,
maybe there's a distinction between Bucket and BucketReference? This is suggested in large part by the distinct sets of fields in BucketRequest

The problem with self-service is policy. Am I even allowed to change the
anonymous access mode? So not having self-service wins the "how little can I
do" test, but I am not convinced you won't eventually need some self-service,
and so I think we need to think about it.

What we should NOT do is invent a whole new policy and auth construct.

I'll do some API review, but I have some pretty deep questions, obviously. I am not sure if I should just give you the go-ahead to alpha and try this (since it is all out of k/k) or if I should push for deeper discussion (which I know is not the first round of discussion)

I can't inline comment on the interesting parts (yay github) since they are not
under edit in this PR, so I will do it here.

~L172:

Since there is a 1:1 mapping between a BR and a Bucket, when multiple BRs request access to the same backend bucket, each separate Bucket points to the same backend bucket.

If I interpret this right, it indicates the "I am taking ownership of a bucket"
model above, right?

~L228:

bucketRequest: (set by COSI for greenfield, and ignored for brownfield)

If it is ignored for brownfield, and you assume a 1:1 mapping (see above) how
do you enforce that a user does not create N BRs that all point to the same
Bucket?

~L229:

allowedNamespaces: a list of namespaces that are permitted to either create new buckets or to access existing buckets.

How does a field on a bucket govern creating new buckets?

~L245:

NOTE: the BucketClass object is immutable except for the field isDefaultBucketClass

Reiterate suggestion to extract this from the type and make it an annotation,
like the others

~L311:

bucketAccessName: name of the bound cluster-scoped BucketAccess instance. Set by COSI.

What stops a user from setting this themselves? Is that a security risk?

~L348:

Contents of the ConfigMap that the policyActionsConfigMap field in the BucketAccessClass refers to. A sample value of this field could look like:

You are showing a list example, not a map - what are the keys? Is this a map
type, a list type, or a single giant string?

~L378:

policyActionsConfigMap: (required) a reference to a ConfigMap

This is starting to feel very baroque. Why do I need a ConfigMap? Why can't I
embed the string-string map directly into the class?

The config map itself is mutable - if I change that, will controllers rectify
and apply the changes to all existing BAs? I think the complexity here is a
bit overwhelming - I'm not sure I could draw you the constellation of resources
needed to get access to a bucket. I'd suggest to see what we can do to reduce
moving parts - is there any denormalization that can be done? You describe a
lot of 1:1 relationships...

@thockin thockin left a comment

As written (apiVersion: cosi.io/v1alpha1) you don't owe me an API review. If you intend to use *.k8s.io, then I am very wary of approving a KEP that needs (or may need) such significant change. From your email @wlan0, it sounds like you have some answers but they are not in here yet.

As such, I can't sign off on it as k8s.io. I suggest you proceed with it as cosi.io/v1alpha1 for now and when we have confidence in it, we can discuss it as a k8s.io API.

@jeffvance
Contributor Author

jeffvance commented Feb 9, 2021

@thockin

Regarding the overall lack of self-service beyond day 0:

Should Bucket be namespaced?... It's not clear to me whether you envision brownfield as "I am taking ownership of a bucket" or "I am
accessing a bucket that is owned by someone else". If the former, the greenfield case can still kind of play out - it's mine now. If the latter,
maybe there's a distinction between Bucket and BucketReference? This is suggested in large part by the distinct sets of fields in BucketRequest

Although not explicitly stated, I am assuming the latter. There could be a use case for the former but I don't think that was given serious consideration.

... not having self-service wins the "how little can I do" test, but I am not convinced you won't eventually need some self-service,

I agree. Even just more closely considering Bucket mutation reveals challenges in our 1:1 mapping of BR to B, where if B1 is changed and B2-4 are clones of B1, do we reflect the changes in B2-4? Is there a primary B and the clones are not mutable?
Self-service adds another dimension to the mutation issue.

What we should NOT do is invent a whole new policy and auth construct.

Agree!
...

~L172:
"Since there is a 1:1 mapping between a BR and a Bucket, when multiple BRs request access to the same backend bucket, each separate Bucket points to the same backend bucket."
If I interpret this right, it indicates the "I am taking ownership of a bucket" model above, right?

Well, maybe? We haven't defined the concept of a main B vs. cloned B. A brownfield BR, meaning it references an existing B, does not imply ownership IMO. What is your reasoning here?

~L228:
"bucketRequest: (set by COSI for greenfield, and ignored for brownfield)"
If it is ignored for brownfield, and you assume a 1:1 mapping (see above) how do you enforce that a user does not create N BRs that all point to the same Bucket?

I'll remove the phrase in parentheses since it is misleading. What I was trying to describe is that COSI sets B.spec.bucketRequest when it creates a new B on behalf of a greenfield BR. But the admin does not need to populate B.spec.bucketRequest when she creates a brownfield B.

~L229:
"allowedNamespaces: a list of namespaces that are permitted to either create new buckets or to access existing buckets."
How does a field on a bucket govern creating new buckets?

It restricts the namespaces a BR can be created in or accessed from. It is a property that should be mutable. The idea was to make it easier and more flexible for the admin to control which projects (namespaces) can create new buckets. But maybe RBAC can do this and we should not define another policy-related field.

~L245:
"NOTE: the BucketClass object is immutable except for the field isDefaultBucketClass"
Reiterate suggestion to extract this from the type and make it an annotation, like the others

Agree. I do not like the idea of a mutable field not inside a spec.

~L311:
"bucketAccessName: name of the bound cluster-scoped BucketAccess instance. Set by COSI."
What stops a user from setting this themselves? Is that a security risk?

We expect that users are denied access to cluster-scoped resources, like the B and BA. If a user learned the name of an existing BA, could they "bind" it in their BAR? If we have a BAR admission controller we can fail the create or empty the field. If it is expected to only be set by COSI, we should move it to status.

~L348:
"Contents of the ConfigMap that the policyActionsConfigMap field in the BucketAccessClass refers to. A sample value of this field could look like:"
You are showing a list example, not a map - what are the keys? Is this a map type, a list type, or a single giant string?

It is a json blob whose format is known to the driver.

~L378:
"policyActionsConfigMap: (required) a reference to a ConfigMap"
This is starting to feel very baroque. Why do I need a ConfigMap? Why can't I embed the string-string map directly into the class?

Our thinking is that the values may be structured strings, like JSON, which do not fit well in a string-string map. A ConfigMap eliminates the need to worry about value formatting and may be shareable among multiple BACs.
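For illustration, a ConfigMap along these lines might look like the sketch below (the object name, namespace, and key are hypothetical, not taken from the KEP; the policy contents mirror the sample blob discussed later in this thread):

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: example-bucket-policy-actions   # hypothetical name
  namespace: cosi-system                # assumed provisioner namespace
data:
  # a single key whose value is a provisioner-defined JSON policy blob
  policy.json: |
    [
      {"Effect": "Allow", "Action": "s3:PutObject", "Resource": "arn:aws:s3:::profilepics/*"},
      {"Effect": "Allow", "Action": "s3:GetObject", "Resource": "arn:aws:s3:::profilepics/*"},
      {"Effect": "Deny",  "Action": "s3:*", "NotResource": "arn:aws:s3:::profilepics/*"}
    ]
```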

The config map itself is mutable - if I change that, will controllers rectify and apply the changes to all existing BAs? I think the complexity here is a bit overwhelming - I'm not sure I could draw you the constellation of resources needed to get access to a bucket. I'd suggest to see what we can do to reduce moving parts - is there any denormalization that can be done? You describe a lot of 1:1 relationships...

@thockin
Member

thockin commented Feb 9, 2021

Even just more closely considering Bucket mutation reveals challenges in our 1:1 mapping of BR to B, where if B1 is changed and B2-4 are clones of B1, do we reflect the changes in B2-4? Is there a primary B and the clones are not mutable?

Excellent point.

If I interpret this right, it indicates the "I am taking ownership of a bucket" model above, right?

Well, maybe? We haven't defined the concept of a main B vs. cloned B. A brownfield BR, meaning it references an existing B, does not imply ownership IMO. What is your reasoning here?

I'm going to say k8s-bucket to mean the k8s resource and s3-bucket
to mean the underlying storage bucket in whatever storage system.

If 1 BucketRequest maps to 1 k8s-Bucket, then it feels like that BR "owns" that
k8s-bucket. But as you point out, a single s3-bucket can be pointed to by
multiple k8s-buckets. That really says "borrowed", not owned. The problem I
have with this is that you still have fields like anonymousAccessMode and
retentionPolicy at the k8s-bucket level. (which, to be fair, you just said too
:)

This feels like it's not going to work, or it's going to end in fireworks (not
the good kind).

If there's going to be ANY sort of access-policy in the API, the data model
should not leave room for ambiguity.

This is still my primary concern. There's not a choke-point for ownership and this
green/brown approach tries to shoe-horn both owned and loaned buckets in the
same API. I think you need a single canonical owner resource for each
s3-bucket, and then N references to that.

E.g. Make k8s-bucket a 1:1 relationship with s3-bucket. The k8s-bucket
resource owns access control and lifecycle (including a notion of "retain upon
delete", like PV). Then have N BucketUser resources that reference the Bucket
(but do not own it). If you make Bucket namespaced, then you get to offer
self-service (user can create and own a Bucket) and central ownership (user can
just indicate desire to use an extant bucket). You have to deal with how a
user can be prevented from claiming any random bucket they want, and so on.

This feels, with only 10 minutes of thought, cleaner and less ambiguous? Don't
take my ideas per se, just seeds for modelling.
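A rough sketch of that shape, purely as a seed for modelling (the BucketUser kind and every field below are hypothetical, not part of the current KEP):

```yaml
# Hypothetical: a namespaced Bucket that is the single canonical owner of one s3-bucket.
apiVersion: objectstorage.k8s.io/v1alpha1
kind: Bucket
metadata:
  name: profile-pics
  namespace: team-a
spec:
  bucketClassName: s3-standard   # hypothetical class reference
  retentionPolicy: Retain        # lifecycle owned here, like a PV reclaim policy
---
# Hypothetical: a reference-only resource for a borrower of that bucket.
apiVersion: objectstorage.k8s.io/v1alpha1
kind: BucketUser
metadata:
  name: analytics-reader
  namespace: team-b
spec:
  bucketRef:
    name: profile-pics
    namespace: team-a
```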

~L228:
"bucketRequest: (set by COSI for greenfield, and ignored for brownfield)"
If it is ignored for brownfield, and you assume a 1:1 mapping (see above) how do you enforce that a user does not create N BRs that all point to the same Bucket?

I'll remove the phrase in parenthesis since it is misleading. What I was trying to describe is that COSI sets B.spec.bucketRequest when it creates a new B on behalf of a greenfield BR. But the admin does not need to populate B.spec.bucketRequest when she creates a brownfield B.

I'm still confused. You assert elsewhere it's 1:1 BR:B but AFAICT nothing
enforces that, which means it will end up N:1 BR:B.

~L229:
"allowedNamespaces: a list of namespaces that are permitted to either create new buckets or to access existing buckets."
How does a field on a bucket govern creating new buckets?

It restricts namespaces a BR can be created or accessed from. It is an property that should be mutable. The idea was to make it easier and more flexible for the admin to control which projects (ns) can create new buckets. But, maybe RBAC can do this and we should not define another policy related field.

I'm still confused. This is Bucket.allowedNamespaces. I can see how it
limits the namespaces that are allowed to BR-bind to this particular bucket
(though you should really have a namespace selector now that we will have
magic labels for namespace name).

I don't see how "namespaces that are permitted to ... create new buckets" is
related to a field on a single Bucket. Should this say "bucket request" ?
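(For reference, the "magic labels" are the automatic kubernetes.io/metadata.name namespace
labels; a selector-based field could hypothetically look like the following, where
allowedNamespaceSelector is a made-up field name:)

```yaml
spec:
  allowedNamespaceSelector:            # hypothetical replacement for allowedNamespaces
    matchExpressions:
    - key: kubernetes.io/metadata.name # automatic label carrying the namespace name
      operator: In
      values: ["team-a", "team-b"]
```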

~L311:
"bucketAccessName: name of the bound cluster-scoped BucketAccess instance. Set by COSI."
What stops a user from setting this themselves? Is that a security risk?

We expect that user are denied access to cluster-scoped resources, like the B and BA. If a user learned the name of an existing BA could they "bind" it in their BAR? If we have a BAR admission controller we can fail the create or empty the field. If it is expected to only be set by COSI we should move it to status.

This field is BucketAccessRequest.bucketAccessName. You have to assume
that users can write to all fields of their spec. There's a debate on sig-arch
right now (spawned from reading this KEP :) about how to handle bindings and
allocations safely. A big part of PV/PVC's ballet is this.

~L348:
"Contents of the ConfigMap that the policyActionsConfigMap field in the BucketAccessClass refers to. A sample value of this field could look like:"
You are showing a list example, not a map - what are they keys? Is this a map type, a list type, or a single giant string?

It is a json blob whose format is known to the driver.

What I mean is that a ConfigMap is (duh) a map, but the example shown here is
either a list of strings or a single string, encoding a list of structs. But
it is not a map. So either the "sample value" is the value of a single map key
in the CM or it is not representative of a real CM. Does that question make
sense? Since you didn't put a type here, I have to guess. If you really read
the value from a CM it would look like

{
  "key": "[{\"Effect\":\"Allow\",\"Action\":\"s3:PutObject\",\"Resource\":\"arn:aws:s3:::profilepics/*\"}, {\"Effect\":\"Allow\",\"Action\":\"s3:GetObject\",\"Resource\":\"arn:aws:s3:::profilepics/*\"}, {\"Effect\":\"Deny\",\"Action\":\"s3:*\",\"NotResource\":\"arn:aws:s3:::profilepics/*\"}]"
}

@jeffvance
Contributor Author

@thockin
...

I'm going to say k8s-bucket to mean the k8s resource and s3-bucket to mean the underlying storage bucket in whatever storage system.

If 1 BucketRequest maps to 1 k8s-Bucket, then it feels like that BR "owns" that k8s-bucket. But as you point out, a single s3-bucket can be pointed to by multiple k8s-buckets. That really says "borrowed", not owned. The problem I have with this is that you still have fields like anonymousAccessMode and retentionPolicy at the k8s-bucket level. (which, to be fair, you just said too :)

I like the idea of one k8s Bucket abstracting one backend bucket (in fact we have proposed this in way past drafts). Then altering bucket properties boils down to editing the matching k8s bucket instance. No clones. No propagating changes. Single source of truth (within k8s).

A user needs to be able to trigger the creation of a k8s Bucket and the BucketRequest serves this purpose. A backend bucket can be shared and this is abstracted via BucketAccess instances. BA creation is triggered by a user's BucketAccessRequest.
It seems that for brownfield we don't need BRs (just the BAR), except we don't want to require the user to know the name of the COSI-created k8s Bucket since its name is random. Thus, the BAR refs the BR instead of the k8s Bucket, resulting in the need for both a BR and a BAR for brownfield.

will add more tomorrow...

This feels like it's not going to work, or it's going to end in fireworks (not the good kind). If there's going to be ANY sort of access-policy in the API, the data model should not leave room for ambiguity.

This is still my primary concern. There's not a choke-point for ownership and this green/brown approach tries to shoe-horn both owned and loaned buckets in the same API. I think you need a single canonical owner resource for each s3-bucket, and then N references to that.

E.g. Make k8s-bucket a 1:1 relationship with s3-bucket. The k8s-bucket resource owns access control and lifecycle (including a notion of "retain upon delete", like PV). Then have N BucketUser resources that reference the Bucket (but do not own it). If you make Bucket namespaced, then you get to offer self-service (user can create and own a Bucket) and central ownership (user can just indicate desire to use an extant bucket). You have to deal with how a user can be prevented from claiming any random bucket they want, and so on.

This feels, with only 10 minutes of thought, cleaner and less ambiguous? Don't take my ideas per se, just seeds for modelling.

~L228:
"bucketRequest: (set by COSI for greenfield, and ignored for brownfield)"
If it is ignored for brownfield, and you assume a 1:1 mapping (see above) how do you enforce that a user does not create N BRs that all point to the same Bucket?

...What I was trying to describe is that COSI sets B.spec.bucketRequest when it creates a new B on behalf of a greenfield BR. But the admin does not need to populate B.spec.bucketRequest when she creates a brownfield B.

I'm still confused. You assert elsewhere it's 1:1 BR:B but AFAICT nothing enforces that, which means it will end up N:1 BR:B.

Does cosi detect that a given k8s B is not referenced by more than one BR? AFAIK, today the answer is no, and this question highlights yet another challenge with our 1:1 mapping.

~L229:
"allowedNamespaces: a list of namespaces that are permitted to either create new buckets or to access existing buckets."
How does a field on a bucket govern creating new buckets?

It restricts namespaces a BR can be created or accessed from. It is an property that should be mutable. The idea was to make it easier and more flexible for the admin to control which projects (ns) can create new buckets. But, maybe RBAC can do this and we should not define another policy related field.

I'm still confused. This is Bucket.allowedNamespaces. I can see how it limits the namespaces that are allowed to BR-bind to this particular bucket (though you should really have a namespace selector now that we will have magic labels for namespace name). I don't see how "namespaces that are permitted to ... create new buckets" is related to a field on a single Bucket. Should this say "bucket request" ?

OK I understand now. Bucket.spec.allowedNamespaces is a copy of the bucket class's allowedNamespaces, with the idea being that this field in the k8s B could be mutated to remove or add namespaces. However, this may be a carryover from an earlier draft where we did not have a 1:1 BR:B mapping. Or, allowedNamespaces could be checked by grant-access code for brownfield use cases. The kep is less than clear about this. @wlan0 ?

~L311:
"bucketAccessName: name of the bound cluster-scoped BucketAccess instance. Set by COSI."
What stops a user from setting this themselves? Is that a security risk?

We expect that user are denied access to cluster-scoped resources, like the B and BA. If a user learned the name of an existing BA could they "bind" it in their BAR? If we have a BAR admission controller we can fail the create or empty the field. If it is expected to only be set by COSI we should move it to status.

This field is BucketAccessRequest.bucketAccessName. You have to assume that users can write to all fields of their spec. There's a debate on sig-arch right now (spawned from reading this KEP :) about how to handle bindings and allocations safely. A big part of PV/PVC's ballet is this.

I agree with this assumption. It seems clear that if a binding resource name is for info only then it should reside in status. If this name can be set by the user to cause bindings then it should live in spec and we have risks. I need Sid (@wlan0) to chime in here.

~L348:
"Contents of the ConfigMap that the policyActionsConfigMap field in the BucketAccessClass refers to. A sample value of this field could look like:"
You are showing a list example, not a map - what are they keys? Is this a map type, a list type, or a single giant string?
It is a json blob whose format is known to the driver.

What I mean is that a ConfigMap is (duh) a map, but the example shown here is either a list of strings or a single string, encoding a list of structs. But it is not a map. So either the "sample value" is the value of a single map key in the CM or it is not representative of a real CM. Does that question make sense? Since you didn't put a type here, I have to guess. If you really read the value from a CM it would look like

{
  "key": "[{\"Effect\":\"Allow\",\"Action\":\"s3:PutObject\",\"Resource\":\"arn:aws:s3:::profilepics/*\"}, {\"Effect\":\"Allow\",\"Action\":\"s3:GetObject\",\"Resource\":\"arn:aws:s3:::profilepics/*\"}, {\"Effect\":\"Deny\",\"Action\":\"s3:*\",\"NotResource\":\"arn:aws:s3:::profilepics/*\"}]"
}

I believe the blob is the value of a key not shown in the example, so your k-v pair above is more accurate. But again I need to defer to @wlan0 for his thoughts on this.

@xing-yang
Contributor

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label May 11, 2021
@wlan0 wlan0 force-pushed the 2021-kep-update1 branch from f141f35 to 28aa7f8 Compare May 11, 2021 23:52
@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label May 11, 2021
@wlan0
Member

wlan0 commented May 11, 2021

@xing-yang @krishchow I've updated the KEP based on your latest comments. PTAL

@xing-yang
Contributor

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label May 12, 2021
@ehashman ehashman left a comment

/assign
for PRR

The questionnaire hasn't been filled out so I cannot review yet.

@wlan0 wlan0 force-pushed the 2021-kep-update1 branch from 28aa7f8 to bb2870d Compare May 12, 2021 21:45
@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label May 12, 2021
@wlan0 wlan0 force-pushed the 2021-kep-update1 branch from bb2870d to 7d2d3bf Compare May 12, 2021 21:49
@wlan0 wlan0 force-pushed the 2021-kep-update1 branch from 7d2d3bf to 1f65546 Compare May 12, 2021 21:56
@ehashman ehashman left a comment

Left a few comments. From my reading, this feature sounds like it is 100% out of tree. If that's the case, PRR is optional. Please confirm.


### Upgrade / Downgrade Strategy

No changes are required on upgrade to maintain previous behaviour.
Member

Upgrade/downgrade in this case I believe would refer to the addition or removal of the CRDs to the cluster so the API will support them. Can you talk a little about how that is intended to work?


### Version Skew Strategy

COSI is out-of-tree, so version skew strategy is N/A
Member

There appears to be a node adapter component, does that have any dependencies on a given kubelet version?


Yes. Delete the resources created when installing COSI

###### What happens if we reenable the feature if it was previously rolled back?
Member

This question should be answered.

conversion tests if API types are being modified.
-->

N/A since we are only targeting alpha for this Kubernetes release
Member

This is referring to a test of your rollout/rollback strategy (i.e. create the resources, delete the resources). This testing is applicable.


No

### Scalability
Member

This section is encouraged for alpha.

@wlan0
Member

wlan0 commented May 12, 2021

Left a few comments. From my reading, this feature sounds like it is 100% out of tree. If that's the case, PRR is optional. Please confirm.

@ehashman Yes, this is 100% out-of-tree

@ehashman ehashman left a comment

/approve
for PRR

PRR optional, feature is out of tree.

@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: bswartz, ehashman, jeffvance
To complete the pull request process, please ask for approval from saad-ali after the PR has been reviewed.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@xing-yang
Contributor

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label May 13, 2021
@thockin thockin left a comment

First: this was not a super-complete API review. If we proceed with this, we should do it in more depth.

The user-stories are pretty vague, especially the admin one. This is a very expansive API, which largely puts the cluster admin into the middle. There has to be more detail about what admin things are the requirements. I can derive requirements from the API, but I can't tell if they are real or not.

What this is really missing most is a handful of complete examples that speak to the user-stories. As an admin, show me EXACTLY the yaml I would write to enable GCS or S3 as a class. As an end-user, show me EXACTLY the yaml I would write to request a greenfield bucket and to attach to a brownfield bucket, etc.

I have to admit, this model still feels very weird to me. The frequent "does not apply to brownfield" sorts of statements and bucket-or-bucketRequest semantics feel like we're jamming two ideas into one API.

I'm still confused about the duality of green/brown. This seems hostile to things like git-ops. I should be able to check my config into git, and then re-apply it should something go wrong. In this world, if I want to move between clusters, my YAML has to change?

I don't really know how to convince myself this is "safe". How do I not end up with users stealing access to buckets that are not theirs? Examples showing those checks and blocks would be super helpful.

I know this is a lot of comments - some small and some not. I'll try to catch up with you all tomorrow.

@@ -110,21 +123,20 @@ When a workload (app pod, deployment, configs) is moved to another cluster, as l

+ _backend bucket_ - any bucket provided by the object store system, completely separate from Kubernetes.
+ _brownfield bucket_ - an existing backend bucket.
+ _Bucket (Bucket Instance)_ - a cluster-scoped custom resource referenced by a `BucketRequest` and containing connection information and metadata for a backend bucket.
+ _Bucket (Bucket instance)_ - a cluster-scoped custom resource referenced by a `BucketRequest` and containing connection information and metadata for a backend bucket.
+ _BucketAccess (BA)_ - a cluster-scoped custom resource for granting bucket access.
Member

Above says "Bucket access lifecycle management is not within the scope of this KEP" ?

Member

That is a mistake. We'll fix it

Member

I don't see that pushed - am I just impatient?

+ _BucketAccess (BA)_ - a cluster-scoped custom resource for granting bucket access.
+ _BucketAccessClass (BAC)_ - a cluster-scoped custom resource containing fields defining the provisioner and a ConfigMap reference where policy is defined.
+ _BucketAccessRequest (BAR)_ - a user-namespaced custom resource representing a request for access to an existing bucket.
+ _BucketClass (BC)_ - a cluster-scoped custom resource containing fields defining the provisioner and an immutable parameter set for creating new buckets.
+ _BucketRequest (BR)_ - a user-namespaced custom resource representing a request for a new backend bucket, or access to an existing bucket.
+ _BucketRequest (BR)_ - a user-namespaced custom resource representing a request for a new backend bucket.
+ _COSI_ - Container _Object_ Store Interface, modeled after CSI.
+ _cosi-node-adapter_ - a pod per node which receives Kubelet gRPC _NodeStageVolume_ and _NodeUnstageVolume_ requests, ensures the target bucket has been provisioned, and notifies the kubelet when the pod can be run.
Member

This feels like overkill. As I read this I can't help wonder why the pod can't just retry? It's not like it needs to be mounted across mount namespaces. Failure is obvious.

Member

We need a mechanism to delay the pod from starting until the bucket is available

  • it improves the overall user experience
  • It puts all the information in one place for the pod to utilize, instead of expecting the pod to figure out how to collect credentials and bucket information for a COSI provisioned bucket

Member

I think the "delay" semantic is dubious - apps already have to handle failures, so this gives them a false sense of stability.

The accumulation of coordinates and credentials is a reasonable answer, but this feels like a LOT of machinery for that. If I were trying to reduce scope I'd maybe suggest that the BucketAccess contain the name of the secret which COSI will populate with well-known keys, and tell the app to use the existing secrets mechanisms. Am I still over-reducing?

Member

As discussed on slack: I think all the mechanics you need exist in secrets already.

I create a BAR and set secretName: foo. I run my pod with a secret volume for foo. Kubelet sees that foo doesn't exist and waits until it does. COSI waits for the bucket, then creates foo. Kubelet can now run the pod. My pod gets the bucket info.

You can rip out 50% of this KEP
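A sketch of that flow (the secretName field on the BAR is the hypothetical piece; the Secret volume and the kubelet waiting for a missing Secret are standard Kubernetes behaviour):

```yaml
# Hypothetical BucketAccessRequest naming the Secret COSI should populate.
apiVersion: objectstorage.k8s.io/v1alpha1
kind: BucketAccessRequest
metadata:
  name: my-bar
  namespace: team-a
spec:
  secretName: foo              # hypothetical field: COSI creates this Secret once access is granted
---
# Ordinary pod: kubelet will not start it until Secret "foo" exists.
apiVersion: v1
kind: Pod
metadata:
  name: my-app
  namespace: team-a
spec:
  containers:
  - name: app
    image: example.com/my-app:latest   # placeholder image
    volumeMounts:
    - name: bucket-creds
      mountPath: /var/run/bucket
  volumes:
  - name: bucket-creds
    secret:
      secretName: foo          # bucket coordinates and credentials land here
```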

Member

and if not, you can open a followup KEP to add CSI integration :)


```yaml
apiVersion: cosi.io/v1alpha1
apiVersion: objectstorage.k8s.io/v1alpha1
Member

object.storage.k8s.io ?

Member

just a suggestion - up to SIG to govern the sub-space

status:
bucketAvailable: [8]
bucketName: [5]
bucketAvailable: [6]
Member

This should probably be an enum-string of status, including Available and Provisioning and "" (or something).
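Hypothetically, something like:

```yaml
status:
  bucketState: Provisioning   # enum-valued: "", Provisioning, or Available (field name made up)
```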

Member

If this is all the sorts of nits I have, I won't block the KEP for it :)

Member

I don't see this fixed?

status:
bucketAvailable: [8]
bucketName: [5]
bucketAvailable: [6]
Member

Why is this in spec? Anything in spec can be set by the user.

Member

Nor this?

- cosi.io/finalizer [3]
objectstorage.k8s.io/provisioner: [2]
finalizers: [3]
- objectstorage.k8s.io/bar-protect
Member

same comment as above about this pattern

```

1. `name`: when created by COSI, the `BucketAccess` name is generated in this format: _"ba-"+uuid_. If an admin creates a `BucketAccess`, as is necessary for driverless access, they can use any name. The uuid is unique within a cluster.
1. `name`: generated in this format: _"ba-"+uuid_. The uuid is unique within a cluster.
Member

Why are we speccing this so precisely? Just say "is unique" - we have generateName for a reason.
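(generateName being the standard mechanism, e.g.:)

```yaml
metadata:
  generateName: ba-   # API server appends a random suffix such as "ba-7x9qk"
```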

```

1. `provisioner`: (optional) the name of the driver that `BucketAccess` instances should be managed by. If empty then there is no driver/sidecar, thus the admin had to create the `Bucket` and `BucketAccess` instances.
1. `policyActionsConfigMap`: (required) a reference to a ConfigMap that contains a set of provisioner/platform-defined policy actions for bucket access. It is seen as typical that this config map's namespace is the same as the provisioner's. In other words, a namespace that has locked-down RBAC rules to prevent modification of this config map.
Member

I don't understand why we are out-of-lining this content to a config map?

Member

The user experience with config maps is better because the access specification format for different protocols differ, and config map data is just much more readable as compared to say, an escaped string or base64 encoded field.

Member

Why is it better to pop out to a ConfigMap than to just include a list of values here? They are both YAML in the end, but now the user has to create ANOTHER object (and clean it up one day) and the poor slob who has to figure out how your cluster works after you quit gets to bounce around between types.

Also a configmap is a map - this doesn't seem to need the keys at all? []string seems more appropriate to me?

@wlan0
Member

wlan0 commented May 13, 2021

The user-stories are pretty vague, especially the admin one. This is a very expansive API, which largely puts the cluster admin into the middle. There has to be more detail about what admin things are the requirements. I can derive requirements from the API, but I can't tell if they are real or not.

What this is really missing most is a handful of complete examples that speak to the user-stories. As an admin, show me EXACTLY the yaml I would write to enable GCS or S3 as a class. As an end-user, show me EXACTLY the yaml I would write to request a greenfield bucket and to attach to a brownfield bucket, etc.

Here are the user stories/workflows that address the above two issues: https://github.com/kubernetes/enhancements/blob/master/keps/sig-storage/1979-object-storage-support/README.md#workflows

I have to admit, this model still feels very weird to me. The frequent "does not apply to brownfield" sorts of statements and bucket-or-bucketRequest semantics feel like we're jamming two ideas into one API.

I'm still confused about the duality of green/brown.

The brownfield concept exists only to allow buckets created out of band to be made available via COSI wiring. In the long run, we expect all buckets to be Greenfield.

I should be able to check my config into git, and then re-apply it should something go wrong. This seems hostile to things like git-ops. In this world,

The brownfield concept is not primarily an admin-created bucket. It is a representation of a pre-existing bucket.

if I want to move between clusters, my YAML has to change?

The YAML will remain the exact same even when you move across clusters

I don't really know how to convince myself this is "safe". How do I not end up with users stealing access to buckets that are not theirs? Examples showing those checks and blocks would be super helpful.

Buckets are as safe as RBAC allows us to be. We designed COSI to give workloads access to all buckets in the same namespace.

@thockin
Member

thockin commented May 13, 2021

The user-stories are pretty vague, especially the admin one. This is a very expansive API, which largely puts the cluster admin into the middle. There has to be more detail about what admin things are the requirements. I can derive requirements from the API, but I can't tell if they are real or not.

What this is really missing most is a handful of complete examples that speak to the user-stories. As an admin, show me EXACTLY the yaml I would write to enable GCS or S3 as a class. As an end-user, show me EXACTLY the yaml I would write to request a greenfield bucket and to attach to a brownfield bucket, etc.

Here are the user stories/workflows that address the above two issues: https://github.com/kubernetes/enhancements/blob/master/keps/sig-storage/1979-object-storage-support/README.md#workflows

those are excruciatingly detailed in the technical dimension and nearly devoid
of the product dimension. What problem is the user expressing, and what hoops
are we making them jump thru to solve it?

I have to admit, this model still feels very weird to me. The frequent "does not apply to brownfield" sorts of statements and bucket-or-bucketRequest semantics feel like we're jamming two ideas into one API.

I'm still confused about the duality of green/brown.

The brownfield concept exists only to allow buckets created out of band to be made available via COSI wiring. In the long run, we expect all buckets to be Greenfield.

Well that's not the answer I expected. The git-ops thing is front-of-mind for
me. Do I commit my BucketRequest to git? Or just the BAR?

Specifically, this pattern has been done before and it's a persistent sore
spot. I create a service, and I get allocated a load-balancer IP. Then my
cluster explodes and I restore from git, and I get a DIFFERENT address. Oh
crap. So we allow users to specify their own IP in the Service. But they
don't know that IP a priori. So instead the pattern is create an LB, get an
IP, update the service with that IP, check that into git. Puke.

You seem to have the same pattern with the bucket-instance. It's a pet. But
now your pet is a whole resource.
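For concreteness, the Service version of that pattern ends up pinned like this (real fields, illustrative values):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: my-svc
spec:
  type: LoadBalancer
  loadBalancerIP: 35.1.2.3   # the allocated address, copied back into the spec and committed to git
  selector:
    app: my-app
  ports:
  - port: 80
```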

I should be able to check my config into git, and then re-apply it should something go wrong. This seems hostile to things like git-ops. In this world,

The brownfield concept is not primarily a admin created bucket. It is a representation of a pre-existing bucket.

Understood. But I don't see how the pieces fit - read my next comment below.

if I want to move between clusters, my YAML has to change?

The YAML will remain the exact same even when you move across clusters

Explain to me how a user creates a git repo of an app and a bucket, checks that
into git, applies it to a cluster, tears down that cluster and brings up a new
one, and then applies their config again - without throwing away all of their
data?

I don't really know how to convince myself this is "safe". How do I not end up with users stealing access to buckets that are not theirs? Examples showing those checks and blocks would be super helpful.

Buckets are as safe as RBAC allows us to be. We designed COSI to give workloads access to all buckets in the same namespace.

I mean that the doc doesn't show me how I can't just BAR my way into your
buckets. This goes back to my older comment about Bucket being namespaced.
The ownership is fuzzy to me.

privateKeyName:
projectId:
serviceAccount:
parameters: [11]
status:
bucketAvailable: [12]
bucketID: [13]


I don't see this documented below? Did I miss it?

@jeffvance
Contributor Author

Thanks to all for your reviews. I am closing this PR for now. Likely a new PR will draw on the good suggestions from this attempt.

@jeffvance jeffvance closed this Jun 11, 2021