-
Notifications
You must be signed in to change notification settings - Fork 9
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement ExecutionHook CRDs API Types #5
Implement ExecutionHook CRDs API Types #5
Conversation
Thanks for your pull request. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA). 📝 Please follow instructions at https://git.k8s.io/community/CLA.md#the-contributor-license-agreement to sign the CLA. It may take a couple minutes for the CLA signature to be fully registered; after that, please reply here with a new comment and we'll verify. Thanks.
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here. |
dade382
to
839f8b9
Compare
/assign @xing-yang |
/approve |
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: ashish-amarnath, prydonius The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
bbd0220
to
fa8ef76
Compare
|
||
var ( | ||
// GroupVersion is group version used to register these objects | ||
GroupVersion = schema.GroupVersion{Group: "apps.k8s.io", Version: "v1alpha1"} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm checking whether it should be apps.k8s.io or app.k8s.io to be consistent with other CRDs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure. Quick pointer, the KEP also referred to this as apiVersion: apps.k8s.io/v1alpha1
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, apps.k8s.io looks right. Just want to double check as I found "app.k8s.io" is used for other CRDs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is "app.k8s.io" here: https://github.com/kubernetes-sigs/application/blob/master/pkg/apis/app/v1beta1/register.go#L34.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is apps.k8s.io
correct or should we change to app.k8s.io
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe that's why "app.k8s.io" is used for "Application" CRD, not "apps.k8s.io"? I don't know the reason, hence raising the question. Since this ExecutionHook repo belongs to sig-apps, I thought it would be better to stay in app.k8s.io for consistency.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, perhaps. Note that a SIG can and should have more than one API group, API groups are more fine grained than SIGs.
Are you sure apps is the right SIG for this and not node? This seems more connected to me with pods and runtimes than with replicasets and deployments. The existing exec
feature is SIG node.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, this is under sig-apps: https://github.com/kubernetes/community/blob/master/sigs.yaml#L194. The KEP was initially done in sig-storage. @saad-ali suggested that this should be in sig-apps because we need application specific knowledge to use the execution hooks to quiesce and unquiesce an application. Of course we should get input from sig-node as well. I believe there was someone from sig-node who reviewed the KEP.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great, just checking. lifecycle.apps.k8s.io
might be a good group name.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sounds good!
bc71ee4
to
35cd833
Compare
35cd833
to
5143e1b
Compare
Signed-off-by: Ashish Amarnath <[email protected]>
606932c
to
cdd9db1
Compare
// Any action may be triggered more than once but only the latest results will be logged in status. | ||
// As alpha feature, only ExecAction type in Handler will be support, not the HTTPGETAction or TCPSocketAction. | ||
// This is required. | ||
Action corev1.Handler `json:"action"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add protobuf part
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For Kubernetes clients for CRDs, the protbuf serialization tags are not used. Protobuf encoding are used in kubernetes only for internal types, mainly for efficiency. This is not extended for CRDs. For that reason, I have removed all protobuf tags.
Also See: kubernetes/kubernetes#55541
cdd9db1
to
a8e96b6
Compare
/assign @thockin @lavalamp |
8b269e1
to
c09f99c
Compare
Signed-off-by: Ashish Amarnath <[email protected]>
c09f99c
to
1908d09
Compare
// PodContainerNames lists the containers the ExecutionHook should be executed on in a Pod. | ||
type PodContainerNames struct { | ||
// This field is required | ||
PodName string `json:"podName"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesn't make a lot of sense to me, if your selector identifies e.g. all pods in a replica set, this still makes you specify each one by name (since the name has a random suffix).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Only one of "PodContainerNames" and "PodContainerSelector" will be specified but not both. So if the selector identifies all pods in the PodContainerSelector, this PodContainerNames won't be used. We need to add validation to make sure only one is specified.
I didn't realize that KEP got to the implementable stage. I recall having made significant feedback on it and I don't think I re-reviewed the changes. |
The PR merging the KEP & setting it to implementable has this comment specifically calling out the API as still needing to be nailed down: kubernetes/enhancements#705 (review) |
Hi @lavalamp , @thockin has approved the KEP for alpha. Before moving to Beta, we'll evaluate whether to move it to kubelet. See here: https://github.com/kubernetes/enhancements/blob/master/keps/sig-storage/20190120-execution-hook-design.md#risks-and-mitigations |
It is more or less up to you when you get an API review. If you get it for the alpha -> beta transition, fixing whatever comes up may involve a lot more work than if you get it now and address prior to implementation. I don't actually have time to do an API review right now, maybe @thockin does. Anyway, I just want to be clear that the KEP went to Since I'm not sure everyone is on the same page when it comes to API reviews, I'll add: the thing I try to do during API review is to make sure the feature's expression in the API composes well with the rest of the API surface area. This sometimes generates more and larger AI's than e.g. just asking for unions where there should be unions. |
Hi @lavalamp, if I remember correctly, I think Alternative option 1b has incorporated your suggestions during the reviews: https://github.com/kubernetes/enhancements/blob/master/keps/sig-storage/20190120-execution-hook-design.md#alternative-option-1b. There were many rounds of discussions on whether to do it in kubelet or use a CRD during the design. Eventually we decided to go with a CRD (the current main proposal) for alpha version, implement it and see how it works. Then evaluate it and move it to Kubelet when going from alpha to beta. So if it is okay with you and @thockin, we'd like to stay with this plan. Thanks! |
@xing-yang as long as we all have the same expectation of a somewhat arduous step in the future, that's all that I'm trying to accomplish here :) |
Understood:). |
To prep for this, I pulled up the KEP, and (surprise) I agree with my own comments there. I will try to make a quick pass on this, but I think what I was really hoping for was revisions to the KEP to nail down the API. |
// HookActionName is copied to ExecutionHookSpec by the controller such as | ||
// the Application Snapshot Controller. | ||
type ExecutionHookSpec struct { | ||
// INSERT ADDITIONAL SPEC FIELDS - desired state of cluster |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
probably want to remove meta-comments?
ActionName string `json:"actionName"` | ||
} | ||
|
||
// PodSelection contains two fields, PodContainerNamesList and PodContainerSelector, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This basically copies the type contents into a comment. Don't list the fields here. Just say that this struct represents a one-of, and define the defaults.
One-of types really want a discriminator field, e.g. "type=Selector"
// ExecutionHookSpec defines the desired state of ExecutionHook | ||
// HookActionName is copied to ExecutionHookSpec by the controller such as | ||
// the Application Snapshot Controller. | ||
type ExecutionHookSpec struct { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be called "PodAction" or maybe just "Action" or "Notification"
// asynchronously. If execution ordering is required, caller has to implement the logic and create | ||
// different hooks in order. | ||
// This field is required. | ||
PodSelection PodSelection `json:"podSelection"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I disagree with this structure. I said something about this in the KEP. Here's my thinking.
Ideally (IMO) HookAction eventually becomes a field inside each Pod.spec.containers[] - it defines what to do when that action is triggered. Likewise, this type (ExecutionHook) should be called PodAction or similar. It triggers an action on a set of Pods. It should not be spelling out which container because the action definition itself will be attached to a container.
If/when we carry this into Beta, you'll have to change all of this to a simple pod selector. Better to change it now. This can simply be selector
and be a normal pod selector. I'm not convinced we need specific pod names?
Once this is integrated to Pod, triggering an action "foo" can mean "all action specs named foo on any containers in the pod"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We did have this proposed as part of the LifeCycle struct inside a container as described in alternative option 1a and 1b in the KEP. That was actually the main proposal when the KEP was initially submitted. https://github.com/kubernetes/enhancements/blob/master/keps/sig-storage/20190120-execution-hook-design.md#alternative-option-1a
We also talked about the pros and cons of having kubelet or an external controller take care of these hooks:
https://github.com/kubernetes/enhancements/blob/master/keps/sig-storage/20190120-execution-hook-design.md#controller-handlings-for-option-1a-and-1b
If we decided to go with alternative 1a/1b or some variation of those, then I don't think we need this execution-hook repo as the handling of the hooks will be either in kubelet and/or in the application snapshot controller (when that's available).
After numerous discussions, however, the decision was to start with an alpha implementation with a hook CRD and an external controller to manage the hooks. This allows us to explore other features such as application snapshot which will use this hook for quiescing. We have stated in the KEP that we'll look at doing it in kubelet before moving to beta.
https://github.com/kubernetes/enhancements/blob/master/keps/sig-storage/20190120-execution-hook-design.md#risks-and-mitigations
"The security concern is that ExecutionHook controller has the authority to execute commands in any pods. For alpha and proof of concept, we propose to use external controller to handle executionhooks. But to move to beta and graduate as GA, we will evaluate it and move it to kubelet which already has the privilege to execute commands in pod/containers."
// It will be set to false if the action cannot be executed successfully after | ||
// ActionTimeoutSeconds passes. | ||
// +optional | ||
Succeed *bool `json:"succeed,omitempty"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
success
// If not set, it is nil, indicating Action has not started | ||
// If set, it means Action has started at the specified time | ||
// +optional | ||
Timestamp *metav1.Time `json:"timestamp,omitempty"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What does this actually mean? What guarantees can I infer from it?
What happens if I retry the action? Under what circumstances would a retry happen?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we want to know how long an action took from trigger to completion? I think we would..
// This is required | ||
ErrorType ErrorType `json:"errorType"` | ||
|
||
// Error message |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this the result from the action (e.g. exec stderr)? Or is it fixed from the controller?
// +optional | ||
Message *string `json:"message,omitempty"` | ||
|
||
// More detailed reason why error happens |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If any of this comes from user pod (e.g. stderr) it should be size-bounded and specified.
PodSelection PodSelection `json:"podSelection"` | ||
|
||
// Name of the HookAction. This is required. | ||
ActionName string `json:"actionName"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there any way to control how fast this happens across a set of pods? E.g. I have a deployment and I want this to wait 10 seconds between each pod?
// +kubebuilder:resource:path=executionhook,shortName=eh,scope=Namespaced,categories=executionhook | ||
// +kubebuilder:subresource:status | ||
|
||
// ExecutionHook is the Schema for the executionhook API |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Talk about guarantees of execution (at-least-once vs exactly-once) and retries, timeouts, idempotency, etc?
Issues go stale after 90d of inactivity. If this issue is safe to close now please do so with Send feedback to sig-testing, kubernetes/test-infra and/or fejta. |
/lifecycle frozen |
The This bot removes
You can:
Please send feedback to sig-contributor-experience at kubernetes/community. /remove-lifecycle frozen |
The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs. This bot triages issues and PRs according to the following rules:
You can:
Please send feedback to sig-contributor-experience at kubernetes/community. /lifecycle stale |
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs. This bot triages issues and PRs according to the following rules:
You can:
Please send feedback to sig-contributor-experience at kubernetes/community. /lifecycle rotten |
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs. This bot triages issues and PRs according to the following rules:
You can:
Please send feedback to sig-contributor-experience at kubernetes/community. /close |
@k8s-triage-robot: Closed this PR. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
Implement ExecutionHook CRDs API Types
KEP is at https://github.com/kubernetes/enhancements/blob/master/keps/sig-storage/20190120-execution-hook-design.md