-
Notifications
You must be signed in to change notification settings - Fork 1.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
KEP-4292: Add custom profiling in kubectl debug #4293
KEP-4292: Add custom profiling in kubectl debug #4293
Conversation
ardaguclu
commented
Oct 13, 2023
- One-line PR description: Add custom profiling support in kubectl debug
- Issue link: Custom profile in kubectl debug #4292
- Other comments:
6fd4682
to
d1b6afc
Compare
/cc @verb |
I believe this KEP is ready for review for 1.30. |
2d2da9e
to
a370516
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just wanted to mention, that i really like the idea. I've already hacked a bit in kubectl
to have volumeMounts
in my ephemeral Containers.
I've also started with an inital version of an own kubectl
plugin to have some kind of "profiles" (which can be shared within my team/community) to have dedicated debug profiles for different kind of applications.
My current POC make use of a config-file where i've defined multiple "profiles" which i
can refer to when i want to debug a container:
profiles:
application-1:
volumeMounts: # if nil, use all volume mounts from target pod
- name: database
mountPath: /opt/database
container: server
image: my-special-debug-container:v0.0.4
application-2:
volumeMounts: {}
container: sidecar
image: another-debug-container:latest
|
||
As a restricted user, I'd be able to debug a pod to simulate the exact environment | ||
of the problematic pod which has resource requests and limits. | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe it's also worth to mention the ability to have an ephemeral Container beside a container, build from scratch/distroless/<container without shell>
where not all the handy tools (curl
, wget
, openssl
, ...) are installed.
a370516
to
944d657
Compare
rollout. Similarly, consider large clusters and how enablement/disablement | ||
will rollout across nodes. | ||
--> | ||
No |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A bit more details, please.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ditto
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated. Please let me know what you think.
checking if there are objects with field X set) may be a last resort. Avoid | ||
logs or events for this purpose. | ||
--> | ||
By checking the value of `KUBECTL_DEBUG_CUSTOM_PROFILE` environment variable |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I doubt the cluster operator can find that useful. It's more of a user setting, than a cluster one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can add a note, that this command uses GA-ed APIs, so it'll be hard to distinguish between this command invocations and not.
/label tide/merge-method-squash |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/lgtm
/approve
from sig-cli pov
Copying pod and node debugging use built-in API endpoints. Ephemeral container functionality was | ||
promoted to stable in 1.25. Besides custom profiling feature only relies on | ||
`corev1.Container` type's existence differently from `kubectl debug` which is also built-in. | ||
It is zero probability that this feature touches unavailable API endpoints in regard to |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: I'd stay way from zero probability, but rather use something lighter, like very low probability, low chances, etc.
1.25 is not that old, and based on the results from past PRR surveys show that a lot of users still keep older versions of k8s. So it's possible that someone might use older than 1.25 k8s, just very small 😉
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indeed you are right. I'll update this statement based on your suggestions. Thanks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[PRR shadow]
PRR looks OK to alpha, left some minor non-blocking comments.
- Condition name: | ||
- Other field: | ||
- [X] Other (treat as last resort) | ||
- Details: Checking the value of `KUBECTL_DEBUG_CUSTOM_PROFILE` environment variable |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
They can run a test debug and see if their profile is respected in the resulting container?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for review @BenTheElder. I liked your comment above ^ and have added it to this section.
You can take a look at one potential example of such test in: | ||
https://github.com/kubernetes/kubernetes/pull/97058/files#diff-7826f7adbc1996a05ab52e3f5f02429e94b68ce6bce0dc534d1be636154fded3R246-R282 | ||
--> | ||
In alpha phase, all the tests can be done only after enabling this feature. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This answer sort of reads like it missed the question (?), but this feature doesn't change behavior of stored objects anyhow.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, updated this answer to more align with the question. Please let me know what you think.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PRR looks good, feature looks useful :)
Many PRR questions are geared towards server side features, so, yeah, are not so applicable to CLI features.
/approve
# The milestone at which this feature was, or is targeted to be, at each stage. | ||
milestone: | ||
alpha: "v1.30" | ||
beta: "TBD" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you can just leave out the "TBD" ones.
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: ardaguclu, johnbelamaric, soltysh The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
/lgtm |