KEP-4815: DRA Partitionable Devices #4874

klueska · 2024-09-25T11:19:59Z

One-line PR description: Adding new KEP for Partitionable Devices in DRA
Issue link: DRA: Add support for partitionable devices #4815
Other comments:

pohly · 2024-09-25T15:44:11Z

/cc

klueska · 2024-09-26T14:10:58Z

I haven't added any text yet (i plan to do that over the next few days), but I wanted to at least get the proposed API changes along with an example pushed so that those that are familiar with what we want to do here can start taking a look.

johnbelamaric

I took a quick look over this. If I understand correctly, you are allowing the creation of "abstract" devices which are then used as sort of "mixins" in the concrete devices via the "extends" attribute. This allows us to capture various aspects of the general shape of devices, and mix-and-match those aspects in the concrete devices.

The first ~1150 lines of the yaml capture those abstract devices, and the rest capture the many concrete instances across 4 GPUs. Do you have a "back-of-the-envelope" size calculation for the YAML similar to what we did here?

I don't love the "abstract" device thing. It's confusing, IMO. I would prefer a separate data type for the "shapes" rather than re-using "device". What would be the implications of that?

keps/sig-node/4815-dra-partitionable-devices/README.md

klueska · 2024-09-26T15:24:41Z

I took a quick look over this. If I understand correctly, you are allowing the creation of "abstract" devices which are then used as sort of "mixins" in the concrete devices via the "extends" attribute. This allows us to capture various aspects of the general shape of devices, and mix-and-match those aspects in the concrete devices.

Yes. That's the idea.

The first ~1150 lines of the yaml capture those abstract devices, and the rest capture the many concrete instances across 4 GPUs.

It's actually 8 GPUs.

Do you have a "back-of-the-envelope" size calculation for the YAML similar to what we did here?

The file size of the example I included here is 48K. I wil

I don't love the "abstract" device thing. It's confusing, IMO. I would prefer a separate data type for the "shapes" rather than re-using "device". What would be the implications of that?

It shouldn't matter since it is a requirement that all devices included in "extends" be abstract. I actually find it fairly intuitive to categorize them as devices (but mark them as abstract) since they can (and by definition do) include all of the same fields as a device. That said, I'd also be fine moving them under DeviceExtensions as a new DeviceShape or DeviceTemplate type.

klueska · 2024-09-27T09:20:27Z

Based on @johnbelamaric's comments I have updated this to include the explicit notion of a mixin. I am happy with the change.

Text coming throughout the day and over the weekend...

johnbelamaric

I like the "mixin" a lot more, it's easier to understand the relationships between things. I do wonder though if this is just a new DeviceType rather than an extension of BasicDevice.

keps/sig-node/4815-dra-partitionable-devices/README.md

pohly · 2024-09-30T12:46:00Z

keps/sig-node/4815-dra-partitionable-devices/README.md

+      - name: mig-1g.5gb
+      - name: memory-slices-0
+      consumesCapacityFrom:
+      - name: gpu-0


You already said that you want to change it, so perhaps this will become clear soon, but let me call it out explicitly here: it's not obvious to me what this gpu-0-mig-1g.5gb-0 device consumes from gpu-0. Presumably any capacity defined for the device (whether it's in the device itself or in a mixin) gets consumed?

Yeah, as it's defined right now, all capacity must be consumed from somewhere if this field is populated. There is no way to list capacities particular to this device that aren't consumed from elsewhere.

That said, there is (technically) nothing preventing you from listing yourself as a source here. That way you would pull all resources available on GPU 0 from there first, falling back to yourself for those that are only specific to you.

That said, there is (technically) nothing preventing you from listing yourself as a source here.

Hmm, isn't consumption said to be recursive? Referencing yourself would lead to infinite recursion.

Which might be an important point: allowing a driver to publish something that can lead to infinite recursion should be prevented by validation.

This is related to @thockin's comment here:
https://github.com/kubernetes/enhancements/pull/4874/files#r1795977481

pohly · 2024-09-30T12:52:11Z

keps/sig-node/4815-dra-partitionable-devices/README.md

+		Includes[]DeviceMixinRef
+
+		// ConsumesCapacityFrom defines the set of devices where any capacity
+		// consumed by this device should be pulled from. This applies recursively.


This "pulled from" implies that there must be a device that can be pulled from. That device itself gets listed and thus can get allocated.

Suppose we have a flat list of identical devices which have some conflicting capacity requirements. We would need to define a fictional additional device that provides this scarce capacity. To Kubernetes, that device looks like something that it can allocate and a request for a "real" device will match it (same driver name, for example).

This is not insurmountable: the fictional device could have "fictional: true" attribute and the DeviceClass selector could exclude devices where that value is true. I just wanted to call this out. Might be worth adding to the KEP as an example.

Yes, this is one limitation of this approach (which I thought of but didn't call out explicitly here).

My original proposal (from many months ago) allowed for a separate place to list "shared" capacities that weren't attached to any particular device. Individual devices could then pull rom this list of shared capacities rather than pulling from another device explicitly.

So in the case of GPUs with MIG, we would have one shared capacity per full GPU from which both full GPUs and MIG devices would pull.

@thockin wasn't a fan of this approach though as he thought it seemed too abstract to have a "bag" of resources just declared in a resource slice but not explicitly tied to any particular physical device. At this point, I'm honestly OK with either approach, it's more about what is the common case, and I think needing to pull from some abstract source is actually the outlier, so it's fine to optimize for the more intuitive case.

Regarding having a fictional: true attribute. The first version of this KEP had something similar with an explicit field of abstract: true (not an attribute but an actual top-level field in the Device struct). Having an attribute is just as good (maybe better) though, so long as your DeviceClass includes the ability to filter out these types of devices from actually being allocated.

A smaller API is better, so I prefer the approach with the attribute. We just need to explain it because it wouldn't be obvious to driver authors how to do this.

This is related to @thockin's comment here:
#4874 (comment)

keps/sig-node/4815-dra-partitionable-devices/README.md

mrunalp · 2024-10-09T13:01:12Z

/approve
Leaving finalizing the shape of API and lgtm for API/scheduling.

johnbelamaric · 2024-10-09T15:05:43Z

/approve

alculquicondor

/approve
from SIG Scheduling

My only concern is from the observability PoV. The question "why did my pod not schedule" is now significantly harder to answer.
Some information about the usage of a ResourceSlice could be helpful, akin to the output of kubectl describe node:

Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  Resource           Requests         Limits
  --------           --------         ------
  cpu                284m (30%)       2045m (217%)
  memory             539316480 (18%)  4007175680 (136%)
  ephemeral-storage  0 (0%)           0 (0%)
  hugepages-1Gi      0 (0%)           0 (0%)
  hugepages-2Mi      0 (0%)           0 (0%)

k8s-ci-robot · 2024-10-09T17:53:01Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: alculquicondor, johnbelamaric, klueska, mrunalp

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

~~keps/prod-readiness/OWNERS~~ [johnbelamaric]
~~keps/sig-node/OWNERS~~ [johnbelamaric,mrunalp]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

mrunalp · 2024-10-09T19:07:56Z

Some information about the usage of a ResourceSlice could be helpful, akin to the output of kubectl describe node:

Maybe some summary with capacity per device and consumed / available will be useful 👍

klueska · 2024-10-09T19:12:52Z

Agreed, though that is out of scope for this KEP

kannon92 · 2024-10-09T19:15:15Z

From what I understand, this information would be available via a ResourceClaim. I don't think we expect users to look into ResourceSlices for scheduling decisions.

alculquicondor · 2024-10-09T19:30:28Z

Agreed, though that is out of scope for this KEP

Not necessarily. It could be left for beta, though.

pohly

Mostly just some nits, almost done.

keps/sig-node/4815-dra-partitionable-devices/README.md

pohly · 2024-10-10T10:38:11Z

keps/sig-node/4815-dra-partitionable-devices/README.md

+	//
+	// +optional
+	// +oneOf=deviceMixinType
+	Composite *CompositeDeviceMixin `json:"composite,omitempty"`


What other mixins do we expect?

Not sure -- @johnbelamaric had the same question. I'm fine removing this level of indirection if that is the consensus.

It just seemed symmetrical with how we treated devices, where a CompositeDevice can only pull mixins of type CompositeDeviceMixin. In the future, AdvancedDevice would only be allowed to pull mixins of type AdvancedDeviceMixin (which may actually be identical to CompositeDeviceMixin, but doesn't have to be).

That's true. Let's keep it. As it's not really a user-facing API (not written by a user, only read for debugging) the extra indirection doesn't hurt that much.

/lgtm

I am also skeptical that we need to extend here, we can look at it in code review

I'm not opposed to making it flat. It makes it asymmetrical with what we did for devices, but maybe that's OK.

keps/sig-node/4815-dra-partitionable-devices/README.md

Signed-off-by: Kevin Klues <[email protected]>

thockin

John pointed out that some of my comments were made elsewhere, so I apologize - the thread is super long and I thought it made sense to send these sooner rather than later.

I don't hate this. I actualy find the ideas rather elegant, but there are some things that nag at me, like some semi-unstated property that this asserts which may not be true, but I can't actively disprove.

It assumes that every resource is allocated hierarchically and that every level of that hierarchy is itself allocatable. Again, I can't find an example where it's not true, but it feels like an example MUST exist.

I am OK to pursue this KEP, so the fact that it merged already is OK.

thockin · 2024-10-10T19:02:08Z

keps/sig-node/4815-dra-partitionable-devices/README.md

+        matchAttribute: "gpu.nvidia.com/parentUUID"
+```
+
+This would result in the following set of non-overlapping partitions:


This case is interesting - it highlights that either users need to be careful to specify things or the specific-device selection logic needs to be smart about seemingly-equivalent choices.

The table below works, but it could just as easily have chosen:

+------------+------------+------------+------------+------------+------------+------------+------------+ | MemSlice 0 | MemSlice 1 | MemSlice 2 | MemSlice 3 | MemSlice 4 | Memslice 5 | MemSlice 6 | MemSlice 7 | +------------+------------+------------+------------+------------+------------+------------+------------+ | 3g.20gb | X | +------------+------------+------------+------------+------------+------------+------------+------------+ | X | X | 2g.10gb | X | +------------+------------+------------+------------+------------+------------+------------+------------+ | X | X | X | X | X | X | 1g.5gb | XXX | <--- Bad! The last 1g.5gb can't be allocated +------------+------------+------------+------------+------------+------------+------------+------------+

Right?

The scheduler might attempt to do this, but in doing so it would not be able to allocate all requests. I would then need to backtrack and try a different combination. Specific logic for doing this with MIG devices can be seen here: https://github.com/NVIDIA/mig-parted/blob/main/pkg/mig/config/config.go#L125. We would need to generalize this for the abstractions being introduced here.

thockin · 2024-10-10T19:20:04Z

keps/sig-node/4815-dra-partitionable-devices/README.md

+	//
+	// +optional
+	// +oneOf=deviceMixinType
+	Composite *CompositeDeviceMixin `json:"composite,omitempty"`


I am also skeptical that we need to extend here, we can look at it in code review

thockin · 2024-10-10T19:22:35Z

keps/sig-node/4815-dra-partitionable-devices/README.md

+	//
+	// +optional
+	// +oneOf=deviceType
+	Composite *CompositeDevice `json:"composite,omitempty"`


For later: I am not in love with "compsite" here - it sounds like "compound" which we used (informally) to describe a different idea.

Yeah. I originally called it PartitionableDevice but that put the emphasis on the the consumeCapacityFrom field and ignored the new mixin field.

I'd love it if we could just add both of these new fields to the existing BasicDevice type (since they are strictly extensions), but as @pohly keeps pointing out (because I keep forgetting) we can't extend the API in this way because it breaks downgrades. An old scheduler will not recognize the new fields and attempt to allocate a device based solely on the fields that it DOES know about. For mixins / attributes this is not too bad because selection would simply fail if the scheduler was not able to see the additional attributes that a mixin provides. Not honoring consumesCapacityFrom properly would be unacceptable though.

thockin · 2024-10-10T19:23:38Z

keps/sig-node/4815-dra-partitionable-devices/README.md

+	//
+	// The propertes of each included mixin are applied to this device in
+	// order. Conflicting properties from multiple mixins are taken from the
+	// last mixin listed that contains them.


Could we validate that they don't collide? Or is there a useful reason to allow it?

If we restrict mixins to th scope of a ResourceSlice rather than the scope of an entire pool(which may span multiple resource slices, they yes, I think we can do this validation. In conversations with @pohly I think we want to make this restriction anyway to make validation easier in general. It means that some mixins may be duplicated across multiple resource slices, but I think that is probably OK for the sake of validation.

thockin · 2024-10-10T19:27:04Z

keps/sig-node/4815-dra-partitionable-devices/README.md

+	// The maximum number of attributes and capacities combined is 32.
+	//
+	// +optional
+	Capacity map[QualifiedName]DeviceCapacity `json:"capacity,omitempty"`


Would it make sense to put the ConsumesCapacityFrom inside DeviceCapacity? or in a parallel map?

thockin · 2024-10-10T19:36:51Z

keps/sig-node/4815-dra-partitionable-devices/README.md

+        quantity: 40Gi
+devices:
+- name: gpu-0
+  composite:


It looks like we don't really even need the composite indirection? What if we just extended basic ? As we move beyond alpha, we can consider whether we even REALLY need basic.

Yeah, this goes back to the comment above...

I'd love it if we could just add both of these new fields to the existing BasicDevice type (since they are strictly extensions), but as @pohly keeps pointing out (because I keep forgetting) we can't extend the API in this way because it breaks downgrades. An old scheduler will not recognize the new fields and attempt to allocate a device based solely on the fields that it DOES know about. For mixins / attributes this is not too bad because selection would simply fail if the scheduler was not able to see the additional attributes that a mixin provides. Not honoring consumesCapacityFrom properly would be unacceptable though.

thockin · 2024-10-10T19:39:26Z

keps/sig-node/4815-dra-partitionable-devices/README.md

+        string: GPU-662077db-fa3f-0d8f-9502-21ab0ef058a2
+```
+
+As you can see, three mixins are created called "system-attributes",


At some point we talked about per-slice attribs thsat would be "inherited". This is similar but more explicit (both good and bad aspects). If we know EVERY device will have common things, my name them in each Device? It's sort of noisy.

It would be good to see an example where mixins were not identical in every device.

In the real-world MIG use case (later in the KEP), only one mixin (with 3 attributes) is shared by all of the devices in the slice. The vast majority of mixins are only referenced by a subset of devices.

thockin · 2024-10-10T19:52:30Z

keps/sig-node/4815-dra-partitionable-devices/README.md

+In this example, five devices are defined: a full GPU called "gpu-0" and four
+partitions of "gpu-0" called "gpu-0-partion-0", "gpu-0-partion-1",
+"gpu-0-partion-2", and "gpu-0-partion-3" respectively. The full GPU has 40Gi of
+memory available, and each of the partitions pull 10Gi from it.


Do we plan to validate that the sum of 1st level children is less than the parent? E.g. Could I define gpu0 with 40Gi and 4 children each with 20Gi ? That could be an error (e.g. cut-paste) or it could be that they have some resources "dedicated" to them that are not in the parent or it could some sort of fungible pool?

What if the children declare a capacity "foobar" which is not named in the parent? Is that an error?

Is there a need for top-level resources from which children can consume but are not themselves allocatable? e.g. This card has 8 frobnicators and each child device needs 1 or 2, but there's no allocatable device with 8?

Remember -- no one is defining these resource slices by hand (the driver publishes these), so the likelihood of a cut-paste-error is minimal. I don't think its worth trying to define semantics that would enable us to recognize that "gpu0 with 40Gi and 4 children each with 20Gi" is an invalid configuration. Driver implementors should be responsible for this themselves.

What if the children declare a capacity "foobar" which is not named in the parent? Is that an error?

This would be hard to validate if the parent and child are in different resource slices. I think we can (and should) do our best to validate it in cases where they are in the same slice, but not error out if they are not.

Is there a need for top-level resources from which children can consume but are not themselves allocatable? e.g. This card has 8 frobnicators and each child device needs 1 or 2, but there's no allocatable device with 8?

Yes, there are use cases for this. In fact, this is the exact use case @byako pointed out for SR-IOV VFs. You can achieve this in the current design by defining your DeviceClass such that the top level device is never selectable, but its a bit back-handed since this is not encoded anywhere in the ResourceSlice API itself.

We toyed with the idea of adding an extra field to a device to accommodate this case (e.g. abstract: true or allocatable: false), but never made it part of the design. I think the conclusion was that we could rely on DeviceClasses for now and think more deeply about this later.

Happy to continue discussing during implementation though.

thockin · 2024-10-10T20:21:41Z

keps/sig-node/4815-dra-partitionable-devices/README.md

+        quantity: "1"
+      memory:
+        quantity: 40Gi
+      memorySlice0:


I don't love this aspect. At some point we talked about "named mutexes" or something, so this would looke more like:

attributes: ... capacity: memory: quantity: 40Gi consumes: - memorySlice1

I guess it's somewhat minor -- it has to be listed SOMEWHERE. It just feels
weird to be in capacity, as if someone would want to write a claim against
it...

Agreed -- but this is where we are at the moment with the design.

I had also talked about introducing an intRange abstraction at one point so that one could directly write:

consumes: memorySlices: "0-3"

but that had it's own problems.

thockin · 2024-10-10T20:51:47Z

keps/sig-node/4815-dra-partitionable-devices/README.md

+        string: NVIDIA A100-SXM4-40GB
+      type:
+        string: mig
+- name: memory-slices-0


The next ~100 lines of YAML really suggest to me that we're doing somethign wrong

All of this would be unnecessary if we had an intRange or a set abstraction to work with. It only looks as nasty as it does because we need some way of encoding an explicit "set" of resources, but all we have at our disposable is the ability to define a "quantity" of resources.

For example, this mixin:

- name: memory-slices-0-3 composite: capacity: memorySlice0: quantity: "1" memorySlice1: quantity: "1" memorySlice2: quantity: "1" memorySlice3: quantity: "1"

would turn into a specific capacity entry on the device itself of something like:

capacity: memorySlices: [0,1,2,3] // if we supported sets

or

capacity: memorySlices: "0-3" // if we supported intRanges

There would be no need for this long list of mixins defining valid memorySlice combinations at all.

Given that this is what we have to work with for now, I think it's acceptable, but we should look into writing a future KEP that introduces set and/or intRange as a usable abstraction here.

johnbelamaric · 2024-10-11T00:28:23Z

Thanks Tim. @klueska, @catblade I read Tim's input as saying he is OK with abstract pools :). And figuring out how to define local capacities above and beyond pooled capacities.

One other thing he and I discussed is the idea of adding back in one level of inheritance. I didn't like the original proposal because the complex tree was hard to parse in your head. But we probably could allow the user to define the mixins, and ALSO then to use those mixins to define shapes. Both of those are abstract, just like today. A shape is just a named collection of mixins. But it's not a tree. Then, an individual device has a shape, plus locally defined attributes and capacities (no mixins directly in the devices, I think?).

Our device list then becomes more compact, you don't have to repeat the mixins over and over. But it's just one level deep, not N levels deep. WDYT?

thockin · 2024-10-11T01:29:37Z

I read Tim's input as saying he is OK with abstract pools

I would want to see what exactly that means before signing off on it :)

One other thing he and I discussed is the idea of adding back in one level of inheritance

What I meant was more like this:

Each Slice can define attributes - all devices in the slice implicitly get those attributes
Each slice can define mixins - some devices in the slice explicitly get those attributes via includes
Each device can define attributes unique to that device

johnbelamaric · 2024-10-11T02:01:01Z

I think a "shape" concept is useful, it avoids the repeated includes for every device of the same shape.

thockin · 2024-10-11T04:20:38Z

Too many levels of indirection will make things harder, not easier. I am somewhat skeptical that we need need both shapes and mixins

catblade · 2024-10-11T20:27:53Z

I'm thinking of this in terms of set notation. Trying to sort out what that looks like. IE item A is part of set C, but set C also includes D, E, and F. I want to be able to ask for a C, but I also want to be able to ask for an A if I absolutely need an A. I'll look at this later next week and see if I can make sense of it. What number of levels of abstraction can I get?

klueska · 2024-10-24T10:04:29Z

@thockin

What I meant was more like this:

Each Slice can define attributes - all devices in the slice implicitly get those attributes

Each slice can define mixins - some devices in the slice explicitly get those attributes via includes

Each device can define attributes unique to that device

In my experience there are very few attributes that actually get referenced by all devices, so I' don't think introducing a "top-level" attributes concept is worth it. The other bullet points are already supported in the current design.

k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Sep 25, 2024

k8s-ci-robot requested review from derekwaynecarr and johnbelamaric September 25, 2024 11:20

k8s-ci-robot added kind/kep Categorizes KEP tracking issues and PRs modifying the KEP directory sig/node Categorizes an issue or PR as relevant to SIG Node. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Sep 25, 2024

klueska mentioned this pull request Sep 25, 2024

DRA: Add support for partitionable devices #4815

Open

4 tasks

k8s-ci-robot requested a review from pohly September 25, 2024 15:44

klueska force-pushed the dra-partitonable-devices branch from c9c762f to a845fd3 Compare September 26, 2024 14:08

k8s-ci-robot added size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. and removed size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Sep 26, 2024

johnbelamaric reviewed Sep 26, 2024

View reviewed changes

keps/sig-node/4815-dra-partitionable-devices/README.md Outdated Show resolved Hide resolved

keps/sig-node/4815-dra-partitionable-devices/README.md Outdated Show resolved Hide resolved

klueska force-pushed the dra-partitonable-devices branch from a845fd3 to 9adb6cd Compare September 26, 2024 15:08

klueska force-pushed the dra-partitonable-devices branch 2 times, most recently from a514f0b to 7ec6e85 Compare September 27, 2024 09:19

johnbelamaric reviewed Sep 27, 2024

View reviewed changes

keps/sig-node/4815-dra-partitionable-devices/README.md Outdated Show resolved Hide resolved

keps/sig-node/4815-dra-partitionable-devices/README.md Outdated Show resolved Hide resolved

keps/sig-node/4815-dra-partitionable-devices/README.md Outdated Show resolved Hide resolved

klueska mentioned this pull request Sep 29, 2024

Add implementation of partitionable devices to match KEP kubernetes-sigs/wg-device-management#38

Open

pohly reviewed Sep 30, 2024

View reviewed changes

klueska force-pushed the dra-partitonable-devices branch from 7ec6e85 to 841efd5 Compare October 1, 2024 08:22

klueska commented Oct 1, 2024

View reviewed changes

keps/sig-node/4815-dra-partitionable-devices/README.md Outdated Show resolved Hide resolved

klueska force-pushed the dra-partitonable-devices branch 3 times, most recently from a60a3e2 to b9f4313 Compare October 1, 2024 17:55

alculquicondor reviewed Oct 1, 2024

View reviewed changes

klueska force-pushed the dra-partitonable-devices branch from 0d59a82 to 4c41571 Compare October 2, 2024 11:48

mrunalp reviewed Oct 9, 2024

View reviewed changes

keps/sig-node/4815-dra-partitionable-devices/README.md Outdated Show resolved Hide resolved

mrunalp reviewed Oct 9, 2024

View reviewed changes

keps/sig-node/4815-dra-partitionable-devices/README.md Show resolved Hide resolved

klueska force-pushed the dra-partitonable-devices branch from 45cd82a to 8a6096e Compare October 9, 2024 12:42

klueska force-pushed the dra-partitonable-devices branch 2 times, most recently from 16dc163 to 9ed882e Compare October 9, 2024 13:36

k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Oct 9, 2024

klueska force-pushed the dra-partitonable-devices branch from 9ed882e to 6ed63f0 Compare October 9, 2024 15:41

alculquicondor reviewed Oct 9, 2024

View reviewed changes

pohly requested changes Oct 10, 2024

View reviewed changes

Add KEP for DRA: Partitionable Devices

53e1584

Signed-off-by: Kevin Klues <[email protected]>

klueska force-pushed the dra-partitonable-devices branch from 6ed63f0 to 53e1584 Compare October 10, 2024 11:27

k8s-ci-robot assigned pohly Oct 10, 2024

k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Oct 10, 2024

k8s-ci-robot merged commit 18c41ea into kubernetes:master Oct 10, 2024
4 checks passed

k8s-ci-robot added this to the v1.32 milestone Oct 10, 2024

thockin reviewed Oct 10, 2024

View reviewed changes

ace-cohere mentioned this pull request Nov 26, 2024

NVlink support NVIDIA/k8s-dra-driver#174

Open

KEP-4815: DRA Partitionable Devices #4874

KEP-4815: DRA Partitionable Devices #4874

Conversation

klueska commented Sep 25, 2024

pohly commented Sep 25, 2024

klueska commented Sep 26, 2024 • edited Loading

johnbelamaric left a comment

Choose a reason for hiding this comment

klueska commented Sep 26, 2024

klueska commented Sep 27, 2024

johnbelamaric left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

pohly Sep 30, 2024 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

klueska Sep 30, 2024 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

mrunalp commented Oct 9, 2024

johnbelamaric commented Oct 9, 2024

alculquicondor left a comment

Choose a reason for hiding this comment

k8s-ci-robot commented Oct 9, 2024

mrunalp commented Oct 9, 2024 • edited Loading

klueska commented Oct 9, 2024

kannon92 commented Oct 9, 2024

alculquicondor commented Oct 9, 2024

pohly left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

thockin left a comment

Choose a reason for hiding this comment

thockin Oct 10, 2024 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

klueska Oct 24, 2024 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

johnbelamaric commented Oct 11, 2024

thockin commented Oct 11, 2024

johnbelamaric commented Oct 11, 2024

thockin commented Oct 11, 2024

catblade commented Oct 11, 2024

klueska commented Oct 24, 2024

klueska commented Sep 26, 2024 •

edited

Loading

pohly Sep 30, 2024 •

edited

Loading

klueska Sep 30, 2024 •

edited

Loading

mrunalp commented Oct 9, 2024 •

edited

Loading

thockin Oct 10, 2024 •

edited

Loading

klueska Oct 24, 2024 •

edited

Loading