pillar: Add support to CDI for native containers #4265
Conversation
go mod tidy and go mod vendor Signed-off-by: Renê de Souza Pinto <[email protected]>
Two questions. First, I had thought that the CDI was going to be in … Second, I could not figure out where you consume the CDI files. Or are you assuming that by default, it will look in …?
I just need to inform the device string; containerd's CDI plugin does the whole job to look for the CDI files (in the directory configured in config.toml), consume them, and populate the OCI spec...
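For reference, a minimal sketch (assuming containerd ≥ 1.7 and an already-pulled image) of how a client can request CDI injection just by naming the device; the socket path, namespace, image reference and the fully-qualified device name below are only illustrative, not what pillar actually uses:

```go
// Illustrative sketch only, NOT pillar code: assumes containerd >= 1.7, an
// already-pulled image, and that containerd is configured with the CDI spec
// directories. Socket path, namespace, image reference and the device name
// "nvidia.com/orin-gpu=gpu0" are made up for illustration.
package main

import (
	"context"
	"log"

	containerd "github.com/containerd/containerd"
	"github.com/containerd/containerd/namespaces"
	"github.com/containerd/containerd/oci"
)

func main() {
	client, err := containerd.New("/run/containerd/containerd.sock")
	if err != nil {
		log.Fatal(err)
	}
	defer client.Close()

	ctx := namespaces.WithNamespace(context.Background(), "default")

	img, err := client.GetImage(ctx, "docker.io/library/cuda-app:latest")
	if err != nil {
		log.Fatal(err)
	}

	// WithCDIDevices only carries the device *name*; containerd resolves it
	// against the CDI spec files and edits the OCI spec (device nodes,
	// mounts, env) on its own.
	_, err = client.NewContainer(ctx, "gpu-app",
		containerd.WithNewSnapshot("gpu-app-snapshot", img),
		containerd.WithNewSpec(
			oci.WithImageConfig(img),
			oci.WithCDIDevices("nvidia.com/orin-gpu=gpu0"),
		),
	)
	if err != nil {
		log.Fatal(err)
	}
}
```

The point of the sketch is that the caller never reads the CDI files itself; it only forwards the device string.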
(force-pushed from bc9c930 to 391d4b9)
Isn't that system containerd, and we want this for user containerd? Although I spent time with Paul earlier today that showed that user apps are running on system containerd? I lost track. 🤷♂️
So what would be the process for a specific vendor? Root filesystem is immutable, so … We do have the pillar …
Different containerd? 🤷♂️
No, user containerd is used only for CAS, user apps run on system's containerd....
Me too, I have a security concern about allowing runtime CDI files, so the assumption is that any CDI file must be part of the rootfs build, which, IMO, makes sense since they are usually very hardware specific and will be provided by specific packages, such as the pkg/nvidia....
Unfortunately not, the containerd config is correct, it's the system's containerd....
The Container Device Interface (CDI) is already available on containerd and there are CDI files provided for the NVIDIA platform that describe GPU devices for native container access. This commit introduces support for CDI in pillar, so Video I/O adapters can pass a CDI device name to enable direct access on native containers. Signed-off-by: Renê de Souza Pinto <[email protected]>
(force-pushed from 391d4b9 to e99de15)
Yeah, that came up yesterday. We probably should change that, but well beyond the scope of this PR.
What happens if you have 100 different devices, all of the same family, with slightly different CDI? Are you going to have 100 different rootfs builds? Or just one, with multiple CDIs, and the ability to detect each? This gets us very much down the road of different builds because of a single few-KB config file in …
So, we mount … I didn't quite get what we are doing with that chunk of code inside pillar. We inject the devices into the container spec based on the name. Essentially, we are duplicating what containerd normally does?
It's totally fine to have multiple CDI files under /etc/cdi, we don't need a single build per device. Actually, that's how it's working for NVIDIA, we have CDI files for both Xavier + Orin boards on the same build. Inside each file, we use different names for device description, so we have "nvidia.com/xavier-gpu" for Xavier and "nvidia.com/orin-gpu" for Orin...
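As a rough illustration of how those per-board files coexist, here is a sketch using the CDI reference library (the import path may differ from whatever version is vendored) that lists the fully-qualified device names found under /etc/cdi:

```go
// Illustrative sketch: lists the fully-qualified CDI device names, e.g.
// "nvidia.com/xavier-gpu=..." next to "nvidia.com/orin-gpu=...", showing
// that several spec files simply contribute names under their own
// vendor/class without conflicting.
package main

import (
	"fmt"
	"log"

	"tags.cncf.io/container-device-interface/pkg/cdi"
)

func main() {
	cache, err := cdi.NewCache(cdi.WithSpecDirs("/etc/cdi"))
	if err != nil {
		log.Fatal(err)
	}
	// ListDevices returns the fully-qualified names ("vendor/class=device")
	// collected from every spec file found in the configured directories.
	for _, name := range cache.ListDevices() {
		fmt.Println(name)
	}
}
```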
Yes, this can be improved when we move execution of Edge Apps to user containerd...
No, this parses the I/O adapters list from the device model. The way we give direct access to the GPU is exactly the same as for passthrough PCI devices; the difference is that instead of giving a PCIe Bus Address in the device model, we pass the CDI string for the particular device. The CDI is only used for GPU access for now; for serial devices and any other device (like a webcam under /dev/video0) we read them from the device model and add them to the OCI spec. For standard containers (with ShimVM) we just give full access to all file devices. In the documentation being added you can see some examples and a better description: https://github.com/lf-edge/eve/pull/4265/files#diff-8230cff5878b3df207474c79828836840673a5ac49fdc808ba034809902cac96
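To make that split concrete, here is a hedged sketch; IoAdapter and splitAdapters are hypothetical stand-ins, not pillar's actual types or functions:

```go
// Illustrative sketch only: IoAdapter and splitAdapters are made-up names
// mirroring the behaviour described above, not pillar's real code.
package main

import (
	"fmt"

	specs "github.com/opencontainers/runtime-spec/specs-go"
)

// IoAdapter is a stand-in for one I/O adapter parsed from the device model.
type IoAdapter struct {
	Name    string
	DevPath string            // e.g. /dev/ttyS0 or /dev/video0
	Cbattr  map[string]string // may carry a "cdi" entry for GPU adapters
}

// splitAdapters decides how each adapter reaches a native container:
// adapters carrying a "cdi" attribute are returned as CDI device names to be
// resolved later, every other adapter is exposed as a plain device node in
// the OCI spec.
func splitAdapters(spec *specs.Spec, adapters []IoAdapter) (cdiDevices []string) {
	for _, ad := range adapters {
		if name, ok := ad.Cbattr["cdi"]; ok {
			cdiDevices = append(cdiDevices, name)
			continue
		}
		if ad.DevPath != "" && spec.Linux != nil {
			spec.Linux.Devices = append(spec.Linux.Devices, specs.LinuxDevice{
				Path: ad.DevPath,
				Type: "c", // character device; major/minor omitted in this sketch
			})
		}
	}
	return cdiDevices
}

func main() {
	spec := &specs.Spec{Linux: &specs.Linux{}}
	adapters := []IoAdapter{
		{Name: "COM1", DevPath: "/dev/ttyS0"},
		{Name: "GPU", Cbattr: map[string]string{"cdi": "nvidia.com/orin-gpu=gpu0"}},
	}
	fmt.Println(splitAdapters(spec, adapters)) // [nvidia.com/orin-gpu=gpu0]
}
```

With that split, serial ports and webcams keep the plain device-node path, while GPU adapters hand only a name to the CDI machinery.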
OK, that works, thanks.
So this is about translating between what came in the EVE API request for devices to pass to the app, and the CDI file format, so that it will know what to do with it? That would make sense to me, but the doc to which you linked implies that the spec comes with the CDI attribute? So why translate?
I don't know if I understood your question, but the main idea is that the CDI string works as a "hardware ID", the same way we specify PCIe Bus Addresses, network interface names (eth0, wlan0), etc., in the hardware device model; for this particular case we specify the CDI string which points to the device described in the CDI file.... this makes the "native container GPU passthrough" work transparently with the controller, so the user can "pass through" a GPU to a native container the same way they pass through a PCI video card on x86, for example....
If two containers need to access the same GPU (when supported), two I/O adapters must be defined. For instance, two …
So, the number of these files depends on how many App Instances will use them, am I right? It means: dynamic? In this case, is it ok to store the file in the RO partition?
I don't think so, but will let Rene confirm. You have one or more CDI yaml files that describe the devices on the device and how to access them (including devices, and mounts necessary and all sorts of instruction stuff). Whether it is 1 or 50 apps, only the defined CDI files for the device are used. They are read-only.
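As a rough sketch of what one of those read-only files expresses (assuming the CDI specs-go package; the kind, device name, node path and mount below are made up for illustration, and the on-disk files are the YAML equivalent of this structure):

```go
// Illustrative sketch: builds a minimal CDI spec with the specs-go types and
// prints it as JSON. All names, paths and the import path are assumptions.
package main

import (
	"encoding/json"
	"fmt"
	"log"

	cdispecs "tags.cncf.io/container-device-interface/specs-go"
)

func main() {
	spec := cdispecs.Spec{
		Version: "0.6.0",
		Kind:    "nvidia.com/orin-gpu", // vendor/class referenced by the device model
		Devices: []cdispecs.Device{{
			Name: "gpu0",
			ContainerEdits: cdispecs.ContainerEdits{
				DeviceNodes: []*cdispecs.DeviceNode{
					{Path: "/dev/nvhost-ctrl-gpu"}, // device node exposed to the container
				},
				Mounts: []*cdispecs.Mount{{
					HostPath:      "/usr/lib/aarch64-linux-gnu/tegra",
					ContainerPath: "/usr/lib/aarch64-linux-gnu/tegra",
					Options:       []string{"ro", "bind"},
				}},
			},
		}},
	}

	out, err := json.MarshalIndent(spec, "", "  ")
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(string(out))
}
```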
Exactly @deitch ... @OhmSpectator, we must have one GPU I/O adapter per Edge App because an I/O adapter can be "passed-through" to only one VM, which makes total sense for PCI devices, but we do have some exceptions, like CAN interfaces (that can be accessed by multiple VMs at the same time) and now this use case, where a single GPU can be shared across multiple containers... but this is only about the device model, it has nothing to do with CDI files.....
I was just thinking about this. How do you distinguish between them? Aren't the device names similar? Won't you have conflicts?
What I meant was, if we already define the CDI string inside the app instance, e.g. …
We can name these devices whatever we want; originally nvidia-ctk will always use "nvidia.com/gpu" during the CDI generation, I just changed them to …
We don't define the CDI string inside the Edge App; the Edge App should be like any regular Edge App, the only requirement is to have NO_HYPER as the virtualization mode. Then we are going to passthrough a GPU I/O adapter to this Edge App, as with any regular PCI passthrough... the trick happens when we parse the I/O adapters from the device model and find the "cdi" attribute under "cbattr": for native containers (and only for native containers) we will use this string as a CDI device and process it accordingly.... This approach makes the CDI solution 100% compatible with the current passthrough mechanism and it requires no changes either in the API or on the controller side....
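A hedged sketch of that last translation step, assuming the CDI reference library is used to resolve the cbattr value; the device name and spec directory are illustrative and this is not necessarily pillar's actual code path:

```go
// Illustrative sketch, NOT the actual pillar code path: once the value of
// cbattr["cdi"] is known, the CDI library can resolve it against /etc/cdi
// and edit the already-prepared OCI spec.
package main

import (
	"log"

	specs "github.com/opencontainers/runtime-spec/specs-go"
	"tags.cncf.io/container-device-interface/pkg/cdi"
)

func main() {
	// OCI spec that would normally have been prepared for the native container.
	ociSpec := &specs.Spec{Linux: &specs.Linux{}}

	cache, err := cdi.NewCache(cdi.WithSpecDirs("/etc/cdi"))
	if err != nil {
		log.Fatal(err)
	}

	// "nvidia.com/orin-gpu=gpu0" stands for the string found under
	// cbattr["cdi"] in the device model for this I/O adapter.
	unresolved, err := cache.InjectDevices(ociSpec, "nvidia.com/orin-gpu=gpu0")
	if err != nil {
		log.Fatalf("CDI injection failed (unresolved: %v): %v", unresolved, err)
	}
	// ociSpec now carries the device nodes, mounts and env described by the
	// matching CDI file.
	log.Printf("injected devices into spec: %+v", ociSpec.Linux.Devices)
}
```

Either way, the controller and the EVE API only ever see a normal I/O adapter; the CDI resolution stays entirely on the device.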
Ok, now that makes sense. So there still is a "translation" going on between "how GPU appears in EVE API" and "how GPU is listed in CDI files". The work in pillar is there to do that translation. Correct? Can we capture that in the docs?
Correct. Ok, I will update the documentation...
Add proper documentation explaining the CDI support for bare metal containers. Signed-off-by: Renê de Souza Pinto <[email protected]>
(force-pushed from e99de15 to d275fb1)
Updates in this PR:
The Container Device Interface (CDI) is already available on containerd and there are CDI files provided for the NVIDIA platform that describe GPU devices for native container access. This commit introduces support for CDI in pillar, so Video I/O adapters can pass a CDI device name to enable direct access on native containers.
The corresponding documentation is also provided.