M5Ops #1214

mahyarsamani · 2024-06-10T20:33:00Z

Currently, the M5Ops are implemented in two ways: a memory mapped version and magic instructions. This makes it difficult to build workloads and disk images that work across different configurations. For example, suppose I start a simulation with the KVM CPU, exit the simulation loop at the beginning of the ROI, switch to a timing-mode CPU to run the ROI, and then exit at the end of the ROI. The marker for the beginning of the ROI (m5_work_begin_addr(0, 0)) and the marker for the end of the ROI (m5_work_end(0, 0)) would then be different kinds of M5Ops call (memory mapped for begin and magic instruction for end). This creates a dependency between which kind of CPU is used for each phase of the simulation and the type of m5 call that is made.

Additionally, the memory mapped M5Ops require admin access to write to /dev/mem. This is particularly painful when workloads should be run without sudo access. For example, mpirun should probably be run without sudo (although running it with sudo is possible). For MPI programs, one solution is therefore to boot with KVM, switch to the atomic CPU (using memory mapped calls), run until the beginning of the ROI, and then switch the CPU to O3 (using magic instructions).
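To illustrate the privilege problem with /dev/mem, here is a minimal sketch of how a guest program might map the m5ops window. The function name `map_m5ops_via_devmem` and the 64 KiB window size are assumptions made for illustration, and the physical base address is board specific and must match the simulated system's configuration. This is not the libm5 implementation.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Assumption: the m5ops window spans 64 KiB of physical address space. */
#define M5OPS_REGION_SIZE 0x10000UL

/*
 * Map the m5ops window through /dev/mem.
 * Returns the mapped base address, or NULL with errno set on failure.
 */
void *map_m5ops_via_devmem(uint64_t phys_base)
{
    /* mmap offsets must be page aligned. */
    assert(phys_base % (uint64_t)sysconf(_SC_PAGESIZE) == 0);

    int fd = open("/dev/mem", O_RDWR | O_SYNC);
    if (fd < 0)
        return NULL; /* typically EACCES for an unprivileged process */

    void *base = mmap(NULL, M5OPS_REGION_SIZE, PROT_READ | PROT_WRITE,
                      MAP_SHARED, fd, (off_t)phys_base);
    int saved_errno = errno;
    close(fd); /* the mapping stays valid after the descriptor is closed */

    if (base == MAP_FAILED) {
        errno = saved_errno;
        return NULL;
    }
    return base;
}
```

For a process without root privileges the `open` call is expected to fail, which is why a workload using the memory mapped calls otherwise has to run under sudo.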

I see two possible solutions: make memory mapped calls work with non-KVM CPUs (my understanding is that non-kvm cpus don't work with memory mapped M5Ops), or make magic instructions work with the KVM CPU. I personally prefer the second option since it would no longer require admin access. Would it be possible to catch the illegal instruction exception raised by the KVM CPU in gem5 and handle the switch? That way we could actually emulate new instructions with the KVM CPU.

@andysan I would appreciate your thoughts and feedback on this. Do you happen to know/remember why the M5Ops were not implemented this way in the first place?

giactra (Maintainer) · 2024-06-11T08:26:31Z

> my understanding is that non-kvm cpus don't work with memory mapped M5Ops

Hi @mahyarsamani, it is definitely possible to use memory mapped m5ops with non-KVM CPUs as well, so, the sudo problem aside, you can use the same interface for all CPUs.

andysan (Maintainer) · 2024-06-11T16:37:20Z

It's unfortunately not possible to use the magic instructions in KVM. In principle, you could use a hypervisor call instruction on Arm to call into the hypervisor and then route that to gem5. However, that's not really practical since the KVM kernel API doesn't (unless this has changed recently) let you intercept such instructions in gem5. Another possibility would be to intercept undefined instruction fault (assuming the m5op magic sequence is guaranteed to generate an undefined instruction fault), but that is also mostly theoretical since the KVM API doesn't support this at the moment.

Your best option if you want to avoid the sudo issue is to write a custom Linux driver that creates a device node that can be memory mapped by unprivileged processes. This would solve the discoverability issue we have with memory mapped m5 ops as well since the MMIO addresses could be discovered using the device tree or an ACPI table.

powerjg (Maintainer) · 2024-06-12T00:56:41Z

> Another possibility would be to intercept undefined instruction fault (assuming the m5op magic sequence is guaranteed to generate an undefined instruction fault), but that is also mostly theoretical since the KVM API doesn't support this at the moment.

Can you expand more on why this isn't possible? The fault is thrown from KVM to the hypervisor (gem5) and results in an undefined instruction fault. Couldn't gem5 emulate that instruction and then keep going?

> Your best option if you want to avoid the sudo issue is to write a custom Linux driver that creates a device node that can be memory mapped by unprivileged processes. This would solve the discoverability issue we have with memory mapped m5 ops as well since the MMIO addresses could be discovered using the device tree or an ACPI table.

Agreed. But it would be great if we could do this with minimal changes to the guest. The goal is to simulate what would be running on real hardware as closely as possible. E.g., I think it would be too much overhead to add an ioctl since this would cause a user->kernel switch. Enabling the mapping of a particular (and discoverable) physical address is a good middle ground, though.

In the end, what I really want is to be able to run the same guest code in both KVM and gem5's CPU implementations and get the gem5 "bridge" instructions to work.

1 reply

andysan (Maintainer) · Jun 12, 2024

> > Another possibility would be to intercept undefined instruction fault (assuming the m5op magic sequence is guaranteed to generate an undefined instruction fault), but that is also mostly theoretical since the KVM API doesn't support this at the moment.
>
> Can you expand more on why this isn't possible? The fault is thrown from KVM to the hypervisor (gem5) and results in an undefined instruction fault. Couldn't gem5 emulate that instruction and then keep going?

The current KVM kernel API doesn't let you intercept hypervisor calls or undefined instruction faults. The undefined instruction fault will be delivered to the guest OS, which will terminate the application. Hypervisor calls get handled by KVM in the kernel and never get forwarded to userspace. At least the Arm architecture supports routing most such faults to the hypervisor, which is why this is a theoretical possibility. However, actually implementing it would require intrusive changes to the hypervisor.

Very early versions of the KVM CPU intercepted HVC instructions on Arm, but that was before KVM support for Arm was merged into mainline, and it relied on custom patches.
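By contrast, guest accesses to physical addresses that are not backed by guest memory do reach userspace, reported by KVM as a `KVM_EXIT_MMIO` exit, and that is what the memory mapped interface relies on. The sketch below shows how such an exit could be matched against the m5ops window. The `struct mmio_exit` type is a hypothetical stand-in for the MMIO fields KVM reports, `decode_m5op_access` is an illustrative name, and placing the function code in bits 15:8 of the offset reflects my understanding of the gem5 layout rather than a verified specification.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: the m5ops window spans 64 KiB of physical address space. */
#define M5OPS_REGION_SIZE 0x10000ULL

/* Hypothetical stand-in for the fields reported with a KVM_EXIT_MMIO exit. */
struct mmio_exit {
    uint64_t phys_addr; /* guest physical address that was accessed */
    uint32_t len;       /* access size in bytes */
    bool is_write;      /* true for a store, false for a load */
};

/*
 * Return true if the access falls inside the m5ops window and store the
 * decoded function code in *func_out; return false otherwise.
 */
bool decode_m5op_access(const struct mmio_exit *mmio, uint64_t m5ops_base,
                        uint8_t *func_out)
{
    assert(mmio != NULL && func_out != NULL);

    if (mmio->phys_addr < m5ops_base ||
        mmio->phys_addr - m5ops_base >= M5OPS_REGION_SIZE)
        return false; /* ordinary device access, not an m5op */

    /* Assumed layout: function code in bits 15:8 of the window offset. */
    *func_out = (uint8_t)((mmio->phys_addr - m5ops_base) >> 8);
    return true;
}
```

Because the same address decoding can be applied to accesses made by gem5's own CPU models, one guest binary can use the memory mapped calls under both KVM and the simulated CPUs.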

> > Your best option if you want to avoid the sudo issue is to write a custom Linux driver that creates a device node that can be memory mapped by unprivileged processes. This would solve the discoverability issue we have with memory mapped m5 ops as well since the MMIO addresses could be discovered using the device tree or an ACPI table.
>
> Agreed. But it would be great if we could do this with minimal changes to the guest. The goal is to simulate what would be running on real hardware as closely as possible. E.g., I think it would be too much overhead to add an ioctl since this would cause a user->kernel switch. Enabling the mapping of a particular (and discoverable) physical address is a good middle ground, though.

You actually don't need an ioctl. You just map the m5op region straight into userspace in exactly the same way as you map /dev/mem. The only difference compared to a real system would be the additional TLB entry and potentially a page table walk. In fact, you could just map plain memory into that region when running on a real system and things would just work. The m5op instructions already perturb timing (they are non-speculative), so I don't think the fact that they require an address mapping makes much of a difference in practice. This obviously assumes that the mapping is created once at startup and then reused every time you call into gem5.
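A minimal userspace sketch of that approach, under stated assumptions: the device node path `/dev/gem5_m5ops` is hypothetical (no such driver exists in the thread), the 64 KiB window size and the `func << 8` slot offset reflect my understanding of the existing memory mapped layout, and the helper names are illustrative. The sketch covers only creating and caching the mapping. The actual call passes its arguments in registers, so it would still go through assembly stubs like the ones libm5 provides.

```c
#define _DEFAULT_SOURCE /* for MAP_ANONYMOUS under strict C standards */
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* Assumption: the m5ops window spans 64 KiB. */
#define M5OPS_REGION_SIZE 0x10000UL
/* Hypothetical device node exposed by a custom driver. */
#define M5OPS_DEVICE_NODE "/dev/gem5_m5ops"

static volatile uint8_t *m5ops_cached_base; /* set on first successful map */

/* Map the window once and reuse the mapping on every later call. */
volatile uint8_t *m5ops_map_once(void)
{
    if (m5ops_cached_base != NULL)
        return m5ops_cached_base;

    void *region = MAP_FAILED;
    int fd = open(M5OPS_DEVICE_NODE, O_RDWR);
    if (fd >= 0) {
        region = mmap(NULL, M5OPS_REGION_SIZE, PROT_READ | PROT_WRITE,
                      MAP_SHARED, fd, 0);
        close(fd);
    }
    if (region == MAP_FAILED) {
        /* No driver (e.g. real hardware): back the window with plain
         * zeroed memory so that accesses are harmless. */
        region = mmap(NULL, M5OPS_REGION_SIZE, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    }
    if (region == MAP_FAILED)
        return NULL;

    m5ops_cached_base = (volatile uint8_t *)region;
    return m5ops_cached_base;
}

/* Address inside the window that corresponds to one function code. */
volatile uint8_t *m5ops_slot(volatile uint8_t *base, unsigned func)
{
    assert(base != NULL && func < 256);
    return base + ((size_t)func << 8);
}
```

Calling `m5ops_map_once()` at program startup keeps the mapping cost out of the ROI, so each later call into gem5 costs only the memory access itself.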

> In the end, what I really want is to be able to run the same guest code in both KVM and gem5's CPU implementations and get the gem5 "bridge" instructions to work.

The memory mapped interface is pretty much your only option in that case.
