Linux on VexRiscv #60
About atomics, there is some support in VexRiscv to provide LR/SC in a local way; it only works for single-CPU systems. |
Yeah, "dummy" implementations that work on single CPU systems should be perfectly fine. |
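A minimal sketch (mine, not from the thread) of why a local reservation suffices on a single hart: in a compare-and-swap built from LR/SC, no other agent can touch memory between the `lr.w` and the `sc.w`, so a "dummy" local reservation can simply always hold.

```c
#include <stdint.h>

/* Hypothetical cmpxchg built on LR/SC. On a single-hart system the
 * reservation only has to survive the few instructions between lr.w
 * and sc.w, so a purely local implementation is enough. */
static inline uint32_t cmpxchg32(volatile uint32_t *p,
                                 uint32_t expected, uint32_t desired)
{
    uint32_t old, tmp;
    __asm__ volatile (
        "1: lr.w    %0, (%2)\n"       /* load-reserved              */
        "   bne     %0, %3, 2f\n"     /* value changed: give up     */
        "   sc.w    %1, %4, (%2)\n"   /* store-conditional          */
        "   bnez    %1, 1b\n"         /* sc failed: retry           */
        "2:\n"
        : "=&r"(old), "=&r"(tmp)
        : "r"(p), "r"(expected), "r"(desired)
        : "memory");
    return old; /* equals `expected` iff the swap happened */
}
```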
As discussed at the Free Silicon Conference together with @Dolu1990, we are also working on it here. We can continue the discussion here for the CPU aspect. @daveshah1: I saw you made some progress. |
My current status is that I have made quite a few hacks to the kernel, VexRiscv and LiteX, but I'm still only just getting into userspace and not anywhere useful yet. VexRiscv: https://github.com/daveshah1/VexRiscv/tree/Supervisor @Dolu1990 I would be interested if you could look at 818f1f6 - loads were always reading 0xffffffff from virtual memory addresses when bit 10 of the offset (0x400) was set. This seems to fix it, but I'm not sure if a better fix is possible. As it stands, the current issue is a kernel panic "Oops - environment call from S-mode" shortly after |
Hi @daveshah1 @enjoy-digital :D So, for sure we will hit bugs in VexRiscv, as only the machine mode was properly tested.
I think the best would be to set up a minimal test environment to run Linux on. It would save us a lot of time and sanity, especially for a Linux port project :D Does that sound good to you? |
That sounds very sensible! The minimal peripheral requirement is low: just a timer (right now I have the LiteX timer connected to the timerInterruptS pin, and have hacked the kernel to talk to that directly rather than using the proper SBI route to set up a timer) and a UART of some kind. My only concern with this is speed; right now it is taking about 30s on hardware at 75MHz to get to the point of failure. So I definitely want to use Verilator and not iverilog... |
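As an aside, here is a sketch of what "the proper SBI route" looks like (assuming the legacy SBI v0.1 calling convention; the function ID and register usage are from that spec, not from this port): S-mode arms the timer with an `ecall` into M-mode firmware instead of poking a platform timer directly.

```c
#include <stdint.h>

/* Legacy SBI v0.1: function ID 0 = sbi_set_timer, ID passed in a7,
 * 64-bit deadline split across a0/a1 on RV32. */
static inline void sbi_set_timer(uint64_t stime_value)
{
    register unsigned long a0 __asm__("a0") = (unsigned long)stime_value;
    register unsigned long a1 __asm__("a1") = (unsigned long)(stime_value >> 32);
    register unsigned long a7 __asm__("a7") = 0; /* SBI_SET_TIMER */
    __asm__ volatile ("ecall"
                      : "+r"(a0)
                      : "r"(a1), "r"(a7)
                      : "memory");
}
```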
I can easily set up a Verilator simulation. But 30s on hardware at 75MHz will still be a bit slow: we can expect 1MHz execution speed, so that's still around 40 min... |
I did just manage to make a bit of progress on hardware (perhaps this talk of simulators is scaring it into behaviour 😄) It does reach userspace successfully, so we can almost say Linux is working. If I set /bin/sh as init, then I can even use shell builtins - being able to run |
@daveshah1 this is great. The libc segfault also happened in our Renode (https://github.com/renode/renode) emulation. Can you share the rootfs you're using? |
This is the initramdisk from antmicro/litex-linux-readme with a small change to inittab to remove some references to files that don't exist. In terms of other outstanding issues, I also had to patch VexRiscv so that interrupts are routed to S-mode rather than M-mode. This broke the LiteX BIOS, which expects M-mode interrupts, so I had to patch that to not expect interrupts at all, but that means there is now no useful UART output from the BIOS. I think a proper solution would be to select interrupt privilege dynamically somehow. |
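For reference, this is how a spec-compliant core selects interrupt privilege dynamically (the standard RISC-V mideleg mechanism, not necessarily what VexRiscv implemented at this point): M-mode firmware sets the delegation bits once at boot, and the delegated interrupts then trap straight to S-mode.

```c
/* Standard mideleg bit positions for the supervisor interrupts. */
#define MIDELEG_SSIP (1ul << 1)  /* supervisor software interrupt */
#define MIDELEG_STIP (1ul << 5)  /* supervisor timer interrupt    */
#define MIDELEG_SEIP (1ul << 9)  /* supervisor external interrupt */

/* Run once from M-mode at boot; undelegated interrupts (e.g. the
 * M-mode timer) still go to M-mode, so the BIOS keeps working. */
static inline void delegate_s_interrupts(void)
{
    unsigned long bits = MIDELEG_SSIP | MIDELEG_STIP | MIDELEG_SEIP;
    __asm__ volatile ("csrs mideleg, %0" :: "r"(bits));
}
```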
We had to fix/work around IRQ delegation. I think this code should be in our repo, but I'll check that again. |
The segfault I see is:
The bad address (0x73730 in libc-2.26.so) seems to be in |
I checked the code, and it looks like everything has been pushed to GitHub. As for the segfault: note that we had to re-implement the mapping code in Linux, and there are some hacks in the Vex MMU itself. This could be the reason for the segfault, as user space starts using virtual memory very extensively. For example, the whole kernel memory space is mapped directly and we bypass the MMU translation maps; the kernel range is defined in the MMU plugin instance: https://github.com/antmicro/VexRiscv/blob/97d04a5243bbfee9d1dfe56857f3490da9fe1091/src/main/scala/vexriscv/TestsWorkspace.scala#L98 I'm pretty sure there are many bugs hidden there :) |
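In pseudo-C, the bypass described above looks roughly like this (illustrative only; the real range and logic live in the Scala MMU plugin instance linked above, and the addresses here are hypothetical):

```c
#include <stdbool.h>
#include <stdint.h>

#define KERNEL_RANGE_BASE 0xC0000000u  /* hypothetical window */
#define KERNEL_RANGE_END  0xE0000000u

extern uint32_t tlb_lookup(uint32_t vaddr);  /* software-filled TLB path */

/* Kernel addresses skip translation entirely (identity mapping);
 * only user-space addresses go through the TLB. */
static uint32_t translate(uint32_t vaddr)
{
    bool bypass = vaddr >= KERNEL_RANGE_BASE && vaddr < KERNEL_RANGE_END;
    return bypass ? vaddr : tlb_lookup(vaddr);
}
```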
Ok, I will think about the best way to set up that test environment with the synchronised software golden model (to get max speed). |
@enjoy-digital The golden model is currently there |
@Dolu1990: in fact I already have a Verilator simulation that is working fine; I just need to improve it a little to load the vmlinux.bin/vmlinux.dtb and initramdisk to RAM more easily. But yes, we'll use whatever is more convenient for you. I'll look at your regression env and your golden model. |
@enjoy-digital Can you show me the verilator testbench sources :D ? |
@kgugala Which CPU configuration are you using? Can you show me? (The test workspace you pointed to isn't using the caches or the MMU) |
The config I am using is at https://github.com/daveshah1/VexRiscv-verilog/blob/linux/src/main/scala/vexriscv/GenCoreDefault.scala (which has a few small tweaks compared to @kgugala's, to skip over FENCEs for example). |
@enjoy-digital The checks between the golden model and the RTL are:
It should be enough to find divergences fast. @daveshah1 Jumping over the FENCE instruction is probably fine for the moment, but jumping over the FENCE.I instruction isn't: there is no cache coherency between the instruction cache and the data cache, so the cache flush needs to be used :) Is that used in some way? |
(Memory coherency issues are something that is automatically caught by the golden model / RTL cross-checks) |
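To make the FENCE.I point concrete, here is a sketch (mine, following the standard RISC-V Zifencei semantics) of the pattern that breaks if FENCE.I is treated as a no-op while the I$ and D$ are not coherent:

```c
#include <stdint.h>
#include <string.h>

/* Synchronize the instruction stream with prior data writes. */
static inline void flush_icache(void)
{
    __asm__ volatile ("fence.i" ::: "memory");
}

/* Writing code then executing it -- e.g. the kernel loading a user
 * program into a fresh page -- needs the fence in between, otherwise
 * the core may fetch stale bytes from the instruction cache. */
static void install_and_run(void *dst, const void *code, size_t len)
{
    memcpy(dst, code, len);   /* goes through the data cache        */
    flush_icache();           /* without this: stale I$ contents    */
    ((void (*)(void))dst)();  /* now safe to execute the new code   */
}
```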
As it stands, it looks like all the memory has been set up as IO, which I suspect means the L1 caches won't be used at all - I think LiteX provides a single L2 cache. Indeed, to get useful performance, proper use of caches and cache flushes will be needed. |
Yes, we disabled the caches as they were causing a lot of trouble. It didn't make sense to fight both the MMU and the caches at the same time. |
@daveshah1 Ok ^^ One thing to know is that the instruction cache does not support IO instruction fetch; instead it caches those fetches. (Supporting IO instruction fetch costs area, and isn't really a useful thing, as far as I know?) @kgugala The cacheless plugins aren't aware of the MMU.
So the roadmap would be:
|
TBH the real long-term solution will be to reimplement the MMU so it is fully compliant with the spec. Then we can get rid of the custom mapping code in Linux and restore the original mainline memory mapping code used for RV64. I'm aware this will require quite a significant amount of work in Vex itself. |
I don't think it would require that much work. An MMU is a relatively easy piece of hardware. But what is the issue with a software-refilled MMU? If it uses machine mode to do the refill, it becomes transparent to the Linux kernel, right? So no Linux kernel modification required, just a piece of machine-mode code in addition to the raw Linux port :) ? |
Yes, I think an M-mode trap handler is the proper solution. We can probably use it to deal with any missing atomic instructions too. |
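A rough sketch of what such an M-mode handler could look like (all names and helpers here are hypothetical; this is the general shape of a software-refill + emulation layer, not VexRiscv's actual firmware):

```c
#include <stdint.h>

/* Standard RISC-V mcause values. */
#define CAUSE_ILLEGAL_INSTRUCTION  2
#define CAUSE_FETCH_PAGE_FAULT    12
#define CAUSE_LOAD_PAGE_FAULT     13
#define CAUSE_STORE_PAGE_FAULT    15

/* Hypothetical helpers a real firmware would provide. */
extern void tlb_refill(uintptr_t vaddr);   /* walk satp's Sv32 table, program the TLB */
extern void emulate_insn(uintptr_t epc);   /* decode + emulate e.g. a missing AMO     */
extern void forward_to_s_mode(uintptr_t cause);

/* Called from an assembly stub that saves registers and ends in mret.
 * Because everything happens in M-mode, the S-mode kernel never sees
 * these traps -- the TLB refill is fully transparent to Linux. */
void m_mode_trap_handler(void)
{
    uintptr_t cause, tval, epc;
    __asm__ volatile ("csrr %0, mcause" : "=r"(cause));
    __asm__ volatile ("csrr %0, mtval"  : "=r"(tval));
    __asm__ volatile ("csrr %0, mepc"   : "=r"(epc));

    switch (cause) {
    case CAUSE_FETCH_PAGE_FAULT:
    case CAUSE_LOAD_PAGE_FAULT:
    case CAUSE_STORE_PAGE_FAULT:
        tlb_refill(tval);      /* software TLB refill    */
        break;
    case CAUSE_ILLEGAL_INSTRUCTION:
        emulate_insn(epc);     /* missing atomics, etc.  */
        break;
    default:
        forward_to_s_mode(cause);
    }
}
```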
(troll on) |
It just may be difficult to push the custom mapping code to Linux mainline |
The trap handler need not sit in Linux at all; it can be part of the bootloader. |
Will report back when we actually start using it. (Unfortunately not open source.) |
Sure, let us know how it goes, and please share the improvements/fixes if there are any :) |
It seems to be working fine on hardware too :) |
indeed, it works on hardware |
@Dolu1990 You might want to unpin this issue now? |
Right ^^ |
Add LitexSoC workspace / linux loading. Need to emulate peripherals and adapt the kernel now. Probably also need some machine-mode emulation. Software time!
/sbin/init: error while loading shared libraries: libm.so.6: cannot stat shared object: Error 38
…scala to help reproduce
… regfile if the page was set as read only.
Fix DBusCached plugin access sharing for the MMU deadlock when the exception is in the decode stage
Fix IBusSimplePlugin issues when used with non-regular configs + MMU
Bring back the LinuxGen config into a light one
My intention in creating this issue is to collect/share information and gauge interest in running Linux on VexRiscv. From what I know, VexRiscv is still missing functionality, and it won't work out of the box.
A big problem is the MMU. Ideally, "someone" will hopefully write patches to add no-MMU support to Linux/RISC-V, but currently an MMU is required. It appears VexRiscv has a partial MMU implementation using a software-filled TLB. There needs to be machine-mode code to walk the page tables and fill the TLBs, and I didn't find a reference implementation of that.
Another issue is atomics. Linux currently requires them. There seems to be partial support present in VexRiscv (a subset or so). Another possibility is patching the kernel not to use atomics if built without SMP support. There's also the question of how much atomics support userspace typically requires.
Without doubt there are more issues that I don't know about.
Antmicro apparently made a Linux port: https://github.com/antmicro/litex-rv32-linux-system https://github.com/antmicro/litex-linux-riscv
I didn't know about this before and haven't managed to build the whole thing yet.
Unfortunately, their Linux kernel repository does not include the git history. Here's a diff against the apparent base: https://0x0.st/z-li.diff
Please post any other information you know.