-
Notifications
You must be signed in to change notification settings - Fork 32
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Split CRG functionality for U and S-modes #491
base: main
Are you sure you want to change the base?
Conversation
To provide a bit of context, this was prompted by questions from @buxtonpaul about how to use the PTE bits, which involved setting up a null configuration for kernel-privileged pages. This then made obvious that the kernel pages were guarded by the same generation as user pages, and so sstatus would need to be swapped on kernel entry, which we believe is not the case currently. A better solution is to have a dedicated bit for the supervisor generation. |
I misinterpreted the linked issue as being about kernel revocation and adding extra PTE bits, not about ensuring the kernel can actually load from kernel pages regardless of userspace's generation for userspace pages. Do we think that having sstatus.SCRG would form part of the story for kernel revocation? If there's a risk it's not what we'd want then it might be better to leave the sstatus bit (and PTE bit) reserved and say that there are no generation checks for S-mode pages. But probably it's something we'll need... |
@jrtc27 Has the right of it: kernel access to user memory should generally be gated, for revocation purposes, as if it were performed by the user program. The existing implementation performs its page sweeps by bypassing the PTE bits and their load barrier effect by using the "direct map" region of the address space, manually translating virtual addresses to (offset) physical ones. These accesses would look, to hardware, like the kernel accessing kernel mappings. There is a proposal for a privileged "as user capability load, bypassing the CRG fault" instruction that would let us simplify much of that logic down to allowing pages to be swept directly in virtual address space (and possibly even concurrently with address space manipulation, currently emphatically prohibited due to the direct access to physical memory involved), but that's still a lot of engineering and measurement to make sure it's a good idea to be done. All of that to say, I'm not sure a "CRGS" bit is the right idea. Until someone cracks kernel sweeping revocation, a much more challenging problem than for userspace memory, the kernel's own access to virtual memory will be to pages that are almost surely parked in some capability-permissive but CRG-less PTE configuration, assuming we have one of those. If we don't, it's probably simpler to just not raise CRG faults on PTEs with their U bit clear? |
I'm certainly on-board with the far simpler, "non-user privileged pages don't ever get generation traps." With apologies to @tariqkurd-repo for dragging him through the mud, if we did agree on that one, would it be better to modify this pull request into that change to keep the history and discussion? Or start a new one? (Such a modification would basically be reverting most/all of it and adding a note somewhere. I guess.) |
Two big changes here in the last commit:
and so Zcheripte includes CRG but not CW Overall, this PR:
|
|
||
If {cheri_pte_ext_name} is _not_ implemented then PTE.CRG and <<sstatusreg_pte,sstatus>>.UCRG are considered to exist but to be read-only-zero for the purpose of the specification, however a CHERI-aware hart running a VM-enabled OS is strongly recommended to support {cheri_pte_ext_name}. | ||
NOTE: The description below assumes that {cheri_pte_ext_name} has been implemented. | ||
If that is _not_ the case then PTE.CRG and <<sstatusreg_pte,sstatus>>.UCRG should be taken as read-only-zero for purpose of the description in the remainder of this chapter only. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is ok now in that what it says is what we want to be true, but it would be better to split the description up into first CW on its own and then in some Svucrg1 what CRG adds on top, which is then text you can just not read if you don't implement that. But if you want to defer that to at least get the spec correct first that's ok.
Footnotes
-
A more RISC-V-y name for what is currently called Zcheripte, which is definitely not a correct name for the thing given it (a) is a privileged spec extension (so should be Sfoo) (b) pertains to virtual memory (so should be Svfoo) ↩
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree that it would be better to split the descriptions but I can't do that in the near term, so I think this will have to do for now.
|=== | ||
|
||
^1^ This is the current privilege mode, not the effective mode of the access and so is not affected by <<sstatusreg_pte,sstatus>>.SUM. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think sstatus.SUM
has anything to do with effective privilege mode, which regards mstatus.prv
and mstatus.mpp
. So I think this needs to be rephrased.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If mstatus.MPRV=1 then all data memory accesses are performed as if in mode mstatus.MPP. So if mstatus.MPP=01 (S) then sstatus.SUM applies to all M-mode's data memory accesses.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And it must be affected by SUM. It is vital that, when the kernel copies memory to/from userspace, it follows exactly the same restrictions as userspace. Otherwise you could use a system call as a gadget to copy a to-be-revoked capability from a not-yet-swept page to a swept one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is, whether UCRG is used or not must be a function solely of the U bit, not the current mode as well. (Either the access is being done from U-mode, in which case that's always going to be set, or it's not, in which case the effective mode must be S-mode, whether in S-mode or in M-mode via mstatus.MPRV, and for it not to be faulting unconditionally already in the base privileged spec sstatus.SUM must also be set)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, you are right about the SUM bit for sure, so the " not affected by <<sstatusreg_pte,sstatus>>.SUM" is wrong. I think this is the first time we have page bits that only apply in user mode I think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For RISC-V I believe so, yes
Fixes #490
@jonwoodruff @nwf @jamie-melling @tomaird