
Architecture-dependent Page Table Entry (PTE) flags for x86_64 and aarch64 #699

Merged: 11 commits into theseus-os:theseus_main from arch_generic_pte_flags, Nov 25, 2022

Conversation

@kevinaboos (Member) commented Nov 19, 2022

For purposes of comparison with #696

Remaining TODO: add #[test] cases for the four conversion routines. Done

@NathanRoyer (Member) left a comment


Looks good to me, other than the two other comments I left on specific lines.

Not so important: PteFlags has the flags that best match x86_64 & aarch64; if we want to support RISC-V in the future, would we change it to the set of flags common to aarch64, x86_64 & RISC-V? I guess we'll see.
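To make that concrete, here is a rough, purely illustrative view of how the "common" flags line up on the two architectures. Bit positions come from the respective architecture manuals; the constant names are mine, not necessarily the ones the crate uses:

```rust
// Hypothetical illustration only; not the crate's actual definitions.

// x86_64 page-table entry bits:
const X86_PRESENT:  u64 = 1 << 0;
const X86_WRITABLE: u64 = 1 << 1;
const X86_ACCESSED: u64 = 1 << 5;
const X86_DIRTY:    u64 = 1 << 6;
const X86_NO_EXEC:  u64 = 1 << 63;

// aarch64 (VMSAv8-64) page descriptor bits:
const A64_VALID:     u64 = 1 << 0;
const A64_PAGE_DESC: u64 = 1 << 1;  // page (vs. block) descriptor
const A64_READ_ONLY: u64 = 1 << 7;  // AP[2]: note the inverted sense vs. x86 WRITABLE
const A64_ACCESSED:  u64 = 1 << 10; // AF
const A64_DBM:       u64 = 1 << 51; // dirty-bit modifier
const A64_PXN:       u64 = 1 << 53; // "not executable" is split into PXN/UXN
const A64_UXN:       u64 = 1 << 54;
```

The inverted read-only sense and the split execute-never bits are exactly why the arch-generic type needs explicit conversion routines rather than a shared bit layout.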

Review comments were left on kernel/pte_flags/src/pte_flags_aarch64.rs (outdated, resolved) and kernel/pte_flags/src/lib.rs (resolved).
kevinaboos marked this pull request as ready for review on November 22, 2022, 02:33.
@kevinaboos (Member, Author) commented Nov 22, 2022

ok this should be completely ready now. @NathanRoyer please re-review and also see how I did the cool cfg stuff with docs

@NathanRoyer (Member) commented Nov 22, 2022

ok this should be completely ready now. @NathanRoyer please re-review

Your solution for ACCESSED will work. However, regarding PAGE_DESCRIPTOR, I'm not sure I understand why we shouldn't set it by default like ACCESSED, since it's also a requirement for things to work... Without PAGE_DESCRIPTOR, all the other flags will be interpreted as if they were part of a block descriptor.

edit: we solved this during our meeting
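For background on why PAGE_DESCRIPTOR matters here: on aarch64, bit 1 of a valid level-3 descriptor is what distinguishes a page descriptor (bits[1:0] = 0b11) from a block descriptor. A minimal sketch of a default flag set that includes it alongside VALID and ACCESSED; the names and the grouping are illustrative, not necessarily how the crate ended up structuring this:

```rust
// Hypothetical default for aarch64 4-KiB page mappings.
const VALID:           u64 = 1 << 0;
const PAGE_DESCRIPTOR: u64 = 1 << 1;  // bits[1:0] = 0b11 => page descriptor at level 3
const ACCESSED:        u64 = 1 << 10; // AF: avoids an Access Flag fault on first use

/// Flags that essentially every 4-KiB page mapping needs on aarch64.
const DEFAULT_PAGE_FLAGS: u64 = VALID | PAGE_DESCRIPTOR | ACCESSED;
```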

also see how I did the cool cfg stuff with docs

Very nice, didn't know about this trick
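The exact mechanism isn't shown in this conversation, but the usual way to get both arch-specific modules to appear in rustdoc output is to gate them on `doc` as well as on the target architecture. A sketch of that pattern (in the real crate these would be separate files, not inline modules):

```rust
// Compile each arch-specific module for its own target architecture,
// but also whenever rustdoc is building documentation, so that both
// appear in the generated docs regardless of the host architecture.
#[cfg(any(target_arch = "x86_64", doc))]
mod pte_flags_x86_64 {
    /// x86_64-specific PTE flags (contents elided in this sketch).
    pub struct PteFlagsX86_64(pub u64);
}

#[cfg(any(target_arch = "aarch64", doc))]
mod pte_flags_aarch64 {
    /// aarch64-specific PTE flags (contents elided in this sketch).
    pub struct PteFlagsAarch64(pub u64);
}
```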

@NathanRoyer (Member)

Also, should we add a dummy PteFlags::is_huge(self) -> bool method? One that always returns false?
(For the moment I commented out the calls to that method in memory.)

@NathanRoyer (Member)

By manually setting the bit, the descriptor was accepted into the TLB and I encountered no page fault. Maybe this was because of my configuration of translation control registers though? I'll check. It could also be qemu not implementing FEAT_HAFDBS.

Something I forgot: TCR_EL1 (the translation control register) has two flags that control the behaviour of the CPU regarding the ACCESSED and DIRTY flags: HA (for ACCESSED) and HD (for DIRTY). However, even with these enabled, qemu triggers an access fault when the bit is initially cleared.

Regarding the output address size: I was using 48-bit OA mode in my experimental code; I assumed we could only address up to 48 bits (4 levels * 9 bits + 12 bits).

However, one of the following configurations can actually be selected (using TCR_ELn::IPS and TCR_ELn::DS - weirdly, the latter is not exposed by cortex_a... I assumed the DS field was just not present in TCR_EL1):

[Two tables from the Arm ARM (DDI 0487) showing the selectable output address sizes and the corresponding descriptor bit layouts for 48-bit vs. 52-bit OA.]

As we can see, the 52-bit output address mode is a bit more complex, as some bits of the OA are shifted into the middle of the lower attributes. This allows the dirty bit to always occupy bit 51. In 48-bit OA mode, bits 48 & 49 are reserved and bit 50 is the Guarded Page bit (its implementation is optional).

In 52-bit OA mode, the shareability attributes are stored in the TCR_EL1 register, specifically in field SH0 for TTBR0 page table walks and in field SH1 for TTBR1 page table walks. This limits us to two shareability profiles and also complicates page table management a bit, since it requires us to maintain two page tables at all times.

Apparently x86_64 long mode also supports a 52-bit OA mode; I don't know how that is configured, but maybe you know.
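For reference, a sketch of the TCR_EL1 fields mentioned above; the bit positions are taken from DDI 0487, the constant names are mine, and DS only exists with FEAT_LPA2:

```rust
// TCR_EL1 fields relevant to the discussion above (positions per DDI 0487).
// Illustrative constants only; a real kernel would read-modify-write TCR_EL1
// through a register crate or inline asm.
const TCR_EL1_IPS_MASK:  u64 = 0b111 << 32; // IPS, bits [34:32]: output address size
const TCR_EL1_IPS_48BIT: u64 = 0b101 << 32; // 48-bit OA
const TCR_EL1_IPS_52BIT: u64 = 0b110 << 32; // 52-bit OA
const TCR_EL1_HA:        u64 = 1 << 39;     // hardware update of the Access flag
const TCR_EL1_HD:        u64 = 1 << 40;     // hardware update of dirty state (requires HA)
const TCR_EL1_DS:        u64 = 1 << 59;     // 52-bit OA with 4K/16K granules (FEAT_LPA2)
```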

@kevinaboos (Member, Author)

Also, should we add a dummy PteFlags::is_huge(self) -> bool method? One that always returns false? (For the moment I commented out the calls to that method in memory.)

whoa, definitely not. That would be invalid and could lead to us completely misinterpreting what PTE values mean. For a concrete example of why: in the initial bootstrap assembly code (boot.asm), we actually do use huge pages to efficiently and easily map large regions of the very lowest and very highest addresses before jumping to Rust code. If we attempted to call translate() on one of those addresses early on, it would completely fail, because the dummy method would unconditionally treat the entry as a non-huge-page PTE.

In any case, why do something like that, which could confuse the caller? Leaving the method out also acts as a sanity check if we ever do come across a mapping that uses huge pages, or in the future when we add proper support for them.
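For what it's worth, a genuine is_huge() check is cheap on both architectures. A hedged sketch; the free-function form and names are illustrative, not the crate's API:

```rust
// x86_64: bit 7 (PS) in a PDE/PDPTE marks a 2 MiB or 1 GiB huge page.
#[cfg(target_arch = "x86_64")]
fn is_huge(pte_value: u64) -> bool {
    const HUGE_PAGE: u64 = 1 << 7;
    pte_value & HUGE_PAGE != 0
}

// aarch64: a valid descriptor at levels 1-2 with bit 1 clear is a *block*
// descriptor (the huge-page equivalent); bits[1:0] = 0b11 means table/page.
// (This check only applies at levels 1-2; 0b01 is invalid at level 3.)
#[cfg(target_arch = "aarch64")]
fn is_huge(pte_value: u64) -> bool {
    const VALID: u64 = 1 << 0;
    const PAGE_OR_TABLE: u64 = 1 << 1;
    (pte_value & VALID != 0) && (pte_value & PAGE_OR_TABLE == 0)
}
```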

@kevinaboos (Member, Author)

By manually setting the bit, the descriptor was accepted into the TLB and I encountered no page fault. Maybe this was because of my configuration of translation control registers though? I'll check. It could also be qemu not implementing FEAT_HAFDBS.

Something I forgot: TCR_EL1 (the translation control register) has two flags that control the behaviour of the CPU regarding the ACCESSED and DIRTY flags: HA (for ACCESSED) and HD (for DIRTY). However, even with these enabled, qemu triggers an access fault when the bit is initially cleared.

That's fine, it seems that those bits just allow hardware to set the access flag (as is the default behavior on x86) instead of the OS always doing it (?). Looks like the Access Flag Fault always occurs regardless of HA/HD value, so let's just always set the ACCESSED bit for now, but also implement an Access Flag Fault handler such that we know when it happens (and can handle those faults in the future if/when we support swapping pages to disk).
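In case it helps when that handler gets written: an Access Flag fault can be recognized from ESR_EL1 alone. A minimal sketch, assuming the synchronous-exception handler already has the ESR_EL1 value in hand; the function name is made up, and the field encodings are from DDI 0487:

```rust
/// Returns true if the given ESR_EL1 value describes a data abort whose
/// fault status code is an Access Flag fault (translation levels 1-3).
fn is_access_flag_fault(esr_el1: u64) -> bool {
    let ec = (esr_el1 >> 26) & 0x3F; // Exception Class, bits [31:26]
    let dfsc = esr_el1 & 0x3F;       // Data Fault Status Code, bits [5:0]
    let is_data_abort = ec == 0b100100 || ec == 0b100101; // from lower EL / same EL
    let is_access_flag = (0b001001..=0b001011).contains(&dfsc); // AF fault, levels 1-3
    is_data_abort && is_access_flag
}
```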

@kevinaboos (Member, Author)

Regarding the output address size: I was using 48-bit OA mode in my experimental code; I assumed we could only address up to 48 bits (4 levels * 9 bits + 12 bits).

48-bit is fine with me, we can use that for now and if it presents problems we can move to 52-bit.

As we can see, the 52-bit output address mode is a bit more complex as some bits of the OA are shifted to the middle of lower attributes. This allows the dirty bit to always occupy the 51st bit. In 48-bit OA mode, bits 48 & 49 are reserved and bit 50 is the Guarded Page bit - implementation is optional.

I was wondering about the DIRTY bit, that makes sense. Glad we resolved that.

In 52-bit OA mode, the shareability attributes are stored in the TCR_EL1 register, specifcially in field SH0 for TTBR0 page table walks and in field SH1 for TTBR1 page table walks. This limits us to two shareability profiles and also complexifies page table management a bit since that requires us to maintain two page tables at all time.

We'd only ever use OUTER_SHAREABLE anyway, so that's not a problem. We could just use the same page tables for both TTBRs, I think.

Apparently long mode x86_64 also supports 52-bit OA mode; I don't know how this is configured however, but maybe you know.

We don't use it, but it's much simpler on x86 -- the 52-bit physical address can pretty much always be used.

@kevinaboos (Member, Author)

@NathanRoyer I have addressed all of our comments, please double-check everything again.

Important Note: in the memory_structs crate, you will need to implement aarch64-specific canonicalization for 48-bit physical addresses, since they now differ from the 52-bit physical addresses we use on x86_64.
Also, I assume that virtual addresses on aarch64 are also canonicalized in a similar fashion to x86; if not, you'll need to handle that too.
This stuff is on L129-159 of memory_structs/src/lib.rs
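A hedged sketch of the asymmetry being described here; the bit widths are the architectural limits under discussion, but the function names are made up and do not correspond to the code in memory_structs:

```rust
// Canonical *virtual* addresses: on x86_64 (and on aarch64 with 48-bit VAs
// and address tagging disabled), bits [63:48] must replicate bit 47.
fn canonicalize_virt_addr(vaddr: usize) -> usize {
    // Sign-extend bit 47 into bits [63:48].
    ((vaddr << 16) as isize >> 16) as usize
}

// Canonical *physical* addresses: mask to the architectural width.
#[cfg(target_arch = "x86_64")]
fn canonicalize_phys_addr(paddr: usize) -> usize {
    paddr & 0x000F_FFFF_FFFF_FFFF // 52-bit physical addresses
}

#[cfg(target_arch = "aarch64")]
fn canonicalize_phys_addr(paddr: usize) -> usize {
    paddr & 0x0000_FFFF_FFFF_FFFF // 48-bit OA, per the discussion above
}
```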

@NathanRoyer (Member)

@NathanRoyer I have addressed all of our comments, please double-check everything again.

About PteFlagsAarch64::_CONTIGUOUS:

If not set, this translation table is not contiguous with the previous one in memory.

I think it's not necessarily the previous one; from the DDI0487:

The Contiguous bit identifies a descriptor as belonging to a group of adjacent translation table entries that point to a contiguous OA range.

But anyway, the nuance is small and we don't use it so I'll leave it as is.
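For completeness, a tiny sketch of that bit; the position and group size are from DDI 0487, the constant name is illustrative:

```rust
// The Contiguous hint, bit 52 of a block/page descriptor: it marks this entry
// as one of a group of adjacent entries (16 of them with a 4 KiB granule) that
// map a contiguous OA range, so the TLB may cache them as a single entry.
const CONTIGUOUS: u64 = 1 << 52;
```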

Important Note: in the memory_structs crate, you will need to implement aarch64-specific canonicalization for 48-bit physical addresses, since they now differ from the 52-bit physical addresses we use on x86_64. Also, I assume that virtual addresses on aarch64 are also canonicalized in a similar fashion to x86; if not, you'll need to handle that too. This stuff is on L129-159 of memory_structs/src/lib.rs

I responded to this on Discord: the most significant bits of an address are typically used for tagged addresses (ASID), or are ignored when that mechanism is disabled.

I will approve this, you can merge it in.

@kevinaboos (Member, Author)

About PteFlagsAarch64::_CONTIGUOUS:

If not set, this translation table is not contiguous with the previous one in memory.

I think it's not necessarily the previous one; from the DDI0487:

The Contiguous bit identifies a descriptor as belonging to a group of adjacent translation table entries that point to a contiguous OA range.

Right, thanks for pointing that out. I'll fix it.

kevinaboos merged commit 29ffb09 into theseus-os:theseus_main on Nov 25, 2022
kevinaboos deleted the arch_generic_pte_flags branch on November 25, 2022, 20:25
github-actions bot pushed a commit that referenced this pull request Nov 25, 2022
…4` (#699)

* The `pte_flags` crate will replace `entryflags` and supports both `aarch64` & `x86_64`

* There are "lower-level" (architecture-specific) PTE flag types that can be converted to and from a "higher-level" (architecture-independent) `PteFlags` type.

* Currently unused in Theseus, but will be used in an upcoming PR.

Co-authored-by: Nathan Royer <[email protected]> 29ffb09
kevinaboos added a commit that referenced this pull request Dec 5, 2022
* The `pte_flags` crate offers the following improvements:
  * Builder-style functions to set or clear each flag.
  * A unified interface for checking or setting PTE flags
    across all architectures. 
  * The ability to be as generic or as specific to an architecture
    as you like.
  * Lossless conversions from the generic `PteFlags` to the
    arch-specific `PteFlags<Arch>`, and lossy conversions from
    the arch-specific `PteFlags<Arch>` to the generic `PteFlags`.
  * See #699 and #712 for more details.

* All memory mapping functions now accept the PTE flags
  parameter as an instance of `Into<PteFlagsArch>`,
  which allows both arch-generic `PteFlags` and arch-specific
  `PteFlagsX86_64` or `PteFlagsAarch64` to be used.
github-actions bot pushed a commit that referenced this pull request Dec 5, 2022
* The `pte_flags` crate offers the following improvements:
  * Builder-style functions to set or clear each flag.
  * A unified interface for checking or setting PTE flags
    across all architectures.
  * The ability to be as generic or as specific to an architecture
    as you like.
  * Lossless conversions from the generic `PteFlags` to the
    arch-specific `PteFlags<Arch>`, and lossy conversions from
    the arch-specific `PteFlags<Arch>` to the generic `PteFlags`.
  * See #699 and #712 for more details.

* All memory mapping functions now accept the PTE flags
  parameter as an instance of `Into<PteFlagsArch>`,
  which allows both arch-generic `PteFlags` and arch-specific
  `PteFlagsX86_64` or `PteFlagsAarch64` to be used. e7847d5
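To make the commit messages above concrete, here is a rough usage sketch. The builder-method names and the map_pages signature are assumptions for illustration and may not match the real crate; the overall shape (build an arch-generic `PteFlags`, then pass anything that is `Into<PteFlagsArch>` to the mapping functions) is what the commits describe:

```rust
// Hypothetical sketch; method and function names are illustrative.
use pte_flags::{PteFlags, PteFlagsArch};

fn example_mapping() {
    // Arch-generic flags, built in a builder style.
    let flags = PteFlags::new()
        .valid(true)
        .writable(true)
        .executable(false);

    // Conversion into the arch-specific type for the current target.
    let arch_flags: PteFlagsArch = flags.into();

    // A mapping function that accepts `Into<PteFlagsArch>` can take either form.
    map_pages(0xFFFF_8000_0000_0000, arch_flags);
    map_pages(0xFFFF_8000_0020_0000, PteFlags::new().valid(true));
}

// Stand-in for the real memory-mapping API, just to show the `Into` bound.
fn map_pages<F: Into<PteFlagsArch>>(_vaddr: usize, flags: F) {
    let _arch_flags: PteFlagsArch = flags.into();
    // ... write the arch-specific bits into the page table entry ...
}
```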