RFC: CFI Improvements with PAuth and BTI #17
@akirilov-arm I'm really excited about the prospect of having working pointer auth and CFI -- thank you for this effort in that direction!
A few thoughts:
- It might be good to add a little more detail here to describe proposed changes to the generated code. In particular more detail on the pointer-auth would be helpful, just to document our approach if nothing else.
- Do either of these features (especially pointer-auth) affect the ABI (I don't actually know)? If so, do we need to somehow declare a new ABI variant? I guess this also intersects with the question of how CFI-enabled and non-CFI-enabled code interact within a single process; describing how this looks (JIT code is enhanced as described here but runtime is not built with these features) would be helpful too.
@cfallin Thanks for the feedback! Concerning your general questions:
Sure, I can use a simple CLIF function that just calls another subroutine and returns immediately after that as an example, and present the generated code before and after the CFI enhancements. In the meantime, the article that is mentioned in the text actually provides that level of detail, to help with the discussion before I update the proposal.
The scheme that I have described in the proposal should be mostly backward compatible; that is, any part of the program should be able to remain oblivious to whether the code it interacts with uses PAuth and/or BTI. The major exceptions are unwinders, but at least the system ones should be prepared to deal with PAuth (assuming that the software environment the application runs in is relatively recent and that the suggested DWARF changes are implemented), as bytecodealliance/wasmtime#3183 has demonstrated in an unpleasant way. Of course, I can add a short paragraph to the proposal that summarizes the backward compatibility aspect. However, it is possible to devise more sophisticated CFI schemes that require the introduction of specialized ABIs and that have a potentially higher performance overhead - for example, Apple's own variant. IMHO we should start with something simple that will allow us to work out the interactions with the rest of the system, while providing a reasonable improvement to CFI.
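To make the "before and after" comparison mentioned above more concrete, here is a minimal sketch in Rust that prints the kind of AArch64 code one might expect for a function that only calls another subroutine and returns. The helper `call_and_return` is hypothetical and is not part of Cranelift; the exact frame layout is an assumption, and only the placement of `paciasp` and `autiasp` around the frame setup and teardown is the point.

```rust
/// Hypothetical helper (not a Cranelift API): returns AArch64 assembly text
/// for a function that only calls `callee` and then returns.
/// The frame layout is an illustrative assumption.
fn call_and_return(callee: &str, sign_return_address: bool) -> Vec<String> {
    let mut asm = Vec::new();
    if sign_return_address {
        // Sign the return address in x30 (LR) with the A key,
        // using the stack pointer as the modifier.
        asm.push("paciasp".to_string());
    }
    // Save the frame pointer and the (possibly signed) return address.
    asm.push("stp x29, x30, [sp, #-16]!".to_string());
    asm.push("mov x29, sp".to_string());
    asm.push(format!("bl {}", callee));
    // Restore the frame pointer and the return address.
    asm.push("ldp x29, x30, [sp], #16".to_string());
    if sign_return_address {
        // Authenticate the return address before it is used by `ret`.
        asm.push("autiasp".to_string());
    }
    asm.push("ret".to_string());
    asm
}
```

Calling `call_and_return("callee", true)` yields the same body as the unprotected variant, with one extra instruction at the start and one before `ret`, which is why callers and callees do not need to know whether the other side uses PAuth.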
It seems like supporting pointer authentication and BTI is largely an implementation detail of the aarch64 backend, and doesn't require new clif instructions or larger compiler/runtime changes (other than support for configuring aarch64-specific codegen options). Is my understanding correct here?
Two questions:
LLVM has an option to generate pointer authentication code for leaf functions; do we want one as well?
I think @alexcrichton asked how the Rust compiler planned to use PAuth and BTI during one of the Cranelift meetings - details are in rust-lang/rust#88354; the changes seem to follow what GCC and LLVM are already doing. As for @fitzgen's questions:
Yes, it is, but the main reason I have posted an RFC is to discuss the higher-level points such as whether the proposed hardening makes sense and what the extent of the threat model should be, e.g. we may decide that a CFI mechanism that requires ABI changes, while being beyond the scope of the current proposal, is something that we should definitely consider in the future. The discussion should also inform a potential x86 implementation using CET.
Keeping in mind that PAuth can't be enabled in a granular way (e.g. with IFUNC or a similar mechanism), but it has to be used globally in all functions in order to be effective, the main utility of the However, if we consider the
AFAIK there isn't a convention per se and applications are free to use both keys, but most software is probably going to use the A key because it is the default for static compilers such as GCC. However, one option that has been discussed in the context of programs that use JIT compilation (e.g. dynamic language runtimes) is to use the B key for generated functions and the A key for statically compiled code. Also, the same interoperability considerations with the Rust compiler apply here.
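The key choice discussed above could be modelled roughly as follows. This is a sketch under stated assumptions: the `Key` enum and the `key_for` policy (B key for generated code, A key for statically compiled code) are hypothetical names that only illustrate the option mentioned in the discussion, not a decision. The instruction encodings are the architectural ones: `PACIASP`, `PACIBSP`, `AUTIASP` and `AUTIBSP` all live in the `HINT` space, so they execute as NOPs on processors without PAuth.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Key {
    A,
    B,
}

/// Encoding of `HINT #imm`: the NOP encoding (HINT #0) with the
/// immediate placed in bits [11:5].
fn hint(imm: u32) -> u32 {
    0xd503_201f | (imm << 5)
}

/// Encoding of the instruction that signs LR with SP as the modifier.
fn sign_lr(key: Key) -> u32 {
    match key {
        Key::A => hint(25), // PACIASP
        Key::B => hint(27), // PACIBSP
    }
}

/// Encoding of the instruction that authenticates LR with SP as the modifier.
fn auth_lr(key: Key) -> u32 {
    match key {
        Key::A => hint(29), // AUTIASP
        Key::B => hint(31), // AUTIBSP
    }
}

/// Hypothetical policy: B key for JIT-generated functions,
/// A key for statically compiled code.
fn key_for(is_jit_code: bool) -> Key {
    if is_jit_code {
        Key::B
    } else {
        Key::A
    }
}
```

A setting of this shape would also be the natural place to express a platform-specific requirement such as "use the B key for signing return addresses".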
I suppose you mean all leaf functions as opposed to those that are not affected by the optimization introduced by bytecodealliance/wasmtime#2960, which is what the text currently proposes (does LLVM make that distinction)? Again, that would mostly be about being able to accommodate whatever the Rust compiler decides to do.
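The distinction between "all functions" and "only functions that save the return address" can be sketched as a small policy function. Everything here is hypothetical naming (`SignScope`, `FunctionInfo`, `should_sign`); it loosely mirrors the `pac-ret` versus `pac-ret+leaf` choice that static compilers expose through `-mbranch-protection`, and it is not a description of what Cranelift implements.

```rust
/// Hypothetical setting: which functions get their return address signed.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum SignScope {
    /// Only functions that spill the return address to the stack.
    NonLeaf,
    /// Every function, including leaf functions that keep LR in a register.
    All,
}

/// Hypothetical summary of the one property the policy needs.
struct FunctionInfo {
    saves_lr_to_stack: bool,
}

fn should_sign(scope: SignScope, func: &FunctionInfo) -> bool {
    match scope {
        SignScope::All => true,
        // A return address that never leaves the register cannot be
        // overwritten through a memory corruption bug, so it can be skipped.
        SignScope::NonLeaf => func.saves_lr_to_stack,
    }
}
```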
I wonder if this might also be relevant for scenarios where |
Some relevant documentation about the choice of using the A or B keys for signing pointers, according to LLVM's (maybe Apple's LLVM fork in particular) documentation of pointer authentication: https://github.com/apple/llvm-project/blob/next/clang/docs/PointerAuthentication.rst#key-assignments In particular, I can confirm that the B key is used for signing return addresses on Mac M1. If I understand correctly, for the particular case of the return address at least, the selected key has to match what the host system has chosen too. When unwinding the stack on Mac M1 (when creating a backtrace for an error, for instance), the system's libunwind tried to authenticate all return addresses with the B key, including return addresses that had been stored in registers/stack by Cranelift-generated code. Does that sound accurate? If so, would that require having low-level settings that control which key is used in which particular context (e.g. "use the B key for signing return addresses"), or is it something that can be configured during unwinding through the use of CFI directives?
To be precise - this concerns However, thanks for pointing that out because it means that we may need to do something different when targeting macOS. I have to admit that my previous answer (and, indeed, the proposal itself) has been written mostly with Linux in mind.
Yes, it makes sense.
Well, considering the Linux/DWARF case, the key could actually be specified in the CIE via a CIE augmentation string.
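As a rough illustration of the CIE approach: in the AArch64 DWARF conventions, the character `B` in the CIE augmentation string indicates that return addresses in the covered frames are signed with the B key, and its absence implies the A key. The sketch below assumes that convention; `PacKey` and `return_address_key` are hypothetical names, and a real unwinder would parse the augmentation string together with its associated augmentation data rather than just scanning for a character.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum PacKey {
    A,
    B,
}

/// Hypothetical helper: picks the key an unwinder should use to
/// authenticate return addresses, based on the CIE augmentation string.
fn return_address_key(augmentation: &str) -> PacKey {
    if augmentation.contains('B') {
        PacKey::B
    } else {
        PacKey::A
    }
}
```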
While I was writing the proposal, my assumption was that even if Wasmtime compiled a Wasm module ahead-of-time, the machine on which the generated code executed would usually have the same (or better) capabilities than those enabled during compilation. How important is the opposite use case? @cfallin also expressed some reservations about the overhead of the additional instructions, so machines without PAuth support would probably rather avoid them. Also, this discussion might be relevant in the context of forward-edge CFI using Intel CET, I believe. P.S. What we have been talking about so far is essentially a forward compatibility issue with PAuth, but after experimenting with BTI I have realized that there is another, backward compatibility problem - imagine that a Wasm module is compiled ahead-of-time without any BTI support, so the generated code will not contain any BTI instructions, and then a machine that does support BTI tries to execute the code. The latter will mark the executable memory pages as containing BTI instructions, which will cause crashes because all indirect branches will fault due to the missing instructions. The solution to this issue for static compilers involves a couple of ELF extensions, so we could use something similar for the |
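The BTI backward compatibility problem described above suggests that the loader needs to know whether a module was compiled with BTI before it marks pages as guarded. A minimal sketch of that decision follows, assuming a hypothetical per-module flag (`module_uses_bti`) recorded at compile time. The constants are the Linux arm64 values for `mmap`/`mprotect`; the function name and the idea of storing the flag in module metadata are assumptions, not existing Wasmtime behaviour.

```rust
// Linux arm64 protection flags.
const PROT_READ: u32 = 0x1;
const PROT_EXEC: u32 = 0x4;
const PROT_BTI: u32 = 0x10;

/// Hypothetical helper: computes the protection flags for executable
/// pages holding compiled code.
fn code_page_protection(host_supports_bti: bool, module_uses_bti: bool) -> u32 {
    let mut prot = PROT_READ | PROT_EXEC;
    // Only request guarded pages when the code actually contains BTI
    // landing pads; otherwise every indirect branch would fault.
    if host_supports_bti && module_uses_bti {
        prot |= PROT_BTI;
    }
    prot
}
```

With this shape, a module compiled without BTI still runs on a BTI-capable machine; it simply does not get the forward-edge protection.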
During the last Cranelift project meeting @alexcrichton mentioned that Wasmtime would error out if it tried to load an AOT compiled module that was built with different settings than the ones used by the current environment, which means that supporting the |
Improve control flow integrity for compiled WebAssembly code by utilizing two technologies from the Arm instruction set architecture - Pointer Authentication and Branch Target Identification. Copyright (c) 2021, Arm Limited.
I have updated the proposal text to align it with the equivalent work on the Rust compiler; also, the changes should address the comments from the discussion above. There is a suggestion that @abrown made elsewhere - since BTI and the |
Motion to finalize with a disposition to merge
As discussed during the last Cranelift biweekly meeting, I believe that all the comments on the text so far have been addressed, and there hasn't been any strong opposition, so I'm proposing that we merge this RFC. Thanks everyone for the discussion and the feedback!
Stakeholders sign-off
Arm, DFINITY, Embark Studios, Fastly, Fermyon, Google / Envoy, IBM, Intel, Microsoft, Mozilla, wasmCloud, Unaffiliated
Entering Final Comment Period
Now that we have sign-offs from multiple separate groups, as per the process this RFC will move into a final comment period (FCP). The FCP will end on Monday, April 18.
The FCP has ended without any objections raised and without further discussion, so it is time to merge this RFC proposal. Thanks everyone for the discussion!
This RFC proposes to improve control flow integrity for compiled WebAssembly code by utilizing two technologies from the Arm instruction set architecture - Pointer Authentication and Branch Target Identification.