Use signals for activation injection on macOS #46657
This change moves macOS activation injection to the signal-based mechanism already used on other Unix platforms. The previous approach, which injected activations by suspending the thread and redirecting it through a helper frame, can collide with signal handlers running on the same thread and result in a corrupted stack frame. The issue can be reproduced by sending signals to the .NET process from another process while the .NET process is performing many GCs.
#ifdef TARGET_ARM64
// RtlRestoreContext assembly corrupts X16 & X17, so it cannot be
// used for Activation restore
MachSetThreadContext(context);
How is the context restored for activation using signals? How are we preventing X16/X17 corruption?
Maybe we are just returning from the signal handler and the kernel is taking care of it.
I am actually wondering if we should change the JIT to not use X17 unless it is marked unsafe for GC.
The context is restored by the kernel. We return from the signal handler and let the kernel do the job. We just copy CONTEXT to the ucontext_t passed to the signal handler before returning.
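To illustrate the shape of this mechanism, here is a minimal sketch, not the runtime's actual code. A handler installed with SA_SIGINFO receives a pointer to the interrupted thread's ucontext_t, and whatever state that structure holds when the handler returns is what the kernel restores. The names activation_handler, install_activation_handler, inject_activation_on_current_thread and demo_activation are hypothetical, and the step that copies CONTEXT into the ucontext_t is left as a comment because the register layout is platform-specific.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <string.h>

/* Flags written by the handler; sig_atomic_t is safe to set from a handler. */
static volatile sig_atomic_t g_handler_ran = 0;
static volatile sig_atomic_t g_context_seen = 0;

/* Hypothetical stand-in for the runtime's activation handler. */
static void activation_handler(int signo, siginfo_t *info, void *context)
{
    (void)signo;
    (void)info;

    /* With SA_SIGINFO the third argument points to the interrupted
     * thread's saved state as a ucontext_t. */
    ucontext_t *uc = (ucontext_t *)context;
    g_context_seen = (uc != NULL);

    /* A runtime would run its activation work here and then copy its own
     * register state into *uc. That copy is omitted in this sketch. */
    g_handler_ran = 1;

    /* Returning hands control back to the kernel, which resumes the
     * thread from the contents of *uc. No user-mode restore is needed. */
}

/* Installs the handler for SIGUSR1 (an arbitrary choice for this sketch). */
int install_activation_handler(void)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof(sa));
    sa.sa_sigaction = activation_handler;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGUSR1, &sa, NULL);
}

/* Sends the signal to the calling thread; returns 1 if the handler ran
 * and was given a context pointer, 0 otherwise. */
int inject_activation_on_current_thread(void)
{
    g_handler_ran = 0;
    g_context_seen = 0;
    if (raise(SIGUSR1) != 0)
        return 0;
    return g_handler_ran && g_context_seen;
}

/* Convenience wrapper: install the handler, then trigger it once. */
int demo_activation(void)
{
    int rc = install_activation_handler();
    assert(rc == 0);
    return inject_activation_on_current_thread();
}
```

Because the kernel performs the restore on return from the handler, user-mode restore code such as RtlRestoreContext is not on this path at all.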
LGTM
/* A type to wrap the native context type, which is ucontext_t on some
 * platforms and another type elsewhere. */
#if HAVE_UCONTEXT_T
#include <ucontext.h>
#include <sys/ucontext.h>
@janvorli, was this part of the change required for activation via signal?
As far as I can tell, all systems that have sys/ucontext.h also have a toolchain-specific ucontext.h which includes sys/ucontext.h, plus some additional defines. This is the case on macOS, Linux and illumos. However, on FreeBSD, it is sys/ucontext.h which includes machine/ucontext.h, but it was working fine there before with <ucontext.h>.
I am asking because this has been breaking the illumos build since Friday. From the logs:
2021-01-08T19:02:42.0116455Z [ 40%] Building CXX object pal/src/CMakeFiles/coreclrpal.dir/thread/context.cpp.o
2021-01-08T19:02:42.2653807Z In file included from /runtime/src/coreclr/pal/src/thread/context.cpp:25:
2021-01-08T19:02:42.2655771Z /runtime/src/coreclr/pal/src/thread/context.cpp: In function 'void CONTEXTToNativeContext(const CONTEXT*, native_context_t*)':
2021-01-08T19:02:42.2657287Z /runtime/src/coreclr/pal/src/include/pal/context.h:120:41: error: 'REG_RBP' was not declared in this scope
2021-01-08T19:02:42.2658120Z #define MCREG_Rbp(mc) ((mc).gregs[REG_RBP])
2021-01-08T19:02:42.2658595Z ^~~~~~~
This is because the register definitions are included separately, after the inclusion of sys/ucontext.h: https://github.com/illumos/illumos-gate/blob/260693/usr/src/head/ucontext.h#L35-L48.
Opened #46790 with a potential fix. If activation does not particularly require sys/ucontext.h, I can simplify my patch to make it how it was before (#include <ucontext.h> without the #elif).
Without this change, it does not compile on macOS. I am getting:
In file included from /Users/janvorli/git/runtime2/src/coreclr/pal/src/exception/signal.cpp:51:
In file included from /Users/janvorli/git/runtime2/src/coreclr/pal/src/include/pal/context.h:34:
/Users/janvorli/Downloads/Xcode-beta.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX11.0.sdk/usr/include/ucontext.h:51:2: error: The deprecated ucontext routines require _XOPEN_SOURCE to be defined
#error The deprecated ucontext routines require _XOPEN_SOURCE to be defined
^
1 error generated.
On macOS, ucontext.h contains prototypes for makecontext, swapcontext, etc., which are deprecated. sys/ucontext.h contains the definition of the actual data structure.
Maybe it would be better to go back to including <ucontext.h> and just define the _XOPEN_SOURCE symbol before including it.
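A rough sketch of that alternative, under stated assumptions: the macro value 700 and the names native_context_t (used here as a plain typedef) and native_context_size are illustrative and not taken from the PR. The feature-test macro has to be defined before the first system header is included. On macOS, defining _XOPEN_SOURCE may also hide non-POSIX declarations unless _DARWIN_C_SOURCE is defined as well, so this would need checking against the rest of the PAL.

```c
/* Must precede every system header so the feature-test macro takes effect. */
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <stddef.h>
#include <ucontext.h>

/* Hypothetical stand-in for the PAL's wrapper around the native context. */
typedef ucontext_t native_context_t;

/* Returns the size of the native context; compiling this at all shows that
 * <ucontext.h> provided a complete ucontext_t definition. */
size_t native_context_size(void)
{
    size_t size = sizeof(native_context_t);
    assert(size > 0); /* sanity check only */
    return size;
}
```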
It seems to be compiling with SDK 10.4 locally and CI is green in the PR. Let me try with SDK 11.0.