[Big Sur] open() vs SA_RESTART #48663

Closed
lambdageek opened this issue Feb 23, 2021 · 6 comments · Fixed by #56403

Comments

@lambdageek
Member

lambdageek commented Feb 23, 2021

It appears that on macOS Big Sur, open() can sometimes fail with EINTR when it is interrupted by a signal, even if the signal's handler was installed with SA_RESTART.

  • MonoVM thread suspend/resume will sometimes (when using POSIX signals instead of Mach) use suspend/resume signals whose handlers are installed with the SA_RESTART flag. This is not the default, though: we prefer to use Mach suspend/resume operations.
  • The System.Native PAL installs some signal handlers with SA_RESTART
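
For reference, installing a handler with SA_RESTART looks roughly like the sketch below. The names on_signal and install_restarting_handler are hypothetical and are not taken from MonoVM or System.Native; only sigaction() and SA_RESTART are the real POSIX pieces.

```c
#define _POSIX_C_SOURCE 200809L  /* expose sigaction() under strict -std= modes */
#include <signal.h>
#include <string.h>

/* Hypothetical no-op handler, used only for illustration. */
static void on_signal(int signo)
{
    (void)signo;
}

/* Installs on_signal for signo with SA_RESTART set.
 * Returns 0 on success, -1 on error (errno set by sigaction). */
static int install_restarting_handler(int signo)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = on_signal;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART;  /* request that interrupted system calls be restarted */
    return sigaction(signo, &sa, NULL);
}
```

With SA_RESTART set, a blocking call interrupted by this signal would normally be restarted rather than fail with EINTR, which is the expectation that open() appears to break on Big Sur.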

We should make sure that all our calls to open handle EINTR.
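
A minimal sketch of what handling EINTR could look like, assuming a small wrapper around open(). The helper name open_retry_eintr is hypothetical, and this is not the actual change made in dotnet/runtime or mono.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>

/* Hypothetical helper: calls open() and retries for as long as the call
 * fails with EINTR. Any other result (success or a different error) is
 * returned to the caller unchanged, with errno left as open() set it. */
static int open_retry_eintr(const char *path, int flags, mode_t mode)
{
    int fd;
    do {
        fd = open(path, flags, mode);
    } while (fd == -1 && errno == EINTR);
    return fd;
}
```

Call sites would then use open_retry_eintr(path, flags, mode) in place of open(path, flags, mode). The mode argument is ignored by open() unless a flag such as O_CREAT requires it, so passing 0 is fine for plain reads.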


To be clear, I haven't seen this cause a problem for .NET that can be clearly attributed to this issue. Until we catch it in a repro, this is just a placeholder issue - unless we want to be proactive about mitigating.


(This is affecting other software too. git, for example. It has affected some runtime engineers - git clone --progress https://github.com/dotnet/runtime will sometimes fail with interrupted system call errors and leave the working tree in an unusable state. The workaround is to call git clone --quiet (note that --progress is the default))


The issue seems to be exacerbated by some antivirus software, even if "real time protection" is turned off.

dotnet-issue-labeler bot added the area-PAL-coreclr and untriaged labels Feb 23, 2021
@akoeplinger
Member

Maybe related to #47584 ?

@lewing
Member

lewing commented Feb 24, 2021

Here is the workaround going into git https://lore.kernel.org/git/[email protected]/

@akoeplinger
Member

Nice. I wonder if we should at least file a radar with Apple?

lambdageek added a commit to lambdageek/mono that referenced this issue Apr 29, 2021
Related to mono#21040 and
dotnet/runtime#48663

On MacOS Big Sur open() is much more likely to throw EINTR - and moreover if
the thread receives a signal while doing an open() even if the signal handler
is marked with SA_RESTART, the open call appears not to restart and to fail
anyway.

So wrap all the calls to open() in retry loops.
lambdageek added a commit to mono/mono that referenced this issue Apr 30, 2021
Related to #21040 and
dotnet/runtime#48663

On MacOS Big Sur open() is much more likely to throw EINTR - and moreover if
the thread receives a signal while doing an open() even if the signal handler
is marked with SA_RESTART, the open call appears not to restart and to fail
anyway.

So wrap all the calls to open() in retry loops.
@mangod9
Member

mangod9 commented Jul 4, 2021

Hi @lambdageek, is this a mono specific tracking issue?

@lambdageek
Member Author

lambdageek commented Jul 6, 2021

@mangod9 No. It's a potential issue for CoreCLR, Mono, and the System.Native libraries' interop code. All calls to open() in dotnet/runtime should be audited.

mangod9 removed the untriaged label Jul 6, 2021
mangod9 added this to the 6.0.0 milestone Jul 6, 2021
@janvorli
Member

I think it would not hurt to wrap all calls to open() in a loop that checks for EINTR. I'll make a PR for the non-mono parts.

ghost added the in-pr label (there is an active PR which will close this issue when it is merged) Jul 27, 2021
ghost removed the in-pr label Jul 28, 2021
ghost locked as resolved and limited conversation to collaborators Aug 27, 2021