The CLR fails sporadically in the Linux sandbox, likely due to a race condition in /tmp #45

Open
j3parker opened this issue Sep 14, 2019 · 0 comments
Labels
bug (Something isn't working), documentation (Improvements or additions to documentation)

Comments

j3parker (Member) commented Sep 14, 2019

If I repeatedly run this command (e.g. inside the /examples folder) on Linux:

bazel clean && bazel build ...

It fails sometimes like this:

ERROR: /home/jacob/rules_csharp/examples/diamond/BUILD:29:1: Compiling Bottom failed (Killed): dotnet failed: error executing command external/netcore-runtime-linux/dotnet external/csharp-build-tools/tasks/netcoreapp2.1/bincore/csc.dll /noconfig /unsafe- /checked- /nostdlib+ /utf8output /deterministic+ /filealign:512 /nologo ... (remaining 12 argument(s) skipped)

Use --sandbox_debug to see verbose messages from the sandbox dotnet failed: error executing command external/netcore-runtime-linux/dotnet external/csharp-build-tools/tasks/netcoreapp2.1/bincore/csc.dll /noconfig /unsafe- /checked- /nostdlib+ /utf8output /deterministic+ /filealign:512 /nologo ... (remaining 12 argument(s) skipped)

Use --sandbox_debug to see verbose messages from the sandbox
Failed to create CoreCLR, HRESULT: 0x80004005
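
Since the failure is sporadic, a small loop helps to reproduce it without re-running by hand. This is only a sketch: `repeat_until_failure` and `clean_build` are hypothetical helper names, not part of rules_csharp or Bazel, and the attempt count is arbitrary.

```shell
# Re-run a command until it fails or the attempt limit is reached.
# Usage: repeat_until_failure <max_attempts> <command...>
# (hypothetical helper, written for plain POSIX sh)
repeat_until_failure() {
  max_attempts="$1"
  shift
  attempt=1
  while [ "$attempt" -le "$max_attempts" ]; do
    if ! "$@"; then
      echo "failed on attempt ${attempt}"
      return 1
    fi
    attempt=$((attempt + 1))
  done
  echo "no failure in ${max_attempts} attempts"
}

# Hypothetical wrapper around the command from this issue.
clean_build() {
  bazel clean && bazel build ...
}

# Example (run from the /examples folder):
#   repeat_until_failure 50 clean_build
```

The loop stops at the first failing build, so the error output above stays at the end of the terminal scrollback.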

Searching for that CoreCLR error led me to https://github.com/dotnet/cli/issues/1488, which talks a bit about this happening when tmp doesn't exist or isn't writable (not the case for us), and also has this general note from a maintainer:

Getting Failed to initialize CoreCLR, HRESULT: 0x80004005 means that the initialization has failed at very early stages

Searching around for information about Bazel's Linux sandbox (which I started investigating because of the extra output from adding --sandbox_debug) and tmp, I found this issue about Java: bazelbuild/bazel#3236. In that issue they discover that the Java tools have some manner of race condition involving files in /tmp. The recommended fix is to add --sandbox_tmpfs_path=/tmp, which gives each action its own tmp dir. If I do this, the errors go away.
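
One way to apply the flag without typing it on every invocation is to put it in a bazelrc file. The sketch below assumes a `.bazelrc` in the workspace root; `ensure_bazelrc_flag` is a hypothetical helper, not something Bazel or rules_csharp provides.

```shell
# Append a build flag to a bazelrc file unless the exact line is already there.
# Usage: ensure_bazelrc_flag <bazelrc-path> <flag>
# (hypothetical helper; the file path is an assumption)
ensure_bazelrc_flag() {
  bazelrc="$1"
  line="build $2"
  touch "$bazelrc"
  # -x matches whole lines, -F treats the pattern as a fixed string
  if ! grep -qxF -- "$line" "$bazelrc"; then
    printf '%s\n' "$line" >> "$bazelrc"
  fi
}

# Example:
#   ensure_bazelrc_flag .bazelrc --sandbox_tmpfs_path=/tmp
# One-off alternative, passing the flag directly:
#   bazel clean && bazel build --sandbox_tmpfs_path=/tmp ...
```

Calling the helper twice leaves a single `build --sandbox_tmpfs_path=/tmp` line in the file.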

We should add a note to the documentation that this is currently required on Linux. I plan to dig into this a bit more and file an issue somewhere in the dotnet org, because it seems like it's also a bug on their end.

j3parker added this to the Generally usable milestone Sep 21, 2019
j3parker added the bug and documentation labels Sep 21, 2019
omsmith mentioned this issue Sep 24, 2019
j3parker added a commit to j3parker/rules_csharp that referenced this issue Sep 24, 2019
--sandbox_tmpfs_path=/tmp removes the flake we see in Linux due to what
is probably a bug in the dotnet runtime.

This is being tracked in issue Brightspace#45.
j3parker added a commit that referenced this issue Sep 24, 2019
--sandbox_tmpfs_path=/tmp removes the flake we see in Linux due to what
is probably a bug in the dotnet runtime.

This is being tracked in issue #45.