"Failed to initialize CoreCLR, HRESULT: 0x80004005" when $TMPDIR is not writeable/nonexistent #3168

Closed
jacobrillema opened this issue May 14, 2018 · 33 comments

@jacobrillema

Steps to reproduce

sudo apt-get install dotnet-sdk-2.1.300-rc1-008673
dotnet --version

sudo dotnet --version

Expected behavior

dotnet --version
2.1.300-rc1-008673

sudo dotnet --version
2.1.300-rc1-008673

Actual behavior

dotnet --version
Failed to initialize CoreCLR, HRESULT: 0x80004005

sudo dotnet --version
2.1.300-rc1-008673

Environment data

dotnet --info
Failed to initialize CoreCLR, HRESULT: 0x80004005

Host (useful for support):
  Version: 2.1.0-rc1
  Commit:  eb9bc92051

.NET Core SDKs installed:
  2.1.300-rc1-008673 [/usr/share/dotnet/sdk]

.NET Core runtimes installed:
  Microsoft.AspNetCore.All 2.1.0-rc1-final [/usr/share/dotnet/shared/Microsoft.AspNetCore.All]
  Microsoft.AspNetCore.App 2.1.0-rc1-final [/usr/share/dotnet/shared/Microsoft.AspNetCore.App]
  Microsoft.NETCore.App 2.1.0-rc1 [/usr/share/dotnet/shared/Microsoft.NETCore.App]

To install additional .NET Core runtimes or SDKs:
  https://aka.ms/dotnet-download

lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 16.04.4 LTS
Release:        16.04
Codename:       xenial

uname -a
Linux BSS-IMM 4.4.0-17134-Microsoft #48-Microsoft Fri Apr 27 18:06:00 PST 2018 x86_64 x86_64 x86_64 GNU/Linux
@danmoseley
Member

So it works fine under sudo? That's interesting. If you use strace is it possible to see what we are failing to access?

@jacobrillema
Author

jacobrillema commented May 14, 2018

I just tried installing the 2.1.200 SDK and it works fine without sudo. I ran strace again on the release candidate, and here is the area where I think it is failing.

strace -f -o logfile dotnet --version
14020 unlink("/mnt/c/Users/jrillema/AppData/Local/Temp/clr-debug-pipe-14020-209726-in") = -1 ENOENT (No such file or directory)
14020 mknod("/mnt/c/Users/jrillema/AppData/Local/Temp/clr-debug-pipe-14020-209726-in", S_IFIFO|0700) = -1 EPERM (Operation not permitted)
14020 futex(0x7fc55a725680, FUTEX_WAKE_PRIVATE, 2147483647) = 0
14020 write(2, "Failed to initialize CoreCLR, HR"..., 49) = 49

So that local temp directory is a Windows symbolic link to another drive. Could that be causing the issues? I will try and put that back to a local folder and try again.

@danmoseley
Member

@jacobrillema would it be possible to write a little C program that did just mknod("/mnt/c/Users/jrillema/AppData/Local/Temp/clr-debug-pipe-14020-209726-in", S_IFIFO|0700) ?

If you set the env variable as noted in dotnet/coreclr#15878 it should work around this.

Related discussion https://github.com/dotnet/coreclr/issues/8844

I don't see any note of EPERM + mknod on WSL issues: https://github.com/Microsoft/WSL/issues?utf8=%E2%9C%93&q=is%3Aissue+is%3Aopen+mknod

cc @mikem8361 fyi.

@jacobrillema
Author

@danmosemsft Thanks for all your feedback; the trick with the environment variable fixed it.

I will give the small C program a try and report back. I tried flipping the temp folder back to a local directory rather than a symbolic link, and I still got the same error in the strace log file.

@danmoseley
Member

Maybe try your C program against that location and also another one on your Windows disk that doesn't involve a symbolic link.

Depending on what you find you'll probably want to open a bug in https://github.com/Microsoft/WSL. Please include the repro app, info about how to set up the file system to repro it, and the relevant output of strace. Plus your versions of course.

@danmoseley
Member

@jkotas another case where I wish customers had something better than "Failed to initialize CoreCLR, HRESULT: 0x80004005", even if it's just a code indicating what phase it failed in that they can search for.

@jkotas
Member

jkotas commented May 14, 2018

That is tracked by https://github.com/dotnet/coreclr/issues/9805

@jacobrillema
Author

jacobrillema commented May 15, 2018

@danmosemsft A couple of updates here. I noticed in the strace log file that when I run the dotnet --version command with sudo, the pipe gets created in /tmp, not the user's local temp folder. Running the following mkfifo commands confirms the issue.

cd /mnt/c/Users/jrillema/AppData/Local/Temp

pwd
/mnt/c/Users/jrillema/AppData/Local/Temp

mkfifo testfifo
mkfifo: cannot create fifo 'testfifo': Operation not permitted

cd /tmp

mkfifo testfifo

ls -ltr testfifo
prw-rw-rw- 1 jrillema jrillema 0 May 15 00:33 testfifo

So the problem is that, out of the box, the new RC is broken on WSL, while the latest 2.1.200 SDK still works. I will open an issue with the WSL team, but I am concerned that this will be problematic for WSL users out of the box when the 2.1 SDK drops.

The workaround that you mentioned before still works so I will go with that for now. Please let me know if you need any more information from this end. Thanks again for all your help on this one!

@danmoseley
Member

@Anipik can you please try to repro this? You should use a clean WSL instance.

@Anipik
Contributor

Anipik commented May 15, 2018

@danmosemsft okay, I am on it

@Anipik
Contributor

Anipik commented May 15, 2018

@danmosemsft I was not able to repro it on a clean WSL.
Steps that I followed after enabling WSL on my machine:

wget -q https://packages.microsoft.com/config/ubuntu/16.04/packages-microsoft-prod.deb
sudo dpkg -i packages-microsoft-prod.deb
sudo apt-get update
sudo apt-get install dotnet-sdk-2.1.300-rc1-008673

Output

anipik@MININT-DK46LQL:~$ sudo dotnet --version
2.1.300-rc1-008673 
anipik@MININT-DK46LQL:~$ dotnet --version
2.1.300-rc1-008673

Environment Data

anipik@MININT-DK46LQL:~$ dotnet --info
.NET Core SDK (reflecting any global.json):
 Version:   2.1.300-rc1-008673
 Commit:    f5e3ddbe73

Runtime Environment:
 OS Name:     ubuntu
 OS Version:  16.04
 OS Platform: Linux
 RID:         ubuntu.16.04-x64
 Base Path:   /usr/share/dotnet/sdk/2.1.300-rc1-008673/

Host (useful for support):
  Version: 2.1.0-rc1
  Commit:  eb9bc92051

.NET Core SDKs installed:
  2.1.300-rc1-008673 [/usr/share/dotnet/sdk]

.NET Core runtimes installed:
  Microsoft.AspNetCore.All 2.1.0-rc1-final [/usr/share/dotnet/shared/Microsoft.AspNetCore.All]
  Microsoft.AspNetCore.App 2.1.0-rc1-final [/usr/share/dotnet/shared/Microsoft.AspNetCore.App]
  Microsoft.NETCore.App 2.1.0-rc1 [/usr/share/dotnet/shared/Microsoft.NETCore.App]

To install additional .NET Core runtimes or SDKs:
  https://aka.ms/dotnet-download

anipik@MININT-DK46LQL:~$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 16.04.3 LTS
Release:        16.04
Codename:       xenial

Difference

The only part that differs between the two reports is:

.NET Core SDK (reflecting any global.json):
 Version:   2.1.300-rc1-008673
 Commit:    f5e3ddbe73

@jacobrillema
Author

What Windows version are you running? Also if you run a strace on the commands is the pipe getting created in /tmp or the user's local temp folder?

Microsoft Windows [Version 10.0.17134.48]

@jkotas
Member

jkotas commented May 15, 2018

It looks like the error handling has changed in .NET Core 2.1:

dotnet --version
export TMPDIR=/nonexistent
dotnet --version

In .NET Core 2.0, the bogus TMPDIR was silently ignored.
In .NET Core 2.1, the bogus TMPDIR causes hard failure.

@jacobrillema
Author

Ah that is it. I have the following.

dotnet --version
Failed to initialize CoreCLR, HRESULT: 0x80004005

echo $TMPDIR
/mnt/c/Users/jrillema/AppData/Local/Temp

export TMPDIR=/tmp
dotnet --version
2.1.300-rc1-008673

If I change it to /tmp it works! Nice catch @jkotas, and thanks for the feedback. I also tried this without the COMPlus_EnableDiagnostics opt-out and it still works.

Not sure how we should handle that or how to document it. I guess the ownership of the issue is with the WSL team? I will update their issue to indicate why this showed up now with the .NET Core SDK.

Thanks everybody for your help on this one!

@jkotas
Member

jkotas commented May 15, 2018

@mikem8361 Was the change from silently ignoring the failure to create debug pipes to hard error intentional?

@danmoseley
Member

@jacobrillema can you clarify: in your repro case you have $TMPDIR set to /mnt/c/Users/jrillema/AppData/Local/Temp. Did that not exist? I see higher up that you cd'd to it. If it does exist, then that suggests that a bad $TMPDIR was not the cause in your case?

Or is this the case

  • mknod in WSL against a Windows folder fails and possibly has always failed
  • your TMPDIR was pointed to the Windows folder
  • recently, we began failing if the mknod failed against TMPDIR

?

@mikem8361
Member

I'm late to this thread, but I can confirm that we honor the TMPDIR environment variable in 2.1, but not in 2.0 and below. The runtime defaults to /tmp if the env var doesn't exist. We assume that any file, pipe, etc. can be created (mkfifo) in the temp dir. This change was made in commit hash #32008c90.

It looks like we have always failed initializing if the debugger transport fails (if the pipe creation fails on Linux). @jkotas, unless I missed something in the history, coreclr never silently failed when creating the debugger pipes.
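
The behavior described here (use $TMPDIR when it is set, fall back to /tmp, and assume a FIFO can be created there) can be checked by hand before launching dotnet. The following is only a minimal shell sketch under those assumptions, not the runtime's actual logic; the function name check_clr_tmpdir and the probe file name are made up for illustration, and an empty TMPDIR is treated the same as an unset one.

```shell
# Hypothetical pre-flight check: can a FIFO be created in the directory
# that would be used for the debugger pipes?
# Assumption: that directory is $TMPDIR when set and non-empty, else /tmp.
check_clr_tmpdir() {
    dir="${TMPDIR:-/tmp}"
    probe="$dir/fifo-probe-$$"

    if [ ! -d "$dir" ]; then
        echo "temp dir '$dir' does not exist" >&2
        return 1
    fi
    # mkfifo is the same kind of operation that failed with EPERM in the strace log
    if ! mkfifo "$probe" 2>/dev/null; then
        echo "cannot create a FIFO in '$dir'" >&2
        return 1
    fi
    rm -f "$probe"
    echo "FIFO creation works in '$dir'"
}
```

Running check_clr_tmpdir in the same shell, and as the same user, that launches dotnet should report a TMPDIR on a non-metadata drvfs mount as "cannot create a FIFO".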

@jacobrillema
Author

@danmosemsft

I had the following in my .zshrc file to help with interop for launching a git diff session using Beyond Compare from WSL. The directory did exist (and was a Windows symlink) and was accessible from WSL.

export TMPDIR='/mnt/c/Users/jrillema/AppData/Local/Temp'
alias gdiff='git difftool -y --dir-diff --no-symlinks'
alias bcomp="BComp.exe"

Here is part of my .gitconfig file.

[diff]
tool = bc4
[difftool]
prompt = false
[difftool "bc4"]
cmd = \"BComp.exe\" -expandall \"`echo $REMOTE | sed 's_/mnt/c_C:_'`\" \"`echo $LOCAL | sed 's_/mnt/c_C:_'`\"
[merge]
tool = bc4
[mergetool]
prompt = false;
[mergetool "bc4"]
cmd = \"BComp.exe\" \"$REMOTE\" \"$LOCAL\" \"$BASE\" \"$MERGED\"

I followed the advice at https://www.sep.com/sep-blog/2017/06/07/20170607wsl-git-and-beyond-compare/

git difftool generates a set of left and right files in a /tmp for use as $LOCAL and $REMOTE in the difftool cmd. We can override the location of /tmp by setting the TMPDIR environment variable. We want to see those files from the Windows file system, so we use the AppData Local Temp folder (accessible via /mnt).

This is correct:

  • mknod in WSL against a Windows folder fails and possibly has always failed
  • your TMPDIR was pointed to the Windows folder
  • recently, we began failing if the mknod failed against TMPDIR

@jacobrillema
Author

jacobrillema commented May 16, 2018

FYI I copied this conversation over from the WSL issue created.

That metadata blog post did specify special files like pipes would be supported. So I went back through to see how I was mounting my /mnt/c drive and sure enough I did not opt in to the new metadata drvfs approach. I created a /etc/wsl.conf with the following:

[automount]
enabled = true
options = "metadata,uid=1000,gid=1000,umask=22,fmask=111"
mountFsTab = true

Note I have mountFsTab set so I can access my SD card. After enabling the metadata option and keeping my original TMPDIR setting, I was able to get past the critical error from the .NET SDK. See my results below:

dotnet --version
2.1.300-rc1-008673

mount -l | grep "C:"
C: on /mnt/c type drvfs (rw,noatime,uid=1000,gid=1000,umask=22,fmask=111,metadata)

echo $TMPDIR
/mnt/c/Users/jrillema/AppData/Local/Temp

cd $TMPDIR
pwd
/mnt/c/Users/jrillema/AppData/Local/Temp

mkfifo testfifo
ls -tlr testfifo
prw-rw-rw- 1 jrillema jrillema 0 May 16 00:37 testfifo

So I was able to get this working by opting into the new drvfs metadata options. Here is the link and search for special files. The whole reason I went down this path was to accommodate running a Beyond Compare diff session from inside WSL.

I appreciate everybody's help on this one. I will play around a bit more with testing this the next few days to make sure it is working. I would suspect we might be able to close this out here if it works. I am not sure how new WSL installations are handled with the new metadata by default but I am working with an older WSL installation.
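
As a quick way to tell whether a drvfs mount already carries the metadata option, a line like the mount -l output above can be inspected with a small shell function. This is a sketch only: the name has_metadata_option is hypothetical, and it assumes the options appear as a comma-separated list in parentheses, as in the output shown in this comment.

```shell
# Hypothetical helper: succeed if one line of `mount -l` output lists the
# "metadata" option inside its parenthesised option list.
has_metadata_option() {
    case "$1" in
        *"("*) ;;              # line has an option list
        *) return 1 ;;         # no option list at all
    esac
    opts=${1##*\(}             # keep the text after the last "("
    opts=${opts%%\)*}          # drop the ")" and anything after it
    case ",$opts," in
        *,metadata,*) return 0 ;;
        *) return 1 ;;
    esac
}
```

For example, has_metadata_option "$(mount -l | grep 'C:')" should succeed for the mount line shown above and fail for a mount without the metadata option.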

@danmoseley
Member

@jkotas do you believe it is worth making the CLR start when $TMPDIR is bad? I don't like inexplicable startup failures, and this seems to be the only point during startup that is affected.
I don't think the priority is high, though.

@jkotas
Member

jkotas commented May 16, 2018

Yes, I think it would be worth it. Or at least make it easier to diagnose.

@danmoseley danmoseley changed the title from "Failed to initialize CoreCLR, HRESULT: 0x80004005 (WSL + Ubuntu 16.04 LTS) Will work with sudo" to "'Failed to initialize CoreCLR, HRESULT: 0x80004005' when $TMPDIR is not writeable/nonexistent" May 16, 2018
@bdaniel7

Hi,

I'm trying to run a TeamCity agent in a container.
This is the environment:

The TeamCity and the agent are the latest from the container (2018.1.3 (build 58658))
The container is run in Docker for Windows, using Linux containers (because the production will run on Ubuntu).

root@teamcity-agent3:/opt/buildagent/work/70bf5e8b1cc61bbe/src# dotnet --info
.NET Core SDK (reflecting any global.json):
Version: 2.1.401
Commit: 91b1c13032

Runtime Environment:
OS Name: ubuntu
OS Version: 16.04
OS Platform: Linux
RID: ubuntu.16.04-x64
Base Path: /usr/share/dotnet/sdk/2.1.401/

Host (useful for support):
Version: 2.1.3
Commit: 124038c13e

.NET Core SDKs installed:
2.1.401 [/usr/share/dotnet/sdk]

.NET Core runtimes installed:
Microsoft.AspNetCore.All 2.1.3 [/usr/share/dotnet/shared/Microsoft.AspNetCore.All]
Microsoft.AspNetCore.App 2.1.3 [/usr/share/dotnet/shared/Microsoft.AspNetCore.App]
Microsoft.NETCore.App 2.1.3 [/usr/share/dotnet/shared/Microsoft.NETCore.App]

To install additional .NET Core runtimes or SDKs:
https://aka.ms/dotnet-download

When I run the build using

/opt/buildagent/work/70bf5e8b1cc61bbe/src# /usr/bin/dotnet build trains.sln --framework netcoreapp2.1 --configuration Release --runtime ubuntu-x64

in the container, the build is successful.

However when I run the same command as part of a build step, I get this error:

Failed to initialize CoreCLR, HRESULT: 0x80004005

Any ideas on why this error occurs?

@mikem8361
Member

You could try disabling debugging with export COMPlus_EnableDiagnostics=0. This disables some pipes that are created in /tmp.

https://github.com/dotnet/coreclr/blob/master/Documentation/building/debugging-instructions.md#disabling-managed-attachdebugging
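
An exported variable is only seen by processes started from the shell (and user) that exported it, so one way to be sure the setting reaches dotnet is to pass it on the command that launches it. A minimal sketch; the wrapper name run_without_diagnostics is made up for illustration:

```shell
# Hypothetical wrapper: run a command with COMPlus_EnableDiagnostics=0 set
# only for that command, leaving the calling shell's environment untouched.
run_without_diagnostics() {
    env COMPlus_EnableDiagnostics=0 "$@"
}
```

For example, run_without_diagnostics dotnet --version starts dotnet with the variable set regardless of what the surrounding shell has exported.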

@bdaniel7

Thanks,

Will the setting take effect immediately, or do I have to reboot the container (or log out and log in)?

@mikem8361
Member

mikem8361 commented Oct 18, 2018 via email

@bdaniel7

Unfortunately, it has no effect.
I already tried setting that variable.

When I log in the container and su buildagent I can use the dotnet commands successfully. But the build step from TeamCity still fails.

@bdaniel7

bdaniel7 commented Oct 19, 2018

Actually, it works.
Previously I set the variable in the context of the root, and not in the context of the buildagent user.

What a dummy I am...

@vitek-karas vitek-karas removed their assignment Jun 3, 2019
@lpereira
Contributor

lpereira commented Jun 5, 2019

By default, $TMPDIR is empty on WSL (at least in the default profile used by bash in Ubuntu 18.04.1), so dotnet should default to /tmp, which is on a lxfs volume -- thus supporting pipes, FIFOs, and other Unixy things.

Setting $TMPDIR to anything under a drvfs will make dotnet fail with a less than helpful error message (drvfs merely proxies Windows volumes to the WSL world, so I don't think this will be supported even under WSL2 -- don't quote me on that, I haven't tested it).

Maybe this should be fixed in the documentation instead?

lpereira referenced this issue in lpereira/core-setup Jun 5, 2019
While not an actual fix for #4149, this adds a little bit of robustness
when trying to determine the path to a temporary directory.
lpereira referenced this issue in lpereira/core-setup Jun 5, 2019
While not an actual fix for #4149, this adds a little bit of robustness
when trying to determine the path to a temporary directory.

(is_directory() isn't called for /var/tmp and /tmp because the full
path qualifier, including the trailing slash, are also passed to
realpath(), which should take care of this check.)
@jeffschwMSFT
Member

Closing in favor of the PRs to help make this experience better: dotnet/core-setup#6692 and microsoft/WSL#4038

@msftgits msftgits transferred this issue from dotnet/core-setup Jan 30, 2020
@msftgits msftgits added this to the 3.0 milestone Jan 30, 2020
zakarybk added a commit to zakarybk/api that referenced this issue Apr 5, 2020
@Nikita-T86

Nikita-T86 commented Oct 15, 2020

Just for the reference - I spent a couple of hours trying to figure out why I have error "Failed to create CoreCLR, HRESULT: 0x80004005" when running dotnet app in my docker container. The reason was that I mounted my host volume to /tmp dir in container, which is apparently used by dotnet runtime.

@Olivier-Levasseur

Just for the reference - I spent a couple of hours trying to figure out why I have error "Failed to create CoreCLR, HRESULT: 0x80004005" when running dotnet app in my docker container. The reason was that I mounted my host volume to /tmp dir in container, which is apparently used by dotnet runtime.

so what we have to do to fix the error?

@naveko

naveko commented Nov 18, 2020

Also just for reference: I got exactly the same error when trying to build a Docker container, purely for lack of disk space. Ran a 'docker system prune' and a 'docker volume prune' and the error was gone.

@Nikita-T86

so what we have to do to fix the error?

Just mount your host volume to some non-existent directory in Docker, like /build, /app, /whatever
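
That advice can be turned into a small guard that is run before docker run. The following is a sketch, assuming the bind mount is written in the host_path:container_path[:options] form accepted by -v and that the host path contains no colon; the function name mount_avoids_tmp is hypothetical.

```shell
# Hypothetical guard: fail if a `docker run -v` bind-mount spec of the form
# host_path:container_path[:options] targets /tmp inside the container.
mount_avoids_tmp() {
    rest=${1#*:}           # drop the host path
    target=${rest%%:*}     # drop any trailing :options
    case "$target" in
        /tmp|/tmp/) return 1 ;;   # would replace the container's /tmp
        *) return 0 ;;
    esac
}
```

With this, mount_avoids_tmp "$PWD:/build" succeeds, while mount_avoids_tmp "$PWD:/tmp" fails and signals that a different container path should be chosen.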

@ghost ghost locked as resolved and limited conversation to collaborators Dec 19, 2020