HOST_RUNTIME_CONTRACT has invalid value on custom .NET runtime host on Linux #97086

Closed
Tracked by #22
roflmuffin opened this issue Jan 17, 2024 · 8 comments · Fixed by #97891

@roflmuffin

Description

I am the maintainer of a project (CounterStrikeSharp) which embeds the .NET runtime into a Counter-Strike 2 game server so that script authors can modify game server code. It supports Linux and Windows on 64-bit systems and has been working fine with the .NET 7 CLR. It is worth noting that we ship the entire .NET runtime with our release builds (by extracting the ASP.NET runtime tar.gz), so the host runs completely from our own directory.

We recently tried to upgrade to .NET 8, but CoreCLR now crashes (only on Linux) when calling the hostfxr_get_runtime_delegate method (seen here).

Reproduction Steps

Reproduction is difficult because our native host requires a running CS2 server.

Trace:

#0  0x00007fffc43aae1f in ?? () from /home/steam/game/game/csgo/addons/counterstrikesharp/dotnet/shared/Microsoft.NETCore.App/8.0.1/libcoreclr.so
#1  0x00007fffc43aa878 in coreclr_initialize () from /home/steam/game/game/csgo/addons/counterstrikesharp/dotnet/shared/Microsoft.NETCore.App/8.0.1/libcoreclr.so
#2  0x00007fffdb87c0a5 in ?? () from /home/steam/game/game/csgo/addons/counterstrikesharp/dotnet/shared/Microsoft.NETCore.App/8.0.1/libhostpolicy.so
#3  0x00007fffdb8971ee in ?? () from /home/steam/game/game/csgo/addons/counterstrikesharp/dotnet/shared/Microsoft.NETCore.App/8.0.1/libhostpolicy.so
#4  0x00007fffdb8d6e98 in ?? () from /home/steam/game/game/csgo/addons/counterstrikesharp/dotnet/host/fxr/8.0.1/libhostfxr.so
#5  0x00007fffdb8d1c44 in hostfxr_get_runtime_delegate () from /home/steam/game/game/csgo/addons/counterstrikesharp/dotnet/host/fxr/8.0.1/libhostfxr.so
#6  0x00007fffc4abbee7 in CDotNetManager::Initialize() () from /home/steam/game/game/csgo/addons/counterstrikesharp/bin/linuxsteamrt64/counterstrikesharp.so

Expected behavior

.NET runtime loads and retrieves the managed function pointer successfully without crashing

Actual behavior

.NET runtime causes a segfault when trying to call hostfxr_get_runtime_delegate

Regression?

This was working correctly for us in .NET 7.0.11

Known Workarounds

No response

Configuration

Version: .NET 8.0.1
Linux: Tested on Fedora 38, multiple Linux users have reported the issue
Arch: x64

Other information

Running on Windows & Linux respectively, with COREHOST_TRACE=1 in environment variables, we see the following output before the crash:

Windows:

Property NATIVE_DLL_SEARCH_DIRECTORIES = ;G:\cs2\game\csgo\addons\counterstrikesharp\dotnet\shared\Microsoft.NETCore.App\8.0.1\;
Property PLATFORM_RESOURCE_ROOTS = ;
Property APP_CONTEXT_BASE_DIRECTORY = 
Property APP_CONTEXT_DEPS_FILES = G:\cs2\game\csgo\addons\counterstrikesharp\dotnet\shared\Microsoft.NETCore.App\8.0.1\Microsoft.NETCore.App.deps.json
Property PROBING_DIRECTORIES = 
Property RUNTIME_IDENTIFIER = win-x64
Property System.Reflection.Metadata.MetadataUpdater.IsSupported = false
Property System.Runtime.Serialization.EnableUnsafeBinaryFormatterSerialization = false
Property HOST_RUNTIME_CONTRACT = 0x285916ded88

Linux:

Property System.Reflection.Metadata.MetadataUpdater.IsSupported = false
Property RUNTIME_IDENTIFIER = linux-x64
Property HOST_RUNTIME_CONTRACT = 0x
Property System.Runtime.Serialization.EnableUnsafeBinaryFormatterSerialization = false
Property FX_DEPS_FILE = /home/michael/Steam/cs2-ds/game/csgo/addons/counterstrikesharp/dotnet/shared/Microsoft.NETCore.App/8.0.1/Microsoft.NETCore.App.deps.json
Property APP_CONTEXT_DEPS_FILES = /home/michael/Steam/cs2-ds/game/csgo/addons/counterstrikesharp/dotnet/shared/Microsoft.NETCore.App/8.0.1/Microsoft.NETCore.App.deps.json
Property APP_CONTEXT_BASE_DIRECTORY =
Property PLATFORM_RESOURCE_ROOTS = :
Property PROBING_DIRECTORIES =

The only thing worth noting is that HOST_RUNTIME_CONTRACT is set to 0x on the Linux build. After further investigation, the line that appears to be causing the crash is this line in mscoree/exports.cpp.

One of our users took the pointer value from the ptr_stream (located here) and manually set it later in startup (here). That does allow the runtime to start, though I am not sure why this value is never passed through correctly.
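To illustrate why a value such as 0x cannot be turned back into a usable pointer, here is a minimal sketch. It assumes the property value is parsed as a hexadecimal string; `naive_parse`, `checked_parse` and `check_examples` are hypothetical helpers written for this example and are not the runtime's code.

```cpp
#include <cassert>
#include <cerrno>
#include <cstdint>
#include <cstdlib>
#include <string>

// Hypothetical: parse without validating how much of the string was consumed.
std::uint64_t naive_parse(const std::string& text) {
    return std::strtoull(text.c_str(), nullptr, 16);
}

// Hypothetical: accept the value only if the whole string is a non-zero hex number.
bool checked_parse(const std::string& text, std::uint64_t& out) {
    if (text.empty()) return false;
    char* end = nullptr;
    errno = 0;
    unsigned long long value = std::strtoull(text.c_str(), &end, 16);
    if (errno != 0) return false;                           // out of range
    if (end != text.c_str() + text.size()) return false;    // trailing garbage
    if (value == 0) return false;                           // null is never valid here
    out = value;
    return true;
}

// Self-check using the property values quoted in this thread.
void check_examples() {
    std::uint64_t v = 0;
    assert(naive_parse("0x") == 0);                     // would become a null pointer
    assert(naive_parse("0x555,555,9e9,628") == 0x555);  // stops at the first ','
    assert(!checked_parse("0x", v));
    assert(!checked_parse("0x555,555,9e9,628", v));
    assert(checked_parse("0x285916ded88", v) && v == 0x285916ded88ULL);
}
```

Calling `check_examples()` from a `main` function runs the checks. With the unchecked parse, both malformed values silently produce an address that faults on the first dereference, which would be consistent with the segfault in coreclr_initialize.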

Please let me know if there is anything else we can provide for more context.

We are tracking the issue in our repo here: roflmuffin/CounterStrikeSharp#260

@ghost ghost added the untriaged label Jan 17, 2024
@ghost

ghost commented Jan 17, 2024

Tagging subscribers to this area: @vitek-karas, @agocke, @VSadov
See info in area-owners.md if you want to be subscribed.


@nothingTVatYT

We have a very similar issue with the Flax engine but I found that the property value is not 0 but a weirdly formatted address:

(lldb) p hostContractLocal->bundle_probe
error: Couldn't apply expression side effects : Couldn't dematerialize a result variable: couldn't read its memory
(lldb) p propertyIndex
(int) $19 = 1
(lldb) p propertyValuesW[propertyIndex]
(LPCWSTR) $20 = 0x00005555594bbdd0 u"0x555,555,9e9,628"
(lldb) p propertyKeys[1]
(const char *) $21 = 0x0000555559495b30 "HOST_RUNTIME_CONTRACT"

This looked a lot like a number formatted according to a locale setting, so I ran the same binary with LC_NUMERIC="" LANG="C", and indeed the offending code no longer segfaults.

Maybe the 0x you're seeing is a failed attempt to format an address and this is the same problem?

Anyway, passing memory addresses as strings is weird enough, but at the very least they shouldn't be formatted using a locale setting.
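The effect can be reproduced in isolation. The following is a minimal sketch, assuming the address is written through a C++ stream with `std::hex`; `grouping_punct`, `grouped_locale`, `format_pointer` and `check_formatting` are hypothetical names for this example, and the custom facet only stands in for a system locale that groups digits in threes.

```cpp
#include <cassert>
#include <cstdint>
#include <locale>
#include <sstream>
#include <string>

// Hypothetical facet that mimics a locale with thousands grouping.
struct grouping_punct : std::numpunct<char> {
    char do_thousands_sep() const override { return ','; }
    std::string do_grouping() const override { return "\3"; }
};

// Classic locale plus the grouping facet above.
std::locale grouped_locale() {
    return std::locale(std::locale::classic(), new grouping_punct);
}

// Format an address as "0x..." using the given stream locale.
std::string format_pointer(std::uint64_t address, const std::locale& loc) {
    std::ostringstream out;
    out.imbue(loc);
    out << "0x" << std::hex << address;
    return out.str();
}

// Self-check: grouping applies to hexadecimal output as well.
void check_formatting() {
    const std::uint64_t address = 0x5555559e9628ULL;
    assert(format_pointer(address, grouped_locale()) == "0x555,555,9e9,628");
    assert(format_pointer(address, std::locale::classic()) == "0x5555559e9628");
}
```

Imbuing the stream with `std::locale::classic()` keeps the output independent of the process locale. This is one possible way to avoid the problem and is not necessarily how the runtime addresses it.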

@elinor-fung
Member

> it shouldn't try to format it using a locale setting

Thanks for the investigation here. This was changed such that it should no longer do this (#95801). If this is the issue, we may want to backport to 8.

@roflmuffin / @nothingTVatYT would there be any way to check your scenario against a .NET 9 build from https://github.com/dotnet/installer?

@nothingTVatYT

> @roflmuffin / @nothingTVatYT would there be any way to check your scenario against a .NET 9 build from https://github.com/dotnet/installer?

You mean a build using main or should I try a certain branch?

@nothingTVatYT

It's not exactly easy to check this completely, but here is what I did:
I ran the compiled version of the Flax editor with a debug build of the dotnet 9 runtime, and although it breaks, I think it gets past the point where it breaks with dotnet 8.

So the fix might work, but I still see the string-to-uint64 pointer conversion in exports.cpp.

The stack trace I get is:

Process 1612878 stopped
* thread #1, name = 'FlaxEditor', stop reason = signal SIGTRAP
    frame #0: 0x00007fff5dc64a8d libcoreclr.so`DBG_DebugBreak at debugbreak.S:9
   6   
   7    LEAF_ENTRY DBG_DebugBreak, _TEXT
   8            int3
-> 9            ret
   10   LEAF_END_MARKED DBG_DebugBreak, _TEXT
   11  
(lldb) bt
* thread #1, name = 'FlaxEditor', stop reason = signal SIGTRAP
  * frame #0: 0x00007fff5dc64a8d libcoreclr.so`DBG_DebugBreak at debugbreak.S:9
    frame #1: 0x00007fff5dbcd3bb libcoreclr.so`::DebugBreak() at debug.cpp:406:9
    frame #2: 0x00007fff5d9da341 libcoreclr.so`CHECK::Setup(this=0x00007fffffffc308, message="Managed object size does not match unmanaged object size\nman: 0x38, unman: 0x20, Name: System.Reflection.RuntimeModule\n", condition="size == expectedsize", file="/home/me/git/dotnet9-runtime/src/coreclr/vm/binder.cpp", line=586) at check.cpp:195:9
    frame #3: 0x00007fff5d33eb85 libcoreclr.so`CoreLibBinder::Check(this=0x00007fff5dd314b8) at binder.cpp:584:13
    frame #4: 0x00007fff5d304051 libcoreclr.so`SystemDomain::LoadBaseSystemClasses(this=0x00007fff5dd2f780) at appdomain.cpp:1421:19
    frame #5: 0x00007fff5d303446 libcoreclr.so`SystemDomain::Init(this=0x00007fff5dd2f780) at appdomain.cpp:1146:5
    frame #6: 0x00007fff5dba4b3e libcoreclr.so`EEStartupHelper() at ceemain.cpp:917:33
    frame #7: 0x00007fff5dba61d4 libcoreclr.so`EEStartup()::$_1::operator()(this=0x00007fffffffcd08, p=0x0000000000000000) const at ceemain.cpp:1053:9
    frame #8: 0x00007fff5dba378b libcoreclr.so`EEStartup() at ceemain.cpp:1055:5
    frame #9: 0x00007fff5dba3572 libcoreclr.so`EnsureEEStarted() at ceemain.cpp:299:17
    frame #10: 0x00007fff5d3a3392 libcoreclr.so`CorHost2::Start(this=0x00005555579fc9b0) at corhost.cpp:100:14
    frame #11: 0x00007fff5d2fcb83 libcoreclr.so`coreclr_initialize(exePath="/home/me/Flax/FlaxEngine/Binaries/Editor/Linux/Development/FlaxEngine.CSharp.dll", appDomainFriendlyName="clrhost", propertyCount=9, propertyKeys=0x0000555557a4c950, propertyValues=0x00005555579f7b10, hostHandle=0x00007fffffffd1a0, domainId=0x00007fffffffd19c) at exports.cpp:310:16
    frame #12: 0x00007fffcdea355b libhostpolicy.so`coreclr_t::create(libcoreclr_path="/usr/share/dotnet/shared/Microsoft.NETCore.App/9.0.0-alpha.1.23614.10/", exe_path="/home/me/Flax/FlaxEngine/Binaries/Editor/Linux/Development/FlaxEngine.CSharp.dll", app_domain_friendly_name="clrhost", properties=0x00005555559624e8, inst=nullptr) at coreclr.cpp:72:10
    frame #13: 0x00007fffcded3b0f libhostpolicy.so`(anonymous namespace)::create_coreclr() at hostpolicy.cpp:75:23
    frame #14: 0x00007fffce4c07eb libhostfxr.so`fx_muxer_t::load_runtime(context=0x00005555557d82a0) at fx_muxer.cpp:843:14
    frame #15: 0x00007fffce4b8e1d libhostfxr.so`hostfxr_get_runtime_delegate(host_context_handle=0x00005555557d82a0, type=hdt_get_function_pointer, delegate=0x00007fffffffd340) at hostfxr.cpp:714:22
    frame #16: 0x00007ffff6bfc915 libFlaxEditor.so`InitHostfxr() at DotNet.cpp:1786:10 [opt]
    frame #17: 0x00007ffff6bfbfad libFlaxEditor.so`MCore::LoadEngine() at DotNet.cpp:266:9 [opt]
    frame #18: 0x00007ffff6bdd0c2 libFlaxEditor.so`ScriptingService::Init(this=<unavailable>) at Scripting.cpp:132:9 [opt]
    frame #19: 0x00007ffff6df44b8 libFlaxEditor.so`EngineService::OnInit() at EngineService.cpp:94:22 [opt]
    frame #20: 0x00007ffff6df9400 libFlaxEditor.so`Engine::Main(cmdLine=<unavailable>) at Engine.cpp:146:5 [opt]
    frame #21: 0x0000555555555f0a FlaxEditor`main(argc=<unavailable>, argv=<unavailable>) at main.cpp:21:12 [opt]
    frame #22: 0x00007ffff5845cd0 libc.so.6`___lldb_unnamed_symbol3187 + 128
    frame #23: 0x00007ffff5845d8a libc.so.6`__libc_start_main + 138
    frame #24: 0x0000555555555cb5 FlaxEditor`_start + 37

@nothingTVatYT

I think at this point I can confirm the fix is working. In another test I cloned the dotnet-installer project, built a dotnet 9 runtime, installed it in /usr/share/dotnet and tried to run the Flax game editor.
There are still errors because the Flax editor is not expecting a dotnet 9 runtime, but we got past the previous issue of parsing a formatted string back into a memory pointer, and the host is initialized.

@elinor-fung
Member

Thanks, @nothingTVatYT! I will look at backporting.

@elinor-fung elinor-fung added this to the 8.0.x milestone Feb 2, 2024
@elinor-fung elinor-fung removed the untriaged label Feb 2, 2024
@elinor-fung
Member

This should be addressed in 8.0.3 with #97891

@github-actions github-actions bot locked and limited conversation to collaborators Mar 12, 2024