Changes to support TLS access emitted in ILC #425

Closed
wants to merge 1 commit into from


VSadov
Member

@VSadov VSadov commented Jun 11, 2023

In progress.

These are companion changes for dotnet/runtime#87148 , which is also in progress.

@VSadov
Member Author

VSadov commented Jun 13, 2023

I am having trouble building/testing this on osx-arm64.

Building succeeds, but there are many warnings about unused assignment to unsigned NumEntries=getNumEntries()
Then I replace the libobjwriter.dylib in the netcore.runtime.objwriter nuget with my locally built binary.

If I try to build/test NativeAOT after that, the machine runs out of memory. All of this happens even when my changes are reverted.

It looks like I have a newer Xcode/clang on my machine compared to the lab (I have 14.3.1). Not sure if that could be the reason for the warnings and the misbehaving .dylib.

@filipnavara - any ideas what could be wrong?

@filipnavara
Member

@filipnavara - any ideas what could be wrong?

I'll give it a try tomorrow in the morning. My last local build was apparently done with Xcode 14.2. I have since updated my machine to 14.3 and 15 Beta, and it definitely runs into some build issues.

@VSadov
Member Author

VSadov commented Jun 13, 2023

The offending process appears to be dsymutil when building tests. The test build spawns 3 of them and each takes 30+ GB.

My last local build was apparently done with Xcode 14.2

I will try installing/switching to 14.2 cmd tools. Perhaps that would not have the issue.

@filipnavara
Member

filipnavara commented Jun 13, 2023

I built the whole LLVM repo with ./build.sh --os macos --arch arm64 /p:ClangTarget=aarch64-apple-darwin and Xcode 15 Beta now. I used the ObjWriter to build NativeAOT smoke tests and it seemed to work.

but there are many warnings about unused assignment to unsigned NumEntries=getNumEntries()

Also getting those.

the offending process appears to be dsymutil when building tests

That's suspicious for sure. Maybe you can spindump it (from Activity Monitor) to see what it is doing? (There's a global symbol map cache somewhere and it can get corrupted, although I have never hit that myself.)
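
Before taking a spindump it helps to know which dsymutil instances are the large ones and what input each was given. A minimal sketch, assuming the input has the shape produced by `ps -axo pid=,rss=,command=` (PID, resident size in KB, full command line) and that the executable path contains no spaces; the function name `filter_dsymutil` is made up for this example:

```shell
# Keep only dsymutil processes from `ps` output and list the largest first.
# Reads lines of "PID RSS_KB COMMAND ARGS..." on stdin.
# Hypothetical helper, not part of the runtime or llvm-project build scripts.
filter_dsymutil() {
    # Sort by the RSS column (field 2), numerically, descending.
    sort -k2,2nr | awk '
        {
            # Compare only the last path component of the executable.
            n = split($3, parts, "/")
            if (parts[n] != "dsymutil") next
            # Rebuild the command line so the input file stays visible.
            args = $3
            for (i = 4; i <= NF; i++) args = args " " $i
            printf "%s %d MB %s\n", $1, int($2 / 1024), args
        }'
}

# Usage while the test build is running:
#   ps -axo pid=,rss=,command= | filter_dsymutil
```

Each output line gives the PID to sample and the arguments dsymutil was started with, which is the input file you would then inspect.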

@VSadov
Member Author

VSadov commented Jun 13, 2023

I've used ./build.sh --ci --restore --build --pack --arch arm64 --configuration Release /p:ClangTarget=aarch64-apple-darwin with or without --ci.
That is what the lab builds seem to use.

Perhaps it is not having --os macos or --configuration Release that causes this. I will try a few combinations.

@VSadov
Member Author

VSadov commented Jun 13, 2023

I am building the commit 43fe12a - that was before my changes in the PR.

@VSadov
Member Author

VSadov commented Jun 14, 2023

Installed older command line utils. The build no longer warns about NumEntries and the .dylib is slightly larger (so it is different).
The end result is the same - OOMs when building tests.

I suspect something is corrupted on my machine, but no idea what.

Some observations (maybe that is ok, but seems strange):
The locally built objwriter dylib is noticeably larger than the one in the nuget - like 50% larger. On Linux the sizes are much closer.
The dylib in the nuget is dated from February. I assumed we had changes more often than this... Linux counterpart is from June 7.

@filipnavara
Member

filipnavara commented Jun 14, 2023

I suspect something is corrupted on my machine, but no idea what.

Apple has an unusual way to index debug symbols. In order to save linking time and space they don't copy the DWARF info from .o files into linked binaries. They just assign a UUID to the final binary and maintain a map from the UUID back to the .o files, where the debugger finds and loads it (and only performs the relocations when actually necessary during debugging). The consequence of this approach is that there are several caches which maintain the maps, and it's not all necessarily obvious or well documented. dsymutil is one of the utilities accessing these caches and resolving the debug symbols. Aside from the various dumping options, it's usually used to collect all the DWARF information for a given UUID (extracted from the input file and dumpable with dwarfdump -u <input file>) and produce a bundle that can be transferred to another machine. The bundle has the UUID, and once indexed by Spotlight on any machine, it can be located again through the metadata utilities, e.g. mdfind "com_apple_xcode_dsym_uuids == <uuid>".

Long story short... if you can figure out which dsymutil process is hanging and what was the input, you can try to get the UUID of the input file (dwarfdump -u), locate the debug symbols (mdfind, dsymutil --dump-debug-map, ?), and delete them.
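
The lookup part of that can be sketched as a few shell functions. This is only an illustration: the function names are made up, the parsing assumes `dwarfdump -u` prints lines of the form `UUID: <uuid> (<arch>) <path>`, and it only lists candidate bundles rather than deleting anything.

```shell
# Extract the UUIDs from `dwarfdump -u <file>` output read on stdin.
# Assumed line shape: "UUID: <uuid> (<arch>) <path>".
extract_uuids() {
    awk '$1 == "UUID:" { print $2 }'
}

# Build the Spotlight query for one UUID, using the metadata key
# quoted in the comment above.
dsym_query() {
    printf 'com_apple_xcode_dsym_uuids == %s\n' "$1"
}

# Print the paths of Spotlight-indexed dSYM bundles for a binary.
# Needs macOS (dwarfdump and mdfind). Nothing is removed here;
# review the printed paths before deleting anything by hand.
find_dsyms() {
    dwarfdump -u "$1" | extract_uuids | while read -r uuid; do
        mdfind "$(dsym_query "$uuid")"
    done
}

# Usage (the path is a placeholder for the dsymutil input you identified):
#   find_dsyms path/to/input-binary
```

A binary with several architecture slices reports one UUID per slice, which is why the lookup loops over every line instead of taking only the first.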

@VSadov
Member Author

VSadov commented Jun 14, 2023

I also see 3 warnings in the build output like:

EXEC : warning : DWARF unit from offset 0x00000000 incl. to offset 0x00133e38 excl. tries to read DIEs at offset 0x00133e38 [/Users/vs/aot01/runtime/src/tests/nativeaot/SmokeTests/UnitTests/UnitTests.csproj] [/Users/vs/aot01/runtime/src/tests/build.proj]
EXEC : warning : DWARF unit from offset 0x00000000 incl. to offset 0x001ce1c2 excl. tries to read DIEs at offset 0x001ce1c2 [/Users/vs/aot01/runtime/src/tests/nativeaot/SmokeTests/Reflection/Reflection_FromUsage.csproj] [/Users/vs/aot01/runtime/src/tests/build.proj]
EXEC : warning : DWARF unit from offset 0x00000000 incl. to offset 0x00129592 excl. tries to read DIEs at offset 0x00129592 [/Users/vs/aot01/runtime/src/tests/nativeaot/SmokeTests/Reflection/Reflection.csproj] [/Users/vs/aot01/runtime/src/tests/build.proj]

That matches the 3 runaway dsymutil processes, so it could be related.

The warnings are not there when reverting to the original libobjwriter.dylib from February 6.

@VSadov
Member Author

VSadov commented Jun 14, 2023

If I build the objectwriter/12.x branch instead, the binary size is much closer to what we have in the nuget: roughly the same 5 MB, not the 7 MB that a build from main produces.

If I use the binary built from objectwriter/12.x branch, the smoke tests can build and pass.
It appears the main branch may have some macos-specific issues, at least on my machine, that need to be cleared up before doing anything for macos.

I think I will limit the initial threadstatic changes to Windows and Linux. Probably just x64 as that is enough to discuss and settle on the runtime changes and these combinations are the easiest to test. Arm64 and macos combinations can come later as separate changes.

@filipnavara
Member

Submitted #433 to fix the DWARF issue that was causing the hang.

I've used ./build.sh --ci --restore --build --pack --arch arm64 --configuration Release /p:ClangTarget=aarch64-apple-darwin with or without --ci.
That is what the lab builds seem to use.

Pro-tip: If you are interested only in ObjWriter you can build with /p:BuildObjWriterOnly=true. Cuts a few hours from the build time ;-)

@VSadov VSadov force-pushed the tls branch 2 times, most recently from caab679 to 7ef2397 Compare June 18, 2023 14:23
@VSadov
Member Author

VSadov commented Jun 19, 2023

I will make a separate PR for this.

@VSadov VSadov closed this Jun 19, 2023