Try to fix missing libc++.so.1 errors in arm64 llvmaot perf run #88705

Merged: 16 commits merged into dotnet:main on Aug 8, 2023

directhex
Contributor

No description provided.

@ghost

ghost commented Jul 11, 2023

Tagging subscribers to this area: @hoyosjs
See info in area-owners.md if you want to be subscribed.

Issue Details

null

Author: directhex
Assignees: directhex
Labels:

area-Infrastructure-coreclr

Milestone: -

@ghost

ghost commented Jul 13, 2023

Tagging subscribers to this area: @directhex
See info in area-owners.md if you want to be subscribed.

Issue Details

null

Author: directhex
Assignees: directhex
Labels:

area-Infrastructure-mono

Milestone: -

@directhex
Contributor Author

I think this passed on internal CI? @LoopedBard3 @DrewScoggins @caaavik-msft if y'all are happy, plz approve and merge

@LoopedBard3
Member

Looking back into the internal run, there are a lot of errors in the format of:

[2023/07/12 06:06:26][INFO]   * Assertion at /__w/1/s/src/mono/mono/metadata/assembly.c:2718, condition `corlib' not met
[2023/07/12 06:06:26][INFO] /home/helixbot/work/A85808FA/w/B3DA09AD/e/performance/artifacts/bin/MicroBenchmarks/Release/net8.0/fc6a33d4-e06e-4890-89a9-265da2761a8f/BenchmarkDotNet.Autogenerated.csproj(55,4): error : Precompiling failed for /home/helixbot/work/A85808FA/w/B3DA09AD/e/performance/artifacts/bin/MicroBenchmarks/Release/net8.0/fc6a33d4-e06e-4890-89a9-265da2761a8f/bin/net8.0/linux-arm64/publish/Microsoft.CodeAnalysis.CSharp.dll
[2023/07/12 06:06:26][INFO]   [Microsoft.Extensions.Primitives.dll] Exec (with response file contents expanded) in /home/helixbot/work/A85808FA/w/B3DA09AD/e/performance/artifacts/bin/MicroBenchmarks/Release/net8.0/fc6a33d4-e06e-4890-89a9-265da2761a8f/bin/net8.0/linux-arm64/publish: MONO_PATH=/home/helixbot/work/A85808FA/w/B3DA09AD/e/performance/artifacts/bin/MicroBenchmarks/Release/net8.0/fc6a33d4-e06e-4890-89a9-265da2761a8f/bin/net8.0/linux-arm64/publish: MONO_ENV_OPTIONS= /home/helixbot/work/A85808FA/p/monoaot/mono-aot-cross --debug --llvm "--aot=mcpu=native,nodebug,llvm-path=/home/helixbot/work/A85808FA/p/monoaot/pack/runtimes/linux-arm64/native,outfile=/home/helixbot/work/A85808FA/w/B3DA09AD/e/performance/artifacts/bin/MicroBenchmarks/Release/net8.0/fc6a33d4-e06e-4890-89a9-265da2761a8f/bin/net8.0/linux-arm64/publish/Microsoft.Extensions.Primitives.dll.so,llvm-outfile=/home/helixbot/work/A85808FA/w/B3DA09AD/e/performance/artifacts/bin/MicroBenchmarks/Release/net8.0/fc6a33d4-e06e-4890-89a9-265da2761a8f/bin/net8.0/linux-arm64/publish/Microsoft.Extensions.Primitives.dll-llvm.o" "Microsoft.Extensions.Primitives.dll"
[2023/07/12 06:06:26][INFO] 
[2023/07/12 06:06:26][INFO]   * Assertion at /__w/1/s/src/mono/mono/metadata/assembly.c:2718, condition `corlib' not met
[2023/07/12 06:06:26][INFO] /home/helixbot/work/A85808FA/w/B3DA09AD/e/performance/artifacts/bin/MicroBenchmarks/Release/net8.0/fc6a33d4-e06e-4890-89a9-265da2761a8f/BenchmarkDotNet.Autogenerated.csproj(55,4): error : Precompiling failed for /home/helixbot/work/A85808FA/w/B3DA09AD/e/performance/artifacts/bin/MicroBenchmarks/Release/net8.0/fc6a33d4-e06e-4890-89a9-265da2761a8f/bin/net8.0/linux-arm64/publish/Microsoft.Extensions.Primitives.dll
[2023/07/12 06:06:26][INFO]   [Jil.dll] Exec (with response file contents expanded) in /home/helixbot/work/A85808FA/w/B3DA09AD/e/performance/artifacts/bin/MicroBenchmarks/Release/net8.0/fc6a33d4-e06e-4890-89a9-265da2761a8f/bin/net8.0/linux-arm64/publish: MONO_PATH=/home/helixbot/work/A85808FA/w/B3DA09AD/e/performance/artifacts/bin/MicroBenchmarks/Release/net8.0/fc6a33d4-e06e-4890-89a9-265da2761a8f/bin/net8.0/linux-arm64/publish: MONO_ENV_OPTIONS= /home/helixbot/work/A85808FA/p/monoaot/mono-aot-cross --debug --llvm "--aot=mcpu=native,nodebug,llvm-path=/home/helixbot/work/A85808FA/p/monoaot/pack/runtimes/linux-arm64/native,outfile=/home/helixbot/work/A85808FA/w/B3DA09AD/e/performance/artifacts/bin/MicroBenchmarks/Release/net8.0/fc6a33d4-e06e-4890-89a9-265da2761a8f/bin/net8.0/linux-arm64/publish/Jil.dll.so,llvm-outfile=/home/helixbot/work/A85808FA/w/B3DA09AD/e/performance/artifacts/bin/MicroBenchmarks/Release/net8.0/fc6a33d4-e06e-4890-89a9-265da2761a8f/bin/net8.0/linux-arm64/publish/Jil.dll-llvm.o" "Jil.dll"
[2023/07/12 06:06:26][INFO] 
[2023/07/12 06:06:26][INFO]   * Assertion at /__w/1/s/src/mono/mono/metadata/assembly.c:2718, condition `corlib' not met
[2023/07/12 06:06:26][INFO] /home/helixbot/work/A85808FA/w/B3DA09AD/e/performance/artifacts/bin/MicroBenchmarks/Release/net8.0/fc6a33d4-e06e-4890-89a9-265da2761a8f/BenchmarkDotNet.Autogenerated.csproj(55,4): error : Precompiling failed for /home/helixbot/work/A85808FA/w/B3DA09AD/e/performance/artifacts/bin/MicroBenchmarks/Release/net8.0/fc6a33d4-e06e-4890-89a9-265da2761a8f/bin/net8.0/linux-arm64/publish/Jil.dll
[2023/07/12 06:06:26][INFO]   [Microsoft.Extensions.DependencyInjection.Abstractions.dll] Exec (with response file contents expanded) in /home/helixbot/work/A85808FA/w/B3DA09AD/e/performance/artifacts/bin/MicroBenchmarks/Release/net8.0/fc6a33d4-e06e-4890-89a9-265da2761a8f/bin/net8.0/linux-arm64/publish: MONO_PATH=/home/helixbot/work/A85808FA/w/B3DA09AD/e/performance/artifacts/bin/MicroBenchmarks/Release/net8.0/fc6a33d4-e06e-4890-89a9-265da2761a8f/bin/net8.0/linux-arm64/publish: MONO_ENV_OPTIONS= /home/helixbot/work/A85808FA/p/monoaot/mono-aot-cross --debug --llvm "--aot=mcpu=native,nodebug,llvm-path=/home/helixbot/work/A85808FA/p/monoaot/pack/runtimes/linux-arm64/native,outfile=/home/helixbot/work/A85808FA/w/B3DA09AD/e/performance/artifacts/bin/MicroBenchmarks/Release/net8.0/fc6a33d4-e06e-4890-89a9-265da2761a8f/bin/net8.0/linux-arm64/publish/Microsoft.Extensions.DependencyInjection.Abstractions.dll.so,llvm-outfile=/home/helixbot/work/A85808FA/w/B3DA09AD/e/performance/artifacts/bin/MicroBenchmarks/Release/net8.0/fc6a33d4-e06e-4890-89a9-265da2761a8f/bin/net8.0/linux-arm64/publish/Microsoft.Extensions.DependencyInjection.Abstractions.dll-llvm.o" "Microsoft.Extensions.DependencyInjection.Abstractions.dll"
[2023/07/12 06:06:26][INFO] 
[2023/07/12 06:06:26][INFO]   * Assertion at /__w/1/s/src/mono/mono/metadata/assembly.c:2718, condition `corlib' not met
[2023/07/12 06:06:26][INFO]   * Assertion at /__w/1/s/src/mono/mono/metadata/assembly.c:2718, condition `corlib' not met
[2023/07/12 06:06:26][INFO]   [Microsoft.Extensions.Logging.dll] Exec (with response file contents expanded) in /home/helixbot/work/A85808FA/w/B3DA09AD/e/performance/artifacts/bin/MicroBenchmarks/Release/net8.0/fc6a33d4-e06e-4890-89a9-265da2761a8f/bin/net8.0/linux-arm64/publish: MONO_PATH=/home/helixbot/work/A85808FA/w/B3DA09AD/e/performance/artifacts/bin/MicroBenchmarks/Release/net8.0/fc6a33d4-e06e-4890-89a9-265da2761a8f/bin/net8.0/linux-arm64/publish: MONO_ENV_OPTIONS= /home/helixbot/work/A85808FA/p/monoaot/mono-aot-cross --debug --llvm "--aot=mcpu=native,nodebug,llvm-path=/home/helixbot/work/A85808FA/p/monoaot/pack/runtimes/linux-arm64/native,outfile=/home/helixbot/work/A85808FA/w/B3DA09AD/e/performance/artifacts/bin/MicroBenchmarks/Release/net8.0/fc6a33d4-e06e-4890-89a9-265da2761a8f/bin/net8.0/linux-arm64/publish/Microsoft.Extensions.Logging.dll.so,llvm-outfile=/home/helixbot/work/A85808FA/w/B3DA09AD/e/performance/artifacts/bin/MicroBenchmarks/Release/net8.0/fc6a33d4-e06e-4890-89a9-265da2761a8f/bin/net8.0/linux-arm64/publish/Microsoft.Extensions.Logging.dll-llvm.o" "Microsoft.Extensions.Logging.dll"
[2023/07/12 06:06:26][INFO] 
[2023/07/12 06:06:26][INFO]   * Assertion at /__w/1/s/src/mono/mono/metadata/assembly.c:2718, condition `corlib' not met
[2023/07/12 06:06:26][INFO]   [Sigil.dll] Exec (with response file contents expanded) in /home/helixbot/work/A85808FA/w/B3DA09AD/e/performance/artifacts/bin/MicroBenchmarks/Release/net8.0/fc6a33d4-e06e-4890-89a9-265da2761a8f/bin/net8.0/linux-arm64/publish: MONO_PATH=/home/helixbot/work/A85808FA/w/B3DA09AD/e/performance/artifacts/bin/MicroBenchmarks/Release/net8.0/fc6a33d4-e06e-4890-89a9-265da2761a8f/bin/net8.0/linux-arm64/publish: MONO_ENV_OPTIONS= /home/helixbot/work/A85808FA/p/monoaot/mono-aot-cross --debug --llvm "--aot=mcpu=native,nodebug,llvm-path=/home/helixbot/work/A85808FA/p/monoaot/pack/runtimes/linux-arm64/native,outfile=/home/helixbot/work/A85808FA/w/B3DA09AD/e/performance/artifacts/bin/MicroBenchmarks/Release/net8.0/fc6a33d4-e06e-4890-89a9-265da2761a8f/bin/net8.0/linux-arm64/publish/Sigil.dll.so,llvm-outfile=/home/helixbot/work/A85808FA/w/B3DA09AD/e/performance/artifacts/bin/MicroBenchmarks/Release/net8.0/fc6a33d4-e06e-4890-89a9-265da2761a8f/bin/net8.0/linux-arm64/publish/Sigil.dll-llvm.o" "Sigil.dll"

Do you know what the impact of this is on the rest of the testing and do you happen to know a good fix? Thanks!
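
A log like the one above is easier to triage once the failures are grouped by assembly. The following is a minimal sketch, not part of the perf tooling: it assumes the log text is available as a string and that the two message shapes seen above ("Precompiling failed for ..." and "condition `corlib' not met") are the ones of interest. The function names are hypothetical.

```python
import re

# Matches the MSBuild error line emitted when AOT precompilation of one assembly fails.
PRECOMPILE_FAILED = re.compile(r"error : Precompiling failed for (\S+)")


def failed_assemblies(log_text):
    """Return file names of assemblies that failed to precompile, in first-seen order."""
    names = []
    for line in log_text.splitlines():
        match = PRECOMPILE_FAILED.search(line)
        if match:
            # Keep only the file name; the Helix work-directory prefix is noise.
            name = match.group(1).rsplit("/", 1)[-1]
            if name not in names:
                names.append(name)
    return names


def count_failed_assertions(log_text, condition="corlib"):
    """Count log lines reporting that the given Mono assertion condition was not met."""
    needle = f"condition `{condition}' not met"
    return sum(1 for line in log_text.splitlines() if needle in line)
```

Comparing the number of assertion lines with the number of distinct failed assemblies gives a rough idea of whether every AOT invocation hit the same problem or only some of them did.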

@directhex
Contributor Author

I think "condition `corlib' not met" generally means that corlib wasn't found in any of the searched locations. I'm not sure where it's being copied to or where it's being searched for. Do you want me to try to figure out a fix for that here, or in another PR?
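
One way to test the "corlib wasn't found in the searched locations" theory is to look for it in each directory the AOT compiler is given. The sketch below rests on two assumptions that are not confirmed in this thread: that the search locations are the entries of the `MONO_PATH` value shown in the log, and that the corlib file is named `System.Private.CoreLib.dll`. The helper name is hypothetical.

```python
import os

# Assumed corlib file name; adjust it if the runtime pack uses a different one.
CORLIB_NAME = "System.Private.CoreLib.dll"


def find_corlib(mono_path, corlib_name=CORLIB_NAME):
    """Return every path where corlib exists among the MONO_PATH-style entries."""
    hits = []
    for directory in mono_path.split(os.pathsep):
        if not directory:
            # Skip empty entries such as the one left by a trailing separator.
            continue
        candidate = os.path.join(directory, corlib_name)
        if os.path.isfile(candidate):
            hits.append(candidate)
    return hits
```

An empty result for the publish directory from the failing command line would be consistent with the assertion; a non-empty result would point to a different cause.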

@LoopedBard3
Member

Either works, if kept in this PR we could rename the PR to something like "Update MonoAOT perf flow".

@directhex
Contributor Author

OK. What's the minimally complex way I can reproduce the errors locally? How can I get a folder which looks like what's trying to run on Helix and a copy of the command we're trying to execute on there?

@directhex
Contributor Author

directhex commented Jul 26, 2023

@LoopedBard3 I think other changes in the interim have dealt with the missing corlib issue; either that, or my local repro is doing entirely the wrong thing. I was able to run LLVM tests on an Ubuntu VM on my MacBook. I had other issues (hostpolicy and hostfxr were the wrong arch, and opt didn't have the expected libc++ in its folder), but I think those are problems with my repro build, not with the test run.

Can we determine where we stand, with this PR in as-is?

@LoopedBard3
Member

LoopedBard3 commented Jul 26, 2023

I will do another run so we can see if the issue has been fixed in the CI. @directhex The last change that I think is needed for the PR is to add the same monoAOT build args changes to the perf-non-wasm-jobs.yml.

@LoopedBard3
Member

@directhex
Contributor Author

FWIW, you can try #88917 as a fix for the browser-wasm failing job

@kotlarmilos
Member

@LoopedBard3 I think other changes in the interim have dealt with the missing corlib issue; either that, or my local repro is doing entirely the wrong thing. I was able to run LLVM tests on an Ubuntu VM on my MacBook. I had other issues (hostpolicy and hostfxr were the wrong arch, and opt didn't have the expected libc++ in its folder), but I think those are problems with my repro build, not with the test run.

Can we determine where we stand, with this PR in as-is?

The corlib issue has been fixed; see the discussion in dotnet/BenchmarkDotNet#2311. Currently, there is another issue, with NuGet package versions, reported in dotnet/performance#3164.

@directhex
Contributor Author

I will do another run so we can see if the issue has been fixed in the CI. @directhex The last change that I think is needed for the PR is to add the same monoAOT build args changes to the perf-non-wasm-jobs.yml.

I don't see any cases in perf-non-wasm-jobs.yml which are clearly wrong (just a lot of things using defaults). Are there further failures happening, in that set of pipelines?

@kotlarmilos
Member

I think this issue is specific to linux-arm64, which is configured in:

# run mono aot microbenchmarks perf job
- template: /eng/pipelines/common/platform-matrix.yml
  parameters:
    jobTemplate: /eng/pipelines/coreclr/templates/perf-job.yml # NOTE: should we move this file out of coreclr tempelates because it contains mono jobs?
    buildConfig: release
    runtimeFlavor: aot
    platforms:
    - linux_arm64
    jobParameters:
      testGroup: perf
      liveLibrariesBuildConfig: Release
      runtimeType: mono
      codeGenType: 'AOT'
      projectFile: microbenchmarks.proj
      runKind: micro_mono
      runJobTemplate: /eng/pipelines/coreclr/templates/run-performance-job.yml
      logicalmachine: 'perfampere'
      timeoutInMinutes: 780

The perf-non-wasm-jobs.yml jobs were working properly before the NuGet package versions issue started occurring.

@LoopedBard3
Member

I will do another run so we can see if the issue has been fixed in the CI. @directhex The last change that I think is needed for the PR is to add the same monoAOT build args changes to the perf-non-wasm-jobs.yml.

I don't see any cases in perf-non-wasm-jobs.yml which are clearly wrong (just a lot of things using defaults). Are there further failures happening, in that set of pipelines?

I will do a test run to verify. I noticed that some changes were made to perf-job.yml and whatnot. We use perf-job.yml for both sets of jobs, perf_slow and perf-non-wasm-jobs, so if these changes are compatible with how we already build in perf-non-wasm-jobs, we are good.

@SamMonoRT SamMonoRT requested a review from kotlarmilos July 28, 2023 19:10
@LoopedBard3
Member

LoopedBard3 commented Jul 28, 2023

There are some issues from my test run (https://dev.azure.com/dnceng/internal/_build/results?buildId=2230671&view=results) of the non-wasm jobs:

The error may have been caused by using the same perf_slow build args for the non-wasm jobs run. I'm trying a run with the currently working Wasm AOT build args.

@SamMonoRT
Member

The Mono AOT LLVM arm64 perf lanes are broken due to this. What else is needed to get this merged?

@marek-safar marek-safar added this to the 8.0.0 milestone Aug 1, 2023
@akoeplinger
Member

@LoopedBard3 the Helix queue seems to be pretty backed up so the internal runs will keep timing out, can we clear the queue?

@LoopedBard3
Member

@LoopedBard3 the Helix queue seems to be pretty backed up so the internal runs will keep timing out, can we clear the queue?

@DrewScoggins @cincuranet Any thoughts?

@cincuranet
Contributor

@LoopedBard3 I don't know whether we have knobs to clear the queue.

@akoeplinger
Member

@cincuranet you can ask dnceng on the First Responders channel, they have the knobs.

@directhex directhex requested a review from radical as a code owner August 7, 2023 15:04
@directhex
Contributor Author

OK, x64 seems to be good. I think there's still a failure condition in arm64 (opt doesn't have libc++)
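
The "opt doesn't have libc++" condition can be checked before a run by looking for the shared library next to the tool. This is a minimal sketch under a stated assumption: that `opt` is expected to find `libc++.so.1` in its own folder, as the earlier repro notes suggest. It does not cover other ways the loader might locate the library, and the helper name and example path are hypothetical.

```python
from pathlib import Path


def missing_runtime_libs(tool_dir, libs=("libc++.so.1",)):
    """Return the names of shared libraries that are absent from the tool's folder."""
    directory = Path(tool_dir)
    return [lib for lib in libs if not (directory / lib).exists()]


if __name__ == "__main__":
    # Hypothetical location; in the logs above the folder is passed as llvm-path=...
    print(missing_runtime_libs("/path/to/llvm/native"))
```

An empty list means every listed library sits beside the tool; any name returned is a candidate cause of a loader error like the one in the PR title.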

@directhex directhex requested a review from marek-safar as a code owner August 7, 2023 18:06
@directhex
Contributor Author

I think I'm going to try and eliminate MonoBundleLLVMOptimizer and MonoAOTBundleLLVMOptimizer entirely. It's such an odd switch to include. Either you want a functional LLVM AOT (with opt/llc included) or you don't.

@directhex
Contributor Author

I think I'm going to try and eliminate MonoBundleLLVMOptimizer and MonoAOTBundleLLVMOptimizer entirely.

I'm going to do this IN ANOTHER PR, since I've actually been making progress with fixing issues in this PR with the existing properties.

@directhex
Contributor Author

I'm declaring this "good enough for now"

X64 https://dev.azure.com/dnceng/internal/_build/results?buildId=2238462&view=results
ARM64 https://dev.azure.com/dnceng/internal/_build/results?buildId=2238442&view=results

The ARM64 run isn't finished, but at least some tests are already passing, so this unblocks the failing lanes:

[2023/08/08 09:08:38][INFO]   [223/223] Microsoft.CodeAnalysis.CSharp.dll -> Microsoft.CodeAnalysis.CSharp.dll.so, Microsoft.CodeAnalysis.CSharp.dll-llvm.o
[2023/08/08 09:08:38][INFO]   CompiledAssemblies:
[2023/08/08 09:08:38][INFO] Build succeeded.
[2023/08/08 09:08:38][INFO] CSC : warning CS8002: Referenced assembly 'MicroBenchmarks, Version=42.42.42.42, Culture=neutral, PublicKeyToken=null' does not have a strong name. [/home/helixbot/work/9919088A/w/AB9C0942/e/performance/artifacts/bin/MicroBenchmarks/Release/net8.0/Job-OZXEIH/BenchmarkDotNet.Autogenerated.csproj]
[2023/08/08 09:08:38][INFO]     1 Warning(s)
[2023/08/08 09:08:38][INFO]     0 Error(s)
[2023/08/08 09:08:38][INFO] Time Elapsed 00:02:59.15

@directhex directhex merged commit 75ee623 into dotnet:main Aug 8, 2023
@ghost ghost locked as resolved and limited conversation to collaborators Sep 7, 2023
8 participants