GCC 11.4; CUDA 11.8; cleanup for cmsplatf/cmsos use #8545
A new Pull Request was created by @smuzaffar (Malik Shahzad Muzaffar) for branch IB/CMSSW_13_2_X/master. @cmsbuild, @smuzaffar, @aandvalenzuela, @iarspider can you please review it and eventually sign? Thanks. |
test parameters:
|
please test |
+1
Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-81a383/33147/summary.html
The following merge commits were also included on top of IB + this PR after doing git cms-merge-topic:
You can see more details here:
Comparison Summary
Summary:
|
Please test |
I'm OK with the changes. I had already tested CUDA 11.8.0 standalone, and the performance was the same as 12.0 and 12.1: they have a slightly worse performance than 11.5 in the standalone test, but the CMSSW measurement with 12.x did not show any impact, so it shouldn't be a problem and from that point of view we can go ahead.
Since the main update in 11.8.0 is the support for the latest GPU generations, I think we should update the
Lines 4 to 12 in 97b1049
I would simplify it to
# build support for Pascal, Volta, Turing, Ampere, Lovelace and Hopper
%define cuda_arch 60 70 75 80 86 89 90
This list targets
The main drawback of building for more architectures is that it will take longer and produce slightly larger libraries. If that's a problem, a more minimal list could be:
# build support for Pascal, Volta, Turing, Ampere, and Hopper
%define cuda_arch 60 70 75 80 90
|
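For readers unfamiliar with the numbers in the `cuda_arch` lists above, the following is a minimal sketch of how such a list could be expanded into per-architecture `nvcc` code-generation flags. The helper names (`parse_cuda_arch`, `gencode_flags`, `generations`) are hypothetical and are not taken from cmsdist; how the spec files actually consume `cuda_arch` is not shown in this thread, so treat the flag layout as an assumption. The mapping from compute capability to GPU generation follows the names used in the comment above, with 89 being the Lovelace entry.

```python
# Hypothetical helpers, NOT part of cmsdist: they only illustrate what the
# numbers in a cuda_arch-style list stand for.

# Compute capability -> NVIDIA GPU generation.
GENERATIONS = {
    60: "Pascal",
    70: "Volta",
    75: "Turing",
    80: "Ampere",
    86: "Ampere",
    89: "Lovelace",
    90: "Hopper",
}


def parse_cuda_arch(value):
    """Split a space-separated cuda_arch value into integers."""
    return [int(token) for token in value.split()]


def gencode_flags(archs):
    """Build one nvcc -gencode flag per architecture (assumed layout)."""
    return ["-gencode arch=compute_{0},code=sm_{0}".format(a) for a in archs]


def generations(archs):
    """Return the distinct GPU generations covered, in list order."""
    seen = []
    for arch in archs:
        name = GENERATIONS[arch]
        if name not in seen:
            seen.append(name)
    return seen


full = parse_cuda_arch("60 70 75 80 86 89 90")
minimal = parse_cuda_arch("60 70 75 80 90")

for flag in gencode_flags(minimal):
    print(flag)
print("dropped in minimal list:", sorted(set(full) - set(minimal)))
```

Each extra entry adds one more device-code target to compile, which is consistent with the build-time and library-size trade-off mentioned above.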
-1
Failed Tests: UnitTests
The following merge commits were also included on top of IB + this PR after doing git cms-merge-topic:
You can see more details here:
Unit Tests
I found errors in the following unit tests:
---> test test-das-selected-lumis had ERRORS
---> test test_edmPickEvents had ERRORS
Comparison Summary
Summary:
GPU Comparison Summary
Summary:
|
Just to be explicit, I'm ok with the changes as well. |
Pull request #8545 was updated. |
-1
Failed Tests: Build
The following merge commits were also included on top of IB + this PR after doing git cms-merge-topic:
You can see more details here:
Build
I found compilation error when building:
>> Cuda Device Link tmp/el8_amd64_gcc11/src/HeterogeneousCore/CUDAUtilities/test/gpuOneHistoContainer_t/gpuOneHistoContainer_t_cudadlink.o
>> Building binary gpuOneHistoContainer_t
Copying tmp/el8_amd64_gcc11/src/HeterogeneousCore/CUDAUtilities/test/gpuOneHistoContainer_t/gpuOneHistoContainer_t to productstore area:
>> Compiling /data/cmsbld/jenkins/workspace/ib-run-pr-tests/CMSSW_13_2_X_2023-06-18-2300/src/HeterogeneousCore/CUDAUtilities/test/OneToManyAssoc_t.cu
nvcc error : 'ptxas' died due to signal 9 (Kill signal)
gmake: *** [tmp/el8_amd64_gcc11/src/HeterogeneousCore/CUDAUtilities/test/gpuOneToManyAssocRT_debug/OneToManyAssoc_t.cu.o] Error 1
>> Cuda Device Link tmp/el8_amd64_gcc11/src/HeterogeneousCore/CUDAUtilities/test/gpuOneToManyAssocRT_debug/gpuOneToManyAssocRT_debug_cudadlink.o
nvlink fatal : Could not open input file 'tmp/el8_amd64_gcc11/src/HeterogeneousCore/CUDAUtilities/test/gpuOneToManyAssocRT_debug/OneToManyAssoc_t.cu.o' (target: sm_60)
gmake: *** [tmp/el8_amd64_gcc11/src/HeterogeneousCore/CUDAUtilities/test/gpuOneToManyAssocRT_debug/gpuOneToManyAssocRT_debug_cudadlink.o] Error 1
>> Building binary gpuOneToManyAssocRT_debug
/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/el8_amd64_gcc11/external/gcc/11.4.1-30ebdc301ebd200f2ae0e3d880258e65/bin/../lib/gcc/x86_64-redhat-linux-gnu/11.4.1/../../../../x86_64-redhat-linux-gnu/bin/ld.bfd: cannot find tmp/el8_amd64_gcc11/src/HeterogeneousCore/CUDAUtilities/test/gpuOneToManyAssocRT_debug/OneToManyAssoc_t.cu.o: No such file or directory
|
please test |
Pull request #8545 was updated. |
Please test |
+1
Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-81a383/33252/summary.html
Comparison Summary
Summary:
GPU Comparison Summary
Summary:
|
Mostly in the message logger, but there are a couple of errors for HLT: https://cmssdt.cern.ch/SDT/jenkins-artifacts/baseLineComparisons/CMSSW_13_2_X_2023-06-19-2300+81a383/57586/12434.0_TTbar_14TeV+2023/
Mostly due to the message logger, but there are a few in workflows 4.53 and 9.0 that are non-message-logger failures. |
+externals |
This pull request is fully signed and it will be integrated in one of the next IB/CMSSW_13_2_X/master IBs (tests are also fine). This pull request will now be reviewed by the release team before it's merged. @perrotta, @dpiparo, @rappoccio (and backports should be raised in the release meeting by the corresponding L2) |
Those differences look compatible with random differences reported in cms-sw/cmssw#41200 |
@perrotta @rappoccio this is good to go in. It includes a GCC minor version update and CUDA 11.8. Note that, due to the GCC changes, it rebuilds all the externals. I would like to get this into the 13.2.X IBs now so that we have a couple of weeks of IBs before we cut the last open pre-release. |
+1 |