Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Output OMP threading info to screen [ww3_shel, ww3_multi] #1191

Merged
merged 3 commits into from
Mar 11, 2024

Conversation

ukmo-ccbunney
Copy link
Collaborator

@ukmo-ccbunney ukmo-ccbunney commented Feb 21, 2024

Pull Request Summary

Added screen output in ww3_shel and ww3_multi showing number of (available) threads when OMP enabled.
Also includes a fix to avoid truncation of the build.log file (see: ukmo-waves#46)

Description

When running an OMP enabled version of ww3_shel or ww3_multi (i.e. compiled with OMPG), the program will output a message to the screen stating that OMP is enabled and the number of threads available to the parallel regions.

Note, in MPI/OMP hybrid mode only the "screen" processor will output this message, so the number of OMP threads will be relevant to that processor. I imagine that most people are not running a different number of threads per processor, but that is something to bear in mind!

No change to the regtests outputs are expected, other than the new line in the screen output for ww3_shel and ww3_multi in the OMP/OMPH regtests.

Issue(s) addressed

Fixes: #1032

Commit Message

Output OMP threading info to screen when running ww3_shel/ww3_multi compiled with the OMPG switch.
Also fixes truncation of build.log when running run_cmake_build.

Check list

Testing

  • How were these changes tested? Regtests.
  • Are the changes covered by regression tests? (If not, why? Do new tests need to be added?) Yes.
  • Have the matrix regression tests been run (if yes, please note HPC and compiler)? Cray XC; GNU Compiler.
  • Please indicate the expected changes in the regression test output, (Note the list of known non-identical tests.) Only expected change is a new output line in the ww3_shel/ww3_multi screen output for OMP/OMPH regtests.
  • Please provide the summary output of matrix.comp (matrix.Diff.txt, matrixCompFull.txt and matrixCompSummary.txt):

Regtest results

**********************************************************************
********************* non-identical cases ****************************
**********************************************************************
mww3_test_02/./work_PR3_UNO_MPI_c_c                     (3 files differ)
mww3_test_03/./work_PR3_UQ_MPI_d2                     (17 files differ)
mww3_test_03/./work_PR3_UQ_MPI_d2_c                     (17 files differ)
mww3_test_03/./work_PR3_UNO_MPI_d2                     (16 files differ)
mww3_test_03/./work_PR1_MPI_d2                     (15 files differ)
mww3_test_03/./work_PR2_UNO_MPI_d2                     (14 files differ)
mww3_test_03/./work_PR3_UQ_MPI_e                     (1 files differ)
mww3_test_03/./work_PR3_UNO_MPI_d2_c                     (18 files differ)
mww3_test_03/./work_PR2_UQ_MPI_d2                     (17 files differ)
mww3_test_03/./work_PR3_UNO_MPI_e_c                     (1 files differ)

There are some differences in the ww3_shel.out and ww3_multi.out files for OMP/OMPH tests, but these differences are filtered out by the matrix.comp script. The differences are all like this:

9,10d8
<   OMP threading enabled. Number of threads:   4
< 

@ukmo-ccbunney ukmo-ccbunney marked this pull request as draft February 21, 2024 17:14
@ukmo-ccbunney ukmo-ccbunney self-assigned this Feb 21, 2024
@ukmo-ccbunney ukmo-ccbunney added the enhancement New feature or request label Feb 21, 2024
@JessicaMeixner-NOAA
Copy link
Collaborator

Thanks for sharing this @ukmo-ccbunney ! I'll run some tests on my end with this.

@ukmo-ccbunney
Copy link
Collaborator Author

Update with regtest results.
One thing I note is that the new output regarding the number of OMP threads is filtered our of ww3_shel.out and ww3_multi.out when running a comparison with matrix,.comp. This is unfortunate as it would be good to capture this in the regrest comparison to catch cases where we might accidentally change the OMP support. I'll have a look and see if the filtering can be amended at all...

@JessicaMeixner-NOAA
Copy link
Collaborator

@ukmo-ccbunney I was finally able to run some tests with this, and I do at least get the expected number of threads that I request, although I'm not seeing any impact on model performance using various thread counts...

@ukmo-ccbunney
Copy link
Collaborator Author

@ukmo-ccbunney I was finally able to run some tests with this, and I do at least get the expected number of threads that I request, although I'm not seeing any impact on model performance using various thread counts...

How strange! Do you have another machine you can test on? I sometimes run stuff on my linux desktop to check non-MPI code (including OMP runs).

@ukmo-ccbunney ukmo-ccbunney marked this pull request as ready for review February 28, 2024 13:21
@JessicaMeixner-NOAA
Copy link
Collaborator

I do, but it's on maintenance today... I'll try that tomorrow!

@JessicaMeixner-NOAA
Copy link
Collaborator

@ukmo-ccbunney I made a PR to your branch because I was running this and realized I wasn't getting all the build information to the log file that I wanted to see. It's here: ukmo-waves#46 I can put it in a new PR if you'd like but I was on this branch so just pushed here.

The good news is my tests this afternoon are finally showing OMP speed-up closer to what's expected so maybe the other day was just an off day... I don't know. But this is a nice update to WW3 for a sanity check.

@ukmo-ccbunney
Copy link
Collaborator Author

@ukmo-ccbunney I made a PR to your branch because I was running this and realized I wasn't getting all the build information to the log file that I wanted to see. It's here: ukmo-waves#46 I can put it in a new PR if you'd like but I was on this branch so just pushed here.

Thanks - that's a really useful update. I've merged it into this branch.

The good news is my tests this afternoon are finally showing OMP speed-up closer to what's expected so maybe the other day was just an off day... I don't know. But this is a nice update to WW3 for a sanity check.

That's good news. Odd how you were experiencing such a lot of variability. It almost sounds like CPU over-subscription on your HPC? Is there a chance that other jobs are grabbing resources from a node you are running on? Also, it could be a result of file system contention too (although our WW3 files are not huge...) When our HPC is running it's weekly RAID check the I/O throughput is degraded and I can see an impact on model runtimes. Just a thought!

Perhaps it's just gremlins...

@JessicaMeixner-NOAA
Copy link
Collaborator

Would not be surprised by the gremlins!!! I have no idea what is going on to be honest but am happy to see what I was expecting finally. File system or other things would not surprise me at all...

@MatthewMasarik-NOAA
Copy link
Collaborator

@ukmo-ccbunney @JessicaMeixner-NOAA

I wanted to see if either of you are still looking at this, or is it ready for final review?

@ukmo-ccbunney
Copy link
Collaborator Author

Hi @MatthewMasarik-NOAA
It's all ready to go as far as I am concerned.
I'll update the description as this PR now also includes Jessica's ukmo-waves/#46 PR too.

@MatthewMasarik-NOAA
Copy link
Collaborator

Hi @MatthewMasarik-NOAA It's all ready to go as far as I am concerned. I'll update the description as this PR now also includes Jessica's ukmo-waves/#46 PR too.

Awesome, thanks @ukmo-ccbunney, I'll get the tests going!

@MatthewMasarik-NOAA MatthewMasarik-NOAA self-requested a review March 11, 2024 12:43
Copy link
Collaborator

@MatthewMasarik-NOAA MatthewMasarik-NOAA left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code review
Pass

Testing
Pass

**********************************************************************
********************* non-identical cases ****************************
**********************************************************************
mww3_test_03/./work_PR3_UQ_MPI_e_c                     (1 files differ)
mww3_test_03/./work_PR3_UNO_MPI_e                     (1 files differ)
mww3_test_03/./work_PR2_UQ_MPI_e                     (1 files differ)
mww3_test_03/./work_PR2_UNO_MPI_e                     (1 files differ)
mww3_test_03/./work_PR2_UNO_MPI_d2                     (16 files differ)
mww3_test_03/./work_PR1_MPI_d2                     (10 files differ)
mww3_test_03/./work_PR3_UNO_MPI_d2_c                     (16 files differ)
mww3_test_03/./work_PR3_UQ_MPI_d2_c                     (15 files differ)
mww3_test_03/./work_PR3_UNO_MPI_d2                     (12 files differ)
mww3_test_03/./work_PR2_UQ_MPI_d2                     (15 files differ)
mww3_test_03/./work_PR3_UQ_MPI_e                     (1 files differ)
mww3_test_03/./work_PR3_UNO_MPI_e_c                     (1 files differ)
mww3_test_03/./work_PR3_UQ_MPI_d2                     (16 files differ)
mww3_test_09/./work_MPI_ASCII                     (0 files differ)
ww3_tp2.10/./work_MPI_OMPH                     (7 files differ)
ww3_tp2.16/./work_MPI_OMPH                     (4 files differ)
ww3_tp2.6/./work_ST4_ASCII                     (0 files differ)
ww3_ufs1.3/./work_a                     (3 files differ)
**********************************************************************
************************ identical cases *****************************
**********************************************************************

Approved.

@MatthewMasarik-NOAA MatthewMasarik-NOAA merged commit 156a46d into NOAA-EMC:develop Mar 11, 2024
20 checks passed
@MatthewMasarik-NOAA
Copy link
Collaborator

Thank you @ukmo-ccbunney and @JessicaMeixner-NOAA, having the OMP thread count and full build log output will be very useful for troubleshooting build and runtime issues.

@ukmo-ccbunney ukmo-ccbunney deleted the feature/omp_status branch March 12, 2024 08:52
miguelsolanocordoba added a commit to wavespotter/WW3 that referenced this pull request Apr 19, 2024
* Bugfix - initialised VD and VS to zero in w3srcemd. (NOAA-EMC#1037)

* More efficient test for binary files in matrix.comp (NOAA-EMC#1035)

* Tidy up of pre-processor directives and unused variables in w3srcemd.F90 (NOAA-EMC#1010)

* Correct typo in w3srcemd.F90 pre-processor directive. (NOAA-EMC#1039)

* minor bugfix for matrix grepping on keywords (NOAA-EMC#1049)

* Stop masking group 1 output where icec > icen (NOAA-EMC#1019)

* Doxygen documentation added, 8th subset.(NOAA-EMC#1046)

* NC4 ,F90 ,XX0 switches removed from ww3_tp2.19 regtest (NOAA-EMC#1054)

* CI:  Fix for Intel scripts. GNU scripts updated. (NOAA-EMC#1064)

* correct the computation of QP parameter, add QKK output parameter, change UST scale factor (NOAA-EMC#1050)

* correct issue with ww3_multi when requesting restart2 and using nml file instead of inp file (NOAA-EMC#1070)

* correct calendar for track netcdf output (NOAA-EMC#1079)

* Fix missing mod_def.ww3 file in multigrid regression tests for track output (NOAA-EMC#1091)

* STAB3: fix cmake build for ST4 or ST3 (NOAA-EMC#1086)

* new feature to output out_grd.ww3, out_pnt.ww3 and mod_def.ww3 both in binary and ascii format using switch ASCII. (NOAA-EMC#1089)

* Update local unit number arrays (NDS, MDS) to be same size of array defined in w3odatmd (size=15). Also, defined unit numbers for NDS(14) and NDS(15). (NOAA-EMC#1098)

* Removed code referencing PHIOC in output section for PHICE in ww3_ounf (NOAA-EMC#1093)

* implementation of the GQM (Gaussian Quadrature Method) to replace the DIA in NL1 or NL2. (NOAA-EMC#1083)

* update logic to ensure you are not accessing uninitialized dates (NOAA-EMC#1114)

* Initialised S and D arrays in W3SDB1 before potential early return if zero energy. (NOAA-EMC#1115)

* ww3_ounp.F90:  x/y units attribute corrected from 'm' to 'km' (NOAA-EMC#1088)

* Bugfix: Assign unit numbers to ASCII gridded/point output in multi-grid mode. (NOAA-EMC#1118)

* correct bugs to run correctly GQM implementation (NOAA-EMC#1127)

* Adding documentation to w3iopo() in preparation for code for NOAA-EMC#682. (NOAA-EMC#1131)

* NCEP regtest module updates: uses spack-stack/1.5.0, includes scotch/7.0.4 (NOAA-EMC#1137)

* Minor update to ncep regtests (NOAA-EMC#1138)

* Updated intel workflow to install oneapi compilers from new location. (NOAA-EMC#1157)

* Add unit test for points I/O code. (NOAA-EMC#1158)

* Update Intel CI (relocate /usr/local; ensure intel-oneapi-mpi; use ubuntu-latest) (NOAA-EMC#1161)

* remove lookup table for ST4 to speed up computation and clean up the ST4 code (NOAA-EMC#1124)

Co-authored-by: Fabrice Ardhuin <[email protected]>

* initialize USSP_WN for mod_def (NOAA-EMC#1165)

* Introduce IC4M8 and IC4M9 to WW3 (NOAA-EMC#1176)

* clean up and add ST4 variables (NOAA-EMC#1181)

* w3fld1md.F90: fix divide by zero in CRIT2 parameter (NOAA-EMC#1184)

* ww3_prnc.F90: fix out-of-scope grid index write statement (NOAA-EMC#1185)

* Bugfix: address potential divide-by-zero in APPENDTAIL (NOAA-EMC#1188)

Co-authored-by: Denise Worthen <[email protected]>

* Provide initial drying of cells with depth < ZLIM for SMC grid. (NOAA-EMC#1192)

* Output OMP threading info to screen when running ww3_shel/ww3_multi compiled with the OMPG switch. Also fixes truncation of build.log when running run_cmake_build. (NOAA-EMC#1191)

* Added screen output showing number of threads when OMP enabled.

* update build to get more info in logs (NOAA-EMC#46)

---------

Co-authored-by: Jessica Meixner <[email protected]>

* update run_cmake_test to catch build errors and exit (NOAA-EMC#1194)

* fix merge conflicts

* Fix gustiness bug, as suggst by Pieter

* Change USTARsigma to WAM implementation

---------

Co-authored-by: Chris Bunney <[email protected]>
Co-authored-by: Mickael Accensi <[email protected]>
Co-authored-by: Benoit Pouliot <[email protected]>
Co-authored-by: Matthew Masarik <[email protected]>
Co-authored-by: Ghazal-Mohammadpour <[email protected]>
Co-authored-by: Jessica Meixner <[email protected]>
Co-authored-by: Biao Zhao <[email protected]>
Co-authored-by: Edward Hartnett <[email protected]>
Co-authored-by: Alex Richert <[email protected]>
Co-authored-by: Fabrice Ardhuin <[email protected]>
Co-authored-by: W. Erick Rogers <[email protected]>
Co-authored-by: Denise Worthen <[email protected]>
Co-authored-by: Camille Teicheira <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Regtest cmake build.log overwritten
3 participants