Bugfix/nml #1070

Merged
merged 6 commits into from
Sep 21, 2023

Collaborator

@mickaelaccensi mickaelaccensi commented Sep 8, 2023

Pull Request Summary

correct issue with ww3_multi when requesting restart2 and using nml file instead of inp file

Description

The bug was introduced when the NCEP GFSv16/GEFSv12 branch (#140) was merged on 13 Jan 2020. No regtest was added to the matrix to cover it.

This PR proposes to:
- fix ww3_multi when outputting restart2 with an nml file instead of an inp file (see #1062)
- add the scotch lib to matrix_cmake_datarmor
- force CFX, CFD and CFK to be output as type REAL when NCVARTYPE is set to 2 (=depends) for netCDF output
- add a test on the ufs switch to enable the corresponding regtests
- correct a typo in ww3_ufs1.1/namelist*.nml
- correct the namelist in ww3_tp2.3 to enable TH1M and STH1M outputs
- replace 'echo' with 'basename' in matrix_cmake_datarmor so that the keyword search is restricted to the filename itself
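
The 'echo' to 'basename' change is easy to misread, so here is a minimal sketch of the difference, written in Python rather than the shell used by matrix_cmake_datarmor. The paths and the keyword are hypothetical and this is not the matrix script's actual code. It only shows that matching against the full path can hit a directory name, while matching against the basename cannot.

```python
import os

def matches_full_path(path, keyword):
    # Searches the whole path, so a keyword in a parent directory also matches.
    return keyword in path

def matches_basename(path, keyword):
    # Searches only the final path component, i.e. the filename itself.
    return keyword in os.path.basename(path)

# Hypothetical paths: in the first one, "ufs" appears only in a directory name.
paths = [
    "/home/user/ufs_project/regtests/matrix_part_a",
    "/home/user/ufs_project/regtests/matrix_ufs_b",
]

full = [p for p in paths if matches_full_path(p, "ufs")]
base = [p for p in paths if matches_basename(p, "ufs")]
print(len(full), len(base))
```

With the full-path match both files are selected; with the basename match only the file whose own name contains the keyword is selected.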

Issue(s) addressed

Commit Message

correct issue with ww3_multi when requesting restart2 and using nml file instead of inp file

Check list

Testing

  • How were these changes tested? matrix
  • Are the changes covered by regression tests? (If not, why? Do new tests need to be added?) ww3_tp2.21
  • Have the matrix regression tests been run (if yes, please note HPC and compiler)? OK
  • Please indicate the expected changes in the regression test output, (Note the list of known non-identical tests.)
  • ww3_tp2.21: now runs with nml
  • ww3_tp2.3: now runs with TH1MF and STH1MF activated
  • ww3_tp2.14: print differences (duplicated in develop branch)
  • ww3_tp2.6: differences due to a variable truncated by scale_factor (corrected in PR QP #1050)
  • Please provide the summary output of matrix.comp (matrix.Diff.txt, matrixCompFull.txt and matrixCompSummary.txt):

matrixCompFull.txt
matrixCompSummary.txt
matrixDiff.txt
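
For readers unfamiliar with why a scale_factor causes differences: under the usual netCDF packing convention a real value is stored as an integer and restored as `packed * scale_factor + add_offset`, so anything finer than the scale factor is lost. The sketch below illustrates that convention only. The scale factor and the value are hypothetical, and it is not taken from the WW3 output code.

```python
def pack(value, scale_factor, add_offset=0.0):
    # netCDF-style packing into a 16-bit integer range.
    packed = round((value - add_offset) / scale_factor)
    if not -32768 <= packed <= 32767:
        raise OverflowError("value does not fit in a 16-bit integer")
    return packed

def unpack(packed, scale_factor, add_offset=0.0):
    # Inverse operation applied by readers of the file.
    return packed * scale_factor + add_offset

scale_factor = 0.01   # hypothetical scale factor
value = 0.123456      # hypothetical field value

packed = pack(value, scale_factor)
restored = unpack(packed, scale_factor)
print(packed, restored, abs(restored - value))
```

Writing the variable as REAL avoids this loss of precision, at the cost of a larger file.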

@JessicaMeixner-NOAA
Collaborator

@mickaelaccensi - Thanks for making this fix. Adding the regression test has been on my plate for a while - Hopefully that will come sooner than later, but we'll see. Can we add to the commit message a note about the netcdf update too? It seems a little tangential to the stated update as is.

-correct namelist in ww3_tp2.3 to enable TH1M and STH1M outputs
-replace 'echo' by 'basename' in matrix_cmake_datarmor
 to restrict the search of keyword in the filename itself
@mickaelaccensi
Collaborator Author

@mickaelaccensi - Thanks for making this fix. Adding the regression test has been on my plate for a while - Hopefully that will come sooner than later, but we'll see. Can we add to the commit message a note about the netcdf update too? It seems a little tangential to the stated update as is.

Is it clear enough now? Let me know if you want me to modify the commit message.

@MatthewMasarik-NOAA MatthewMasarik-NOAA mentioned this pull request Sep 18, 2023
3 tasks
@MatthewMasarik-NOAA
Collaborator

Hi @mickaelaccensi, I ran the matrix and the PR branch run reached the wallclock limit in matrix03.

I believe the issue is in this run test call

Running now options: run_test -b slurm -o all -S -T -s PR3_UNO_MPI_SCRIP -w work_PR3_UNO_MPI_c_c -m grdset_c -g curv -f -p srun -n 24 ../model mww3_test_02

It seems to have hit some sort of internal error, shown here, then got past it, but then timed out:

SI nbPlus=       14884  nbMinus=           0  nbZero=           0
 SI nbPlus=        7396  nbMinus=           0  nbZero=           0
 SI nbPlus=        7396  nbMinus=           0  nbZero=           0
  
 Grid 1 size        3721
 Grid 2 size        1849
 
 grid1 sweep
 grid2 sweep 
 integration stalled: num_subseg exceeded limit
 Cell         1625
 Edge            2
 Grid            1
 Fraction of segment left   0.599999988869029     
 integration stalled: num_subseg exceeded limit
 Cell         1625
 Edge            2
 Grid            1
 Fraction of segment left   0.599999988869029     
 integration stalled: num_subseg exceeded limit
        .
        .

I attached the full matrix03.out file: matrix03.out.txt
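
When a log repeats the same message many times, it helps to tally it. The following is a minimal sketch, assuming the message layout shown in the excerpt above (message line followed by Cell, Edge and Grid lines). The function name is hypothetical and nothing here is part of the WW3 tooling.

```python
import re
from collections import Counter

def count_stalls(log_text):
    """Count 'integration stalled' messages per (grid, cell, edge)."""
    pattern = re.compile(
        r"integration stalled: num_subseg exceeded limit\s+"
        r"Cell\s+(\d+)\s+Edge\s+(\d+)\s+Grid\s+(\d+)"
    )
    return Counter(
        (int(grid), int(cell), int(edge))
        for cell, edge, grid in pattern.findall(log_text)
    )

# Sample text copied from the excerpt; the last message is cut off on purpose.
sample = """\
 integration stalled: num_subseg exceeded limit
 Cell         1625
 Edge            2
 Grid            1
 Fraction of segment left   0.599999988869029
 integration stalled: num_subseg exceeded limit
 Cell         1625
 Edge            2
 Grid            1
 Fraction of segment left   0.599999988869029
 integration stalled: num_subseg exceeded limit
"""

stalls = count_stalls(sample)
print(stalls)
```

A single (grid, cell, edge) key with a large count would suggest the integration is stuck on one segment rather than failing in many places.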

@mickaelaccensi
Collaborator Author

@MatthewMasarik-NOAA this issue is also present in the develop branch. In fact, all the mww3_test_02 regtests with options -m grdset_c and -g curv seem to be crashing. Since the bug is not related to my branch, it should be solved in another PR. Does that sound OK to you?

@MatthewMasarik-NOAA
Collaborator

Hi @mickaelaccensi, none of my develop runs are crashing. Are they for you?

I just noticed though that this branch is out of date. It's possible updating could resolve the issues. Please sync up and I'll re-run the matrix. Thanks

@mickaelaccensi
Collaborator Author

I've updated my branch with develop. I thought it was done, but I may have missed the last one.

And yes, I have the same issue as you with my up-to-date develop branch. Have you checked that the post-processing was done for all the mww3_test_02 work directories, especially those with grdset_c and the curv grid? In my case it does not crash with an explicit exit code; it just stops without creating out_grd.*, so the post-processing is not done.
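
A silent stop of this kind can be spotted by checking each work directory for the expected output file. The sketch below is one way to do that, assuming work directories are named `work_*` and the output matches `out_grd.*` as described above. The function name and the demonstration file name are hypothetical.

```python
import tempfile
from pathlib import Path

def dirs_missing_output(test_dir, pattern="out_grd.*"):
    """Return names of work_* directories holding no file that matches pattern."""
    missing = []
    for work in sorted(Path(test_dir).glob("work_*")):
        if work.is_dir() and not any(work.glob(pattern)):
            missing.append(work.name)
    return missing

# Self-contained demonstration on a throwaway directory tree.
with tempfile.TemporaryDirectory() as tmp:
    ok = Path(tmp, "work_PR1_MPI_e")
    bad = Path(tmp, "work_PR3_UNO_MPI_c_c")
    ok.mkdir()
    bad.mkdir()
    (ok / "out_grd.grd1").touch()  # hypothetical output file name
    result = dirs_missing_output(tmp)

print(result)
```

In practice the function would be pointed at a regtest directory such as the one containing the mww3_test_02 work directories.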

@MatthewMasarik-NOAA
Collaborator

Hi @mickaelaccensi, thanks for syncing. I'll try re-running both develop and the PR branch.

Nope, I didn't get an explicit crash either, just what was shown in the matrix03 out posted. We'll see what these re-runs look like.

Collaborator

@MatthewMasarik-NOAA MatthewMasarik-NOAA left a comment

@mickaelaccensi Good news, syncing resolved the issue in the new runs.

matrix output

**********************************************************************
********************* non-identical cases ****************************
**********************************************************************
## known non-b4b
mww3_test_03/./work_PR1_MPI_e                     (1 files differ)
mww3_test_03/./work_PR2_UQ_MPI_e                     (1 files differ)
mww3_test_03/./work_PR2_UNO_MPI_e                     (1 files differ)
mww3_test_03/./work_PR2_UNO_MPI_d2                     (16 files differ)
mww3_test_03/./work_PR1_MPI_d2                     (14 files differ)
mww3_test_03/./work_PR3_UNO_MPI_d2_c                     (12 files differ)
mww3_test_03/./work_PR3_UQ_MPI_d2_c                     (15 files differ)
mww3_test_03/./work_PR3_UNO_MPI_d2                     (16 files differ)
mww3_test_03/./work_PR2_UQ_MPI_d2                     (14 files differ)
mww3_test_03/./work_PR3_UQ_MPI_e                     (1 files differ)
mww3_test_03/./work_PR3_UNO_MPI_e_c                     (1 files differ)
mww3_test_03/./work_PR3_UQ_MPI_d2                     (16 files differ)
ww3_tp2.10/./work_MPI_OMPH                     (7 files differ)
ww3_tp2.16/./work_MPI_OMPH                     (4 files differ)
ww3_ufs1.3/./work_a                     (3 files differ)
 
ww3_tp2.3/./work_PR2_UNO                     (6 files differ)
ww3_tp2.3/./work_PR1_MPI                     (6 files differ)
ww3_tp2.3/./work_PR3_UNO_MPI                     (6 files differ)
ww3_tp2.3/./work_PR3_UQ                     (6 files differ)
ww3_tp2.3/./work_PR3_UNO                     (6 files differ)
ww3_tp2.3/./work_PR1                     (6 files differ)
ww3_tp2.3/./work_PR2_UQ                     (6 files differ)
ww3_tp2.3/./work_PR2_UQ_MPI                     (6 files differ)
ww3_tp2.3/./work_PR3_UQ_MPI                     (6 files differ)
ww3_tp2.3/./work_PR2_UNO_MPI                     (6 files differ)

**********************************************************************
************************ identical cases *****************************
**********************************************************************

The new expected diffs are limited to tests within ww3_tp2.3, each due to the following six files:

  log.ww3 
  mod_def.ww3 
  out_grd.ww3 
  ww3_grid.out
  ww3_ounf.out
  ww3_outf.out
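
To check a claim like this against a long list of non-identical cases, the entries can be grouped by regtest. This is a minimal sketch that assumes the line layout shown in the matrix output above (`test/./work_dir (N files differ)`). The function name is hypothetical and it is not part of matrix.comp.

```python
import re
from collections import defaultdict

# One entry per line: "<test>/./<work dir>   (<N> files differ)"
LINE = re.compile(r"^(\S+)/\./(\S+)\s+\((\d+) files differ\)$")

def group_by_test(summary_text):
    """Map each regtest name to {work directory: number of differing files}."""
    groups = defaultdict(dict)
    for raw in summary_text.splitlines():
        match = LINE.match(raw.strip())
        if match:
            test, work, count = match.groups()
            groups[test][work] = int(count)
    return dict(groups)

# A few lines copied from the matrix output above.
sample = """\
## known non-b4b
mww3_test_03/./work_PR1_MPI_e                     (1 files differ)
ww3_tp2.10/./work_MPI_OMPH                     (7 files differ)

ww3_tp2.3/./work_PR2_UNO                     (6 files differ)
ww3_tp2.3/./work_PR1_MPI                     (6 files differ)
"""

groups = group_by_test(sample)
print(sorted(groups))
print(groups["ww3_tp2.3"])
```

Grouping this way makes it easy to see that every ww3_tp2.3 work directory reports the same number of differing files.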

Code review

PASS

Testing

PASS

@MatthewMasarik-NOAA
Collaborator

Thanks @mickaelaccensi for this PR, addressing a number of corrections.

@MatthewMasarik-NOAA MatthewMasarik-NOAA merged commit 8589d12 into NOAA-EMC:develop Sep 21, 2023
@MatthewMasarik-NOAA MatthewMasarik-NOAA mentioned this pull request Sep 21, 2023
2 tasks
miguelsolanocordoba added a commit to wavespotter/WW3 that referenced this pull request Apr 19, 2024
* Bugfix - initialised VD and VS to zero in w3srcemd. (NOAA-EMC#1037)

* More efficient test for binary files in matrix.comp (NOAA-EMC#1035)

* Tidy up of pre-processor directives and unused variables in w3srcemd.F90 (NOAA-EMC#1010)

* Correct typo in w3srcemd.F90 pre-processor directive. (NOAA-EMC#1039)

* minor bugfix for matrix grepping on keywords (NOAA-EMC#1049)

* Stop masking group 1 output where icec > icen (NOAA-EMC#1019)

* Doxygen documentation added, 8th subset.(NOAA-EMC#1046)

* NC4 ,F90 ,XX0 switches removed from ww3_tp2.19 regtest (NOAA-EMC#1054)

* CI:  Fix for Intel scripts. GNU scripts updated. (NOAA-EMC#1064)

* correct the computation of QP parameter, add QKK output parameter, change UST scale factor (NOAA-EMC#1050)

* correct issue with ww3_multi when requesting restart2 and using nml file instead of inp file (NOAA-EMC#1070)

* correct calendar for track netcdf output (NOAA-EMC#1079)

* Fix missing mod_def.ww3 file in multigrid regression tests for track output (NOAA-EMC#1091)

* STAB3: fix cmake build for ST4 or ST3 (NOAA-EMC#1086)

* new feature to output out_grd.ww3, out_pnt.ww3 and mod_def.ww3 both in binary and ascii format using switch ASCII. (NOAA-EMC#1089)

* Update local unit number arrays (NDS, MDS) to be same size of array defined in w3odatmd (size=15). Also, defined unit numbers for NDS(14) and NDS(15). (NOAA-EMC#1098)

* Removed code referencing PHIOC in output section for PHICE in ww3_ounf (NOAA-EMC#1093)

* implementation of the GQM (Gaussian Quadrature Method) to replace the DIA in NL1 or NL2. (NOAA-EMC#1083)

* update logic to ensure you are not accessing uninitialized dates (NOAA-EMC#1114)

* Initialised S and D arrays in W3SDB1 before potential early return if zero energy. (NOAA-EMC#1115)

* ww3_ounp.F90:  x/y units attribute corrected from 'm' to 'km' (NOAA-EMC#1088)

* Bugfix: Assign unit numbers to ASCII gridded/point output in multi-grid mode. (NOAA-EMC#1118)

* correct bugs to run correctly GQM implementation (NOAA-EMC#1127)

* Adding documentation to w3iopo() in preparation for code for NOAA-EMC#682. (NOAA-EMC#1131)

* NCEP regtest module updates: uses spack-stack/1.5.0, includes scotch/7.0.4 (NOAA-EMC#1137)

* Minor update to ncep regtests (NOAA-EMC#1138)

* Updated intel workflow to install oneapi compilers from new location. (NOAA-EMC#1157)

* Add unit test for points I/O code. (NOAA-EMC#1158)

* Update Intel CI (relocate /usr/local; ensure intel-oneapi-mpi; use ubuntu-latest) (NOAA-EMC#1161)

* remove lookup table for ST4 to speed up computation and clean up the ST4 code (NOAA-EMC#1124)

Co-authored-by: Fabrice Ardhuin <[email protected]>

* initialize USSP_WN for mod_def (NOAA-EMC#1165)

* Introduce IC4M8 and IC4M9 to WW3 (NOAA-EMC#1176)

* clean up and add ST4 variables (NOAA-EMC#1181)

* w3fld1md.F90: fix divide by zero in CRIT2 parameter (NOAA-EMC#1184)

* ww3_prnc.F90: fix out-of-scope grid index write statement (NOAA-EMC#1185)

* Bugfix: address potential divide-by-zero in APPENDTAIL (NOAA-EMC#1188)

Co-authored-by: Denise Worthen <[email protected]>

* Provide initial drying of cells with depth < ZLIM for SMC grid. (NOAA-EMC#1192)

* Output OMP threading info to screen when running ww3_shel/ww3_multi compiled with the OMPG switch. Also fixes truncation of build.log when running run_cmake_build. (NOAA-EMC#1191)

* Added screen output showing number of threads when OMP enabled.

* update build to get more info in logs (NOAA-EMC#46)

---------

Co-authored-by: Jessica Meixner <[email protected]>

* update run_cmake_test to catch build errors and exit (NOAA-EMC#1194)

* fix merge conflicts

* Fix gustiness bug, as suggested by Pieter

* Change USTARsigma to WAM implementation

---------

Co-authored-by: Chris Bunney <[email protected]>
Co-authored-by: Mickael Accensi <[email protected]>
Co-authored-by: Benoit Pouliot <[email protected]>
Co-authored-by: Matthew Masarik <[email protected]>
Co-authored-by: Ghazal-Mohammadpour <[email protected]>
Co-authored-by: Jessica Meixner <[email protected]>
Co-authored-by: Biao Zhao <[email protected]>
Co-authored-by: Edward Hartnett <[email protected]>
Co-authored-by: Alex Richert <[email protected]>
Co-authored-by: Fabrice Ardhuin <[email protected]>
Co-authored-by: W. Erick Rogers <[email protected]>
Co-authored-by: Denise Worthen <[email protected]>
Co-authored-by: Camille Teicheira <[email protected]>
Development

Successfully merging this pull request may close these issues.

UG mesh crashes with ww3_multi.nml when loading input wind field