UKMO Cray bugfixes #105
Thanks @ukmo-ccbunney. We are moving away from the platform/compiler-specific comp and link files towards using the cmplr.env and generic comp.tmpl and link.tmpl. Does this PR include changes to those files that would allow usage of the cmplr.env, comp.tmpl and link.tmpl without referring to any other external script/file? This would be a requirement for accepting this PR. In the near future we will eliminate all platform/compiler-specific comps and links from the package. |
@ajhenrique - yes the PR does include an update to cmplr.env to include "cray_xc". Would you prefer it if I removed the comp.cray_xc.CCE and link.cray_xc.CCE scripts? |
@ukmo-ccbunney please merge the auth repo develop into the branch associated with this PR. We can then move to get it merged into the NOAA-EMC/WW3 develop. Thanks! |
Develop branch has been merged in: 56857cd Do you want me to rerun the regtests matrix? |
Hi @ukmo-ccbunney @ajhenrique
The rest look OK to me. |
Hi @aliabdolali The differences in ww3_tp2.10 are expected, as the timestep was too large, resulting in the model going unstable (when I ran the test originally, the output was mostly NaN values!). However, I cannot see a reason why the output of ww3_ts1 should be different. I have not changed anything to do with the source terms. I notice that the tab*.ww3 files are identical in the work directories, so the differences must be very small. I can't see why only the ww3_ts1 test should be affected. Could it be linked to the fact that the mww3* tests are not bit comparable? Chris. |
Hi @aliabdolali could you provide details of the differences you encountered in ww3_ts1? |
@ukmo-ccbunney @ajhenrique |
@ukmo-ccbunney that would be great if it doesn't affect your work. The idea is to remove all individual comp and link files soon, so not adding new ones unless they are essential is a good way to start cleaning up! Thanks. |
@aliabdolali, @ajhenrique |
@ajhenrique OK - I will remove the comp and link files for the Cray XC and add the cray_xc.GNU compiler setup to the cmplr.env file (I've already added the cray_xc.CCE one). One question regarding the compiler setups - why are the Intel and GNU compiler options set up to hardcode the byte order to "big-endian"? I would prefer to keep the native (little-endian in our case) byte order when compiling with the GNU compiler. Are you happy if I modify the GNU section of |
Hi @ukmo-ccbunney, from what I remember about the big-endian option, the aim was to avoid mixing the way binary files are created by ww3 on different file systems, so that binary files can be exchanged safely. So it would be better to keep it as the default unless it slows down your computation. |
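To make the trade-off concrete, here is a minimal shell sketch of keeping big-endian as the default while still allowing a native byte-order build. The function name `byte_order_flag` and the `WW3_NATIVE_ENDIAN` variable are hypothetical and are not part of cmplr.env; only the flags themselves (`-fconvert=big-endian` for gfortran, `-convert big_endian` for ifort) are real compiler options.

```shell
#!/bin/sh
# Hypothetical helper: print the byte-order conversion flag for a compiler family.
# WW3_NATIVE_ENDIAN is an assumed opt-out variable, not an existing WW3 setting.
byte_order_flag() {
  family=$1
  if [ "${WW3_NATIVE_ENDIAN:-no}" = "yes" ]; then
    # Keep the machine's native byte order: emit no conversion flag.
    echo ""
    return 0
  fi
  case "$family" in
    gnu)   echo "-fconvert=big-endian" ;;   # gfortran
    intel) echo "-convert big_endian" ;;    # ifort
    *)     echo "" ;;                       # unknown family: add nothing
  esac
}

byte_order_flag gnu
```

With this shape, binary files stay exchangeable by default, and a site that prefers little-endian output has to opt out explicitly.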
@maccensi I note that profiling (-p switch) is always enabled via the "common options". Is this intentional? |
There is almost no extra overhead from using the -p switch since it's just camping time functions, at least from what I tested on our system and from what I read in the documentation about flat profiling. It would be good to test it again, though; I will do it. |
@ukmo-ccbunney @mickaelaccensi I've read here and there that there may be a performance hit in runs over long periods of time using massive resources. I am currently running a test with our coupled system with and without the profiling flag, and will report back to this thread as well. |
@ukmo-ccbunney, I agree with @mickaelaccensi . Please indicate if this results in overheads for your system. |
@mickaelaccensi @ukmo-ccbunney I've completed a first set of tests running our coupled FV3-WW3 ensemble system. This is currently the operational cube sphere for FV3, and a mosaic of 3 grids in WW3 at 1/4 deg resolution (Arctic, core and Antarctic), with 30 perturbed members + 1 control, which are run out to 16 days. I used a Dell machine with 28 cores per node; the run had 14 ppn and 2 OMP threads, and a total of 38 nodes (532 MPI processes per member). With/without the -p flag, the control ran ~61min/57min, and the perturbed members averaged ~69min/66min. There is a small but noticeable reduction in run time in our system when -p is removed. |
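As a quick check on the timings quoted above, the relative overhead of -p can be worked out from the with/without pairs. This is only a sketch: the helper name `overhead_pct` is made up for illustration, and the overhead is taken relative to the run without -p.

```shell
#!/bin/sh
# Relative overhead (percent) of a run with -p versus the same run without it.
# Arguments: minutes with -p, minutes without -p.
overhead_pct() {
  awk -v w="$1" -v wo="$2" 'BEGIN { printf "%.1f\n", (w - wo) / wo * 100 }'
}

overhead_pct 61 57   # control member
overhead_pct 69 66   # perturbed members (average)
```

This gives roughly 7.0% for the control and 4.5% for the perturbed members, given the approximate timings reported.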
@ajhenrique Did you remove the -p just for WW3 or both FV3 and WW3? |
@JessicaMeixner-NOAA, Only WW3. |
Adding more WW3 PETs might further reduce the runtime, since it seems that WW3 is potentially what's holding back the total timing. Or, if you now hit the FV3 time, removing -p there could reduce your runtime even further beyond what you already saw. Either way it looks like good news for the GEFSv12 timings! |
That's great news! So it would be good to put the '-p' option in a specific profiling section, with a _prof suffix, as is done for the debugging options with the _debug suffix. |
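A rough sketch of what such suffix-driven selection could look like in shell is shown here. It is not the actual cmplr.env logic: the function name `select_opts`, the `_prof` handling, and the specific optimisation flags are all assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical sketch: choose compile/link options from the compiler keyword suffix.
# Flag values are illustrative and do not reproduce the real cmplr.env settings.
select_opts() {
  cmplr=$1
  optc="-c"
  optl="-o prog"

  # Debug builds get low optimisation and symbols; everything else is optimised.
  case "$cmplr" in
    *_debug*) optc="$optc -O0 -g" ;;
    *)        optc="$optc -O3" ;;
  esac

  # Profiling is only enabled when explicitly requested via the _prof suffix.
  case "$cmplr" in
    *_prof*) optc="$optc -p"; optl="$optl -p" ;;
  esac

  echo "optc='$optc' optl='$optl'"
}

select_opts gnu
select_opts gnu_prof
```

The point of this design is that production builds no longer carry -p by default, while a profiling build remains one keyword away.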
@mickaelaccensi that would be preferable. I can do that in a commit to our GEFS_v12 implementation branch and merge it back to develop. |
Excellent! Sounds like an easy win. :) |
I've just pushed up a commit (c134eff) that removes the Met Office specific comp/link files and updates the cmplr.env with build configs for our Cray HPC using crayftn and gfortran. |
@aliabdolali what is the final recommendation for this PR? I've added both you and @JessicaMeixner-NOAA as reviewers. I look forward to both reviews. |
@ajhenrique The suspicious regression test behaves differently on our machine than UKMO's machine. As you can see, Chris has tested that test with the code before and after development and they were identical. So we either need to move on or we investigate more. The rest of the tests are OK. |
@ukmo-ccbunney @aliabdolali @JessicaMeixner-NOAA I reran the matrix for ww3_ts1 and had the following output in matrixCompSummary.out:
********************* non-identical cases ****************************
************************ identical cases *****************************
I'm not sure what happened in the previous matrix run, but I'll consider this a done deal and will be ready to merge asap, given @aliabdolali accepted his review, if @JessicaMeixner-NOAA concurs in her review. |
Thanks everybody in this thread for your comments. @ukmo-ccbunney please close issues related to this PR, and make sure you back up some of the discussions above that may be useful elsewhere. Merging! |
Fantastic. Thanks everyone. |
* Removed incorrectly placed commas.
* Added UNIT_AB variable to hold unit number for LOAD_ALPHABETA call. Changed unit number from 100 to 110 to avoid Cray reserved unit number range.
* Fixed instability in ww3_tp2.10 regtest by decreasing CFL timestep.
* Added !/OMPG switches to OMP directives.
* Added RECL specifier to OPEN statement enabled by switch /O2c. This avoids record length overflow for large dimension grids in WRITE statement with non-advancing I/O.
* Changes for compilation and regtesting on UK Met Office Cray HPC:
  * Added cray_xc to the cmplr.env and w3_setup scripts.
  * Added cray_xc.CCE (Cray Compiler Environment) specific comp and link scripts.
  * Explicitly disable OMP in cray comp scripts for non-OMP compilation (Cray compiler enables by default). Note that it is still enabled in the link script, as the OMP library is always required by SCRIP code.
  * Updated matrix_ukmo_cray to run everything on a shared node using mpiexec - compilation is not efficient on compute nodes.
  * Add -eg switch (allows use of GOTO jumps into DO loops) for SEC1 switch.
  * Addition of GNU compilers on Cray architecture.
* Fixed typo in wminitmd.ftn as raised in NOAA-EMC#94
* Removed extra brackets around variable list that were causing some compilers to complain.
* Removed UKMO comp/link scripts and updated cmplr.env accordingly
Changes required to run full regtest suite on Cray HPC using the crayftn compiler.
Adds the following features:
And addresses the following issues: