Restructure offline drivers #469
Conversation
Force-pushed 0208a48 to b2e57a7 (compare)
@SeanBryan51 I've now changed the git history and one of the filenames as per our discussion today. Have a look at the changes and let me know if it's better now.

Thanks @micaeljtoliveira. The rendered diff looks a lot better now.
Nice work @micaeljtoliveira - cool to see Fortran OOP features in action for the MPI data types! I'm happy with all the functional changes you have put in here.
I've been a bit ruthless in my review on documenting things with Ford, but I think it's good to set a standard for everyone to make sure things are documented properly.
if(CABLE_MPI)
  add_executable(
    cable-mpi
    src/offline/cable_driver_init.F90
    src/offline/cable_mpicommon.F90
    src/offline/cable_mpidrv.F90
    src/offline/cable_mpimaster.F90
    src/offline/cable_mpiworker.F90
    src/science/pop/pop_mpi.F90
    "$<TARGET_OBJECTS:cable_common_objlib>"
    src/offline/cable_offline_driver.F90
  )
  target_compile_definitions(cable-mpi PRIVATE __MPI__)
  target_link_libraries(cable-mpi PRIVATE PkgConfig::NETCDF MPI::MPI_Fortran)
  target_link_libraries(cable-mpi PRIVATE cable_common MPI::MPI_Fortran)
  install(TARGETS cable-mpi RUNTIME)
else()
  add_executable(
    cable
    src/offline/cable_mpimaster_stub.F90
    src/offline/cable_mpiworker_stub.F90
    src/offline/cable_offline_driver.F90
  )
  target_link_libraries(cable PRIVATE cable_common)
  install(TARGETS cable RUNTIME)
endif()
As discussed, this will break benchcab for the case where both the cable and cable-mpi executables are needed, as benchcab will only invoke cmake once to build both executables. However, benchcab will still work with these changes for serial-only (benchcab fluxsite) and MPI-only (benchcab spatial) tests (FYI @ccarouge).
It would be good to think about the proper CMake workflow for building different variants of the cable executable (e.g. MPI on/off, Debug, Release, compiler). From reading online, it seems the typical approach is to generate a separate build tree for each variant, unless we work with a multi-config build system.
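The separate-build-tree approach could look roughly like the following sketch. The `CABLE_MPI` cache variable is the one used in the diff above; the directory names and the `CMAKE_BUILD_TYPE` choice are illustrative, not prescribed by this PR.

```sh
# One build tree per variant (directory names are illustrative).
cmake -S . -B build/serial -DCMAKE_BUILD_TYPE=Release
cmake -S . -B build/mpi    -DCMAKE_BUILD_TYPE=Release -DCABLE_MPI=ON

# Each tree is configured independently and can be built side by side.
cmake --build build/serial   # produces the serial cable executable
cmake --build build/mpi      # produces cable-mpi
```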
> it seems the typical approach for doing this is to generate separate build trees for each variant
Yes, I think that's the usual strategy. CABLE is quite fast to compile, so this should not be an issue.
One thing you might want to explore to make this process easier is the use of CMake presets. These are really nice.
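For reference, a minimal `CMakePresets.json` (CMake ≥ 3.19) covering two such variants might look like this; the preset names and cache variables other than `CABLE_MPI` are illustrative assumptions, not part of this PR.

```json
{
  "version": 3,
  "configurePresets": [
    {
      "name": "serial",
      "binaryDir": "${sourceDir}/build/serial",
      "cacheVariables": { "CMAKE_BUILD_TYPE": "Release" }
    },
    {
      "name": "mpi",
      "binaryDir": "${sourceDir}/build/mpi",
      "cacheVariables": { "CMAKE_BUILD_TYPE": "Release", "CABLE_MPI": "ON" }
    }
  ]
}
```

With this in place, `cmake --preset mpi` configures the MPI variant into its own build tree, and `cmake --list-presets` shows the available configurations.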
See #485
- routines to initialise and finalise MPI
- a simple class to handle MPI groups

All the contents of the module can handle the case where no MPI support is available and/or serial runs (i.e., use of only one rank), thus removing the need to fence some operations with preprocessing directives outside of this module. To get CMake to properly propagate the link dependencies, I had to convert the common object library into a static library.
Force-pushed 1249c60 to e6b19d9 (compare)
No worries. I think I've now addressed all your comments. Please have a look and let me know if I missed something.
I just have a question about the unified approach: what happens if someone launches the serial executable via mpirun on several processors by mistake? Wasted resources, since it will run on 1 proc but be "charged" for the resources of several?
Testing the following:
Looks like it invokes the serial executable on 4 processors in parallel.

I've tested this with
There is now only one executable being built: either cable-mpi or cable, depending on whether MPI support is available. When MPI support is available, the code will now automatically run the serial driver if using only one MPI process, instead of stopping with an error as before.
Force-pushed e6b19d9 to 8a3d3bb (compare)
At the moment this does not really change the current behaviour, as there are still two executables. But it might be a source of confusion if later on there's only one executable. That said, from personal experience, it's usually quite obvious when one launches a non-MPI executable with MPI, as the output is duplicated for each MPI rank, as @SeanBryan51 just confirmed.
I've tested these changes with PBS log file:
Benchcab version:
Spatial (MPI) runs all execute as expected (note: outputs do not reproduce, as per issue #463).
This change further consolidates the initialisation logic within the offline drivers to remove code duplication and improve code clarity across each of the drivers (see #425, #440 and #469). As part of the refactoring changes, new optional namelist options (`filename%new_sumbal` and `filename%trunk_sumbal`) are introduced to replace hard-coded filenames, and the optional namelist option `CASAonly` is restored for consistency with the documentation.

## Type of change
- [x] Code refactor
- [x] New feature
- [x] New or updated documentation

## Checklist
- [x] The new content is accessible and located in the appropriate section.
- [x] I have checked that links are valid and point to the intended content.
- [x] I have checked my code/text and corrected any misspellings

📚 Documentation preview 📚: https://cable--472.org.readthedocs.build/en/472/
…#498

This change further consolidates the initialisation logic within the offline drivers (in particular, logic related to `cable_user%MetType`) to remove code duplication and improve code clarity across each of the drivers (see #425, #440, #469 and #472). As part of the refactoring changes, the `cable_user%MetType = 'gpgs'` namelist parameter is deprecated, as it is equivalent to setting `cable_user%MetType = 'gswp'` and `leaps = .TRUE.`.

## Type of change
- [x] Code refactor
- [x] New or updated documentation

## Checklist
- [x] The new content is accessible and located in the appropriate section.
- [x] I have checked that links are valid and point to the intended content.
- [x] I have checked my code/text and corrected any misspellings

📚 Documentation preview 📚: https://cable--498.org.readthedocs.build/en/498/
CABLE
Description
Restructure the offline drivers such that there is only one executable being built, regardless of whether MPI support is available. Also add the option to run with only one MPI process when MPI support is available (this falls back to the serial driver).
To achieve this, a new mpi_grt_t class is introduced, which hides several calls to the MPI library and takes care of providing the necessary infrastructure when MPI support is not available.
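The single-rank fallback described above could be sketched as follows. This is a hypothetical illustration only: the module, procedure, and component names (`cable_mpi_mod`, `mpi_grp%size`, `serial_driver`, etc.) are assumptions, not the actual interface introduced by this PR.

```fortran
! Hypothetical sketch of the unified driver entry point.
! All names below are illustrative, not the PR's actual API.
program cable_offline_driver_sketch
  use cable_mpi_mod, only : mpi_grp_type => mpi_grt_t, mpi_mod_init, mpi_mod_end
  implicit none
  type(mpi_grp_type) :: mpi_grp

  call mpi_mod_init()      ! no-op when built without MPI support

  mpi_grp = mpi_grp_type() ! wraps MPI_COMM_WORLD, or a stub group of size 1

  if (mpi_grp%size == 1) then
     ! Single rank (or no MPI): fall back to the serial driver
     ! instead of stopping with an error, as before.
     call serial_driver()
  else if (mpi_grp%rank == 0) then
     call mpidrv_master(mpi_grp)
  else
     call mpidrv_worker(mpi_grp)
  end if

  call mpi_mod_end()
end program cable_offline_driver_sketch
```

Because the group object reports a size of 1 when built without MPI, the calling code needs no `__MPI__` preprocessor fences outside the module, which matches the stated goal of the refactor.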
Type of change
Checklist
📚 Documentation preview 📚: https://cable--469.org.readthedocs.build/en/469/