Bug fix to ad3 when running parallel make #39

Merged
merged 1 commit into from
Jun 19, 2019

ukmo-ccbunney
Collaborator

When running a parallel make, it is possible under certain circumstances for ad3 to delete/move compiled MOD files too soon.

This is only an issue if the compiler does not provide a switch that allows a different directory to be specified for MOD file output. In this case, the ad3 program moves the MOD files from the scratch directory to the mod directory manually. However, this uses a *.mod file glob, which can erroneously move/delete files belonging to other compilation processes when running a parallel make.

As the compiler may output MOD files with a filename that is upper, lower or mixed case, it is necessary to use a case-insensitive grep to locate the correct MOD file. Using find -iname would work better, but the -iname switch to find may not be available on all systems.
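
The approach described above can be sketched roughly as follows. This is an illustration only, not the code in this pull request: the function name move_mod_file and its three arguments are hypothetical, and it assumes module names contain only letters, digits and underscores. It lists the scratch directory and keeps only the entry whose whole name matches the module's MOD file, ignoring case, so MOD files from other compilation processes are left alone.

```shell
# Hypothetical helper (not the actual ad3 code): move only the MOD file
# that belongs to one module, matching its name case-insensitively.
move_mod_file() {
  name=$1      # module name, e.g. w3gdatmd (assumed alphanumeric/underscore)
  scratch=$2   # directory where the compiler wrote the MOD file
  dest=$3      # target mod directory

  # -i ignores case, -x requires the whole line (file name) to match,
  # so "other.mod" or "name.mod.bak" are never picked up.
  ls "$scratch" | grep -i -x "${name}\.mod" | while IFS= read -r f; do
    mv "$scratch/$f" "$dest/"
  done
}
```

Calling move_mod_file w3gdatmd on a scratch directory that holds W3GDATMD.mod and w3odatmd.mod would move only the first file. Only grep -i and grep -x are relied on here, both of which are specified by POSIX, which is the reason for preferring this over find -iname.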

running parallel make. Only an issue if compiler does not allow switch
to target a different directory to output mod files.
@ajhenrique
Collaborator

@mickaelaccensi @JessicaMeixner-NOAA please have a look at this pull request and validate.

Collaborator

@JessicaMeixner-NOAA JessicaMeixner-NOAA left a comment

I approve this commit as long as it passes the regression tests at EMC.

@mickaelaccensi
Collaborator

Hi Chris,
What do you mean by parallel make? Are you sure we don't also need to remove the mod files of the subroutines used? How can we reproduce your error?

@ukmo-ccbunney
Collaborator Author

Hi Mickael,
By parallel make, I mean when the "make" command is invoked with the "-j" flag to perform multiple compilations at once. This is now the default when calling w3_make (controlled by the WW3_PARCOMPN environment variable which defaults to 4).

The error can be reproduced by using compiler flags that do not move the "*.mod" files to a different directory (i.e. the mod directory). For instance, for the Cray Fortran Compiler, using "-I $path_m" instead of "-J $path_m" tells the compiler to look in the mod directory, but not to place the compiled *.mod files there. In this case, the ad3 program moves the compiled .mod files to the mod directory. This is the point where the parallel compilation can fail.
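
The failure mode can be illustrated with a small stand-alone sketch. It does not run ad3 or a compiler: the directories are temporary, the file names a.mod and b.mod are hypothetical, and touch stands in for two compilation processes that share one scratch directory.

```shell
# Illustration only: directory and file names are hypothetical.
scratch=$(mktemp -d)   # shared scratch directory
moddir=$(mktemp -d)    # target mod directory

# Process A has finished compiling module "a".
touch "$scratch/a.mod"
# Process B is still running; this file stands in for its pending output.
touch "$scratch/b.mod"

# Process A's clean-up uses a glob, so it also takes the file owned by B.
mv "$scratch"/*.mod "$moddir"/

# Both a.mod and b.mod are now in the mod directory; B's scratch copy is gone.
ls "$moddir"
```

In a real parallel build, process B would then find that its MOD file has disappeared from the scratch directory, or a partially written file would have been moved.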

You could argue that this is not a bug, as using the "-J" compiler flag (in Cray Fortran) fixes the issue. However, the functionality exists in ad3 to do this, and it breaks in a parallel make if that functionality is used.

So it is a bug - but one that probably no one will encounter if using the correct compiler flags on a modern Fortran compiler!

Cheers,
Chris.

@mickaelaccensi
Collaborator

OK, thanks for the clear explanation. I agree with this bugfix.

@ajhenrique
Collaborator

I am updating the batch queue options for running the regtests matrix at NCEP. Once that is done I'll check the proposed changes and report to reviewers so we can hopefully move ahead with approval of this pull request.

@ajhenrique ajhenrique changed the base branch from develop to HF_ad3 June 19, 2019 17:13
@ajhenrique ajhenrique merged commit 00196cc into NOAA-EMC:HF_ad3 Jun 19, 2019
aliabdolali pushed a commit that referenced this pull request Jun 24, 2019
…39) (#55)

running parallel make. Only an issue if compiler does not allow switch
to target a different directory to output mod files.
@ukmo-ccbunney ukmo-ccbunney deleted the bf_paramake branch September 16, 2019 08:20
4 participants