Add support for PnetCDF for HOMME on Anvil and Chrysalis #6552

Merged: 2 commits, Aug 22, 2024

jayeshkrishna
Contributor

Adding support for PnetCDF for HOMME standalone builds on
Anvil and Chrysalis

[BFB]

Adding support for the PnetCDF library in HOMME on Chrysalis. These
configuration files are used for standalone HOMME builds on
Chrysalis.
Adding support for the PnetCDF library in HOMME on Anvil. These
configuration files are used for standalone HOMME builds on
Anvil.
@jayeshkrishna jayeshkrishna added BFB PR leaves answers BFB Anvil HOMME standalone issues with the standalone HOMME code that dont impact E3SM Chrysalis labels Aug 16, 2024
@jayeshkrishna jayeshkrishna marked this pull request as ready for review August 16, 2024 00:22
@jayeshkrishna
Contributor Author

This PR should fix the build issues with HOMME standalone tests on Chrysalis and Anvil after PR #6525

@jayeshkrishna
Contributor Author

We also have a ticket with LCRC systems to install a newer version of the NetCDF Fortran library, but that might take more time than this workaround of adding support for PnetCDF for HOMME standalone builds.

(Note: After PR #6525 we need NetCDF Fortran library > 4.5.0 for SCORPIO builds without PnetCDF)
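
The version requirement in the note above can be checked mechanically. Below is a minimal Python sketch, assuming the library version is available as a string such as the one reported by the NetCDF Fortran config tool; the function names and the sample strings are hypothetical and are not part of this PR.

```python
import re

# From the note above: SCORPIO builds without PnetCDF need a NetCDF Fortran
# library strictly newer than 4.5.0.
MIN_EXCLUSIVE = (4, 5, 0)


def parse_version(text):
    """Return the first dotted version number in `text` as a tuple of ints."""
    match = re.search(r"(\d+)\.(\d+)(?:\.(\d+))?", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    # A missing patch component (e.g. "4.6") is treated as 0.
    return tuple(int(part) if part else 0 for part in match.groups())


def new_enough_without_pnetcdf(version_text):
    """True if the reported version is strictly greater than 4.5.0."""
    return parse_version(version_text) > MIN_EXCLUSIVE


if __name__ == "__main__":
    # Sample strings are illustrative only.
    for sample in ("netCDF-Fortran 4.4.4", "netCDF-Fortran 4.5.0", "netCDF-Fortran 4.5.3"):
        print(sample, "->", new_enough_without_pnetcdf(sample))
```

If the check fails on a machine, the workaround discussed in this PR (building with PnetCDF) is the alternative to upgrading the library.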

@oksanaguba
Contributor

oksanaguba commented Aug 16, 2024

Do we need to wrap these new lines in, say, if machine==chrysalis? Just trying to avoid new behaviour on the other machines we use: weaver, pm, mappy, frontier.

Update: I misread the commit; this is not an issue.

@jayeshkrishna
Contributor Author

These cmake files are specific to the machines (this change should only affect chrysalis and anvil)

@rljacob
Member

rljacob commented Aug 16, 2024

@mahf708 should the "scream defaults" tests above be running on this PR?

@mahf708
Contributor

mahf708 commented Aug 16, 2024

It is set up to run on all PRs (and also regularly in the background) since it is a cheap small action. Perhaps, it shouldn't run on all PRs? The goal was to extend it to all of e3sm after a trial period (so that we can catch any issues in the inputdata server during PR reviews as well as in the background). Let me know if you want me to disable it (or at least lessen its frequency; it's run quite a lot by design... https://github.com/E3SM-Project/E3SM/actions/workflows/eamxx_default_files.yml)

@mt5555
Contributor

mt5555 commented Aug 16, 2024

standalone HOMME has always been able to run with netcdf and/or pnetcdf. This PR is implying that standalone HOMME will now always require pnetcdf? (edit: after reading #6525, is this PR needed because the regular netcdf on LCRC is too old?)

IIUC, the build error is because of the "nf_64bit_data" type, which HOMME gets from SCORPIO. Is this a recent SCORPIO change? How did older versions of SCORPIO make this available when one was only building with netcdf, and wouldn't that older capability of SCORPIO be good to maintain?

pfs/fs1/home/e3smtest/jenkins/workspace/ACME_chrysalis_homme/E3SM/components/homme/utils/externals/scorpio/src/flib/pio_types.F90(281): error #6592: This symbol must be a defined parameter, an enumerator, or an argument of an inquiry function that evaluates to a compile-time constant. [NF_64BIT_DATA]
integer, public, parameter :: PIO_64BIT_DATA = nf_64bit_data
--------------------------------------------------^
/gpfs/fs1/home/e3smtest/jenkins/workspace/ACME_chrysalis_homme/E3SM/components/homme/utils/externals/scorpio/src/flib/pio_types.F90(281): error #6404: This name does not have a type, and must have an explicit type. [NF_64BIT_DATA]
integer, public, parameter :: PIO_64BIT_DATA = nf_64bit_data
--------------------------------------------------^
compilation aborted for /gpfs/fs1/home/e3smtest/jenkins/workspace/ACME_chrysalis_homme/E3SM/components/homme/utils/externals/scorpio/src/flib/pio_types.F90 (code 1)
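
The compiler output above shows that NF_64BIT_DATA could not be resolved when compiling pio_types.F90. As a rough diagnostic, the Python sketch below checks whether the text of a Fortran include file declares that constant. It assumes the NetCDF Fortran constants are exposed in a plain-text include file (the file name and its location are assumptions, not taken from this PR), and the helper name is hypothetical.

```python
import re


def declares_symbol(include_text, symbol="nf_64bit_data"):
    """Return True if `symbol` appears in a non-comment line of Fortran text.

    Fortran is case-insensitive, so the match ignores case. Lines whose first
    character is '!', 'c', 'C' or '*' are treated as fixed-form comments, and
    anything after a '!' on a line is ignored.
    """
    pattern = re.compile(rf"\b{re.escape(symbol)}\b", re.IGNORECASE)
    for line in include_text.splitlines():
        if line[:1] in ("!", "c", "C", "*"):
            continue
        code = line.split("!", 1)[0]
        if pattern.search(code):
            return True
    return False


if __name__ == "__main__":
    # Illustrative text only, not copied from an installed library.
    sample = "      integer nf_64bit_offset\n      integer nf_64bit_data\n"
    print(declares_symbol(sample))
```

To use it on a real installation, read the include file into a string first and pass that string to `declares_symbol`.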

@jayeshkrishna
Contributor Author

jayeshkrishna commented Aug 20, 2024

This change is specific to Anvil and Chrysalis, so on these two machines HOMME standalone builds will need PnetCDF to be available (it is already available on these machines and is used by E3SM).
A recent update changed the default NetCDF file type in CIME from 64bit offset to 64bit data (this change is required for writing large variables and is already set by all production run scripts, for example), and the new default requires support for that type in the NetCDF Fortran library. The current version of the Fortran library on Anvil/Chrysalis does not support the type (the NetCDF C library on these machines does, and so does PnetCDF), and we are in the process of updating the NetCDF Fortran library on Anvil/Chrysalis.
However, adding support for PnetCDF on these machines lets HOMME standalone builds and standalone tests use the new NetCDF output type without upgrading the NetCDF Fortran library, which gets the HOMME standalone nightly tests running right away on these machines.
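
For readers unfamiliar with the two file types mentioned above: the 64bit offset and 64bit data formats can be told apart by the first bytes of the file. The Python sketch below is a minimal illustration and not part of this PR; the function names are hypothetical, and it only looks for an HDF5 signature at offset 0.

```python
# Magic numbers at the start of a NetCDF file: "CDF" followed by a version byte.
CDF_FORMATS = {
    b"CDF\x01": "classic (CDF-1)",
    b"CDF\x02": "64-bit offset (CDF-2)",
    b"CDF\x05": "64-bit data (CDF-5)",
}
# NetCDF-4 files are HDF5 files and start with the HDF5 signature.
HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"


def netcdf_format_from_header(header):
    """Classify a file from its first 8 bytes."""
    if header[:8] == HDF5_SIGNATURE:
        return "netCDF-4 (HDF5)"
    return CDF_FORMATS.get(header[:4], "unknown")


def netcdf_format(path):
    """Read the first 8 bytes of `path` and classify the file."""
    with open(path, "rb") as handle:
        return netcdf_format_from_header(handle.read(8))
```

Running `netcdf_format` on an output file from a standalone test is one way to confirm which format was actually written.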

@jayeshkrishna
Contributor Author

We have been thinking of updating the default NetCDF output format to 64bit_data for a while (and more and more users are running into the format constraints of the older NetCDF output format at runtime).

@jayeshkrishna
Contributor Author

As long as a relatively recent version of the NetCDF Fortran library is available, HOMME standalone tests should build and run
without any change.

@rljacob rljacob changed the title Adding support for PnetCDF for HOMME on Anvil and Chrysalis Add support for PnetCDF for HOMME on Anvil and Chrysalis Aug 21, 2024
@jayeshkrishna
Contributor Author

@oksanaguba / @mt5555 : If you are ok with the change please go ahead and approve/merge this change (Merging this change will get the HOMME standalone nightly tests running again on Anvil/Chrysalis)

oksanaguba added a commit that referenced this pull request Aug 22, 2024
Adding support for PnetCDF for HOMME standalone builds on
Anvil and Chrysalis

[BFB]
@oksanaguba oksanaguba merged commit c6ac698 into master Aug 22, 2024
13 checks passed
@oksanaguba oksanaguba deleted the jayeshkrishna/homme/add_pnetcdf_for_standalone branch August 22, 2024 17:17
@oksanaguba
Contributor

done

@jayeshkrishna
Contributor Author

@oksanaguba : Was this PR merged to next?

@jayeshkrishna
Contributor Author

I see it now, thanks

oksanaguba added a commit that referenced this pull request Aug 26, 2024
…' (PR #6561)

Adding support for the PnetCDF library in HOMME for the
BFB tests on Anvil and Chrysalis.

Also see related PR #6552

[BFB]