Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adds base parallel IO module to OMEGA #43

Merged
merged 3 commits into from
Jan 18, 2024

Conversation

philipwjones
Copy link

@philipwjones philipwjones commented Jan 3, 2024

The base IO layer for OMEGA

  • includes OMEGA-aware wrappers around SCORPIO for writing both data and meta data to be used by IOStreams
  • adds a unit test that writes/reads arrays and metadata to/from a test file
  • adds documentation for both user and developer guides

This only performs I/O from the host and is not meant as the primary interface for I/O, just a layer between IOStreams
and Scorpio that can be used while IOStreams is developed. The interface changed a little since the Decomp PR so
this also includes some interface changes within Decomp.

All unit tests pass on Chrysalis with Intel. Will test on other platforms/compilers.

Checklist

  • Documentation:
    • Design document has been generated and added to the docs
    • User's Guide has been updated
    • Developer's Guide has been updated
    • Documentation has been built locally and changes look as expected
  • Testing
    • A comment in the PR documents testing used to verify the changes including any tests that are added/modified/impacted.
    • CTest unit tests for new features have been added per the approved design.
    • Unit tests have passed. Please provide a relevant CDash build entry for verification.

Copy link

@grnydawn grnydawn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@philipwjones , I successfully ran testIO on both of Frontier and Chrysalis.

NOTE: For testing, I built and used libmetis.a and libparmetis.a. However, I encountered an issue due to not adding libGKlib.a in the linking process. So, I temporarily added libGKlib.a. Since I am new to using these libraries, it's possible that the issue with the missing libGKlib.a might not be a problem for other developers.

@philipwjones
Copy link
Author

@grnydawn Yes, there are multiple versions of metis/parmetis The github version has an outdated cmake setup in addition to splitting out the GKlib. For those reasons, we've been using the release tarball rather than the github version. I suspect we'd probably want to use a spack installation long-term as part of Xylar's overall development environment but I haven't looked into that one yet to see how that is configured.

@xylar
Copy link

xylar commented Jan 9, 2024

@grnydawn and @philipwjones, we should have a discussion sometime soon about how to handle the Omega dependencies.

In Compass and Polaris, I've been building dependencies using Spack. This might be the best way to handle them in E3SM, too. Currently E3SM has a ton of submodules and builds a lot of its dependencies when it gets built. While that might simplify the process for supporting new machines, it makes build times long (and redundant).

Metis (and therefore GKlib) is certainly available from Spack:
https://github.com/spack/spack/blob/develop/var/spack/repos/builtin/packages/metis/package.py

Also, Spack will work with shared libraries by default but can build static libraries if we want that for some reason. It looks like maybe static libraries are currently assumed based on the filenames above?

@xylar
Copy link

xylar commented Jan 9, 2024

I am guessing we can assume we will be working in "standalone" mode in Polaris for awhile so Polaris can use Spack to build any dependencies that Omega doesn't build itself for the time being.

@brian-oneill
Copy link

I've built and run on Chrysalis and Frontier. While ctest doesn't register any failure, looking at the logs the string tests fail in all circumstances, e.g:

[2024-01-09 10:23:36.439] [info] [IOTest.cpp:544] IOTest: read/write file metadata string test FAIL
and
[2024-01-09 10:23:36.633] [info] [IOTest.cpp:775] IOTest: read/write var metadata string test FAIL

@grnydawn
Copy link

grnydawn commented Jan 9, 2024

@philipwjones, I created a PR(#44) to add FindParmetis.cmake that search parmetis, metis, and optionally gklib libraries. For now, it is assumed that all three libraries are static libraries and user should provide OMEGA_PARMETIS_ROOT, at minimum, in the CMake command-line.

@grnydawn
Copy link

grnydawn commented Jan 9, 2024

@xylar , @philipwjones , I agree that we need to discuss the Omega dependencies. Current Omega build system pulls all the dependency information from the E3SM machine file. So, when we need to update the Omega dependencies, I think, we will eventually need to update E3SM machine file too.

@grnydawn grnydawn self-requested a review January 9, 2024 17:09
@grnydawn
Copy link

grnydawn commented Jan 9, 2024

@brian-oneill , @philipwjones , It was a nice catch to identify this bug. 'IO_TEST' should be added in 'set_tests_properties'. I have requested code changes.

I've built and run on Chrysalis and Frontier. While ctest doesn't register any failure, looking at the logs the string tests fail in all circumstances, e.g:

[2024-01-09 10:23:36.439] [info] [IOTest.cpp:544] IOTest: read/write file metadata string test FAIL and [2024-01-09 10:23:36.633] [info] [IOTest.cpp:775] IOTest: read/write var metadata string test FAIL

@philipwjones
Copy link
Author

Thanks for catching @brian-oneill and @grnydawn I'll push fixes soon

@philipwjones
Copy link
Author

@brian-oneill @grnydawn I pushed fixes for both the ctest issue and the string metadata read/writes. The latter required a change to readMeta and writeMeta interfaces. They now are aliased based on data type and no longer use the void pointer - probably better that way anyway.

  - includes OMEGA-aware wrappers around SCORPIO for writing
    both data and meta data to be used by IOStreams
  - adds a unit test that reads/writes a test file
  - adds documentation for both user and developer guides
@philipwjones
Copy link
Author

@grnydawn I've brought this branch up to date with the recently merged cmake changes so if you could re-review, I think I've made all the changes you requested.

Copy link

@grnydawn grnydawn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@philipwjones the update looks good to me. My ctest run on Chrysalis was successful. Frontier seems not responding now, but the build was successful too.

Copy link

@brian-oneill brian-oneill left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Retested with latest changes on Chrysalis and Frontier, all good.

@philipwjones philipwjones merged commit 60b8f55 into E3SM-Project:develop Jan 18, 2024
2 checks passed
@philipwjones philipwjones deleted the omega/io branch January 24, 2024 21:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants