Container #678

Merged
merged 12 commits into from
Sep 16, 2024

Conversation

aradhakrishnanGFDL
Collaborator

Description
Adds a container build to the GitHub Actions workflow, along with an accompanying Dockerfile.

How Has This Been Tested?
Tested at GFDL using podman. Local test instructions passed to @jtmims. Works on synthetic data and the example POD. Works only partially on forcing feedback, but that replicates the non-container workflow. Open to more tests and documentation.

@aradhakrishnanGFDL aradhakrishnanGFDL marked this pull request as draft August 30, 2024 16:50
@wrongkindofdoctor wrongkindofdoctor added the feature-request (New feature or request) and containers (Podman, Singularity, and Docker containers) labels Sep 3, 2024
@jtmims jtmims self-assigned this Sep 4, 2024
@jtmims
Collaborator

jtmims commented Sep 4, 2024

Great news @aradhakrishnanGFDL! I was able to pull the Docker package from the GitHub repository and launch it on my Windows machine. Inside, I was able to run the example_multicase POD on some synthetic data, and it ran and plotted. It's functional! It's really cool seeing a container working on some data! I did have some trouble tracking down the micromamba/conda env related paths, but that is something that could be cleared up with some docs.

@aradhakrishnanGFDL
Collaborator Author

@wrongkindofdoctor do you prefer merging this into main, or maintaining a separate container branch until things seem to be working as expected? There is a CI workflow that builds a Docker image and pushes it to the GitHub Container Registry (ghcr.io). Merging it into main does not seem harmful, except that tests for the container are not part of CI right now.

@aradhakrishnanGFDL
Collaborator Author

@jtmims @wrongkindofdoctor I tested the workflow with another PR that pushes to a container branch.

Here is the actions workflow -

https://github.com/NOAA-GFDL/MDTF-diagnostics/actions/runs/10891122710/job/30221357884

@jtmims you can now use this for the docs instead of my ghcr.io pointers. Hope this helps.

https://github.com/noaa-gfdl/MDTF-diagnostics/pkgs/container/mdtf-diagnostics

@aradhakrishnanGFDL aradhakrishnanGFDL marked this pull request as ready for review September 16, 2024 20:10
@aradhakrishnanGFDL
Collaborator Author

@wrongkindofdoctor do you prefer merging this into main, or maintaining a separate container branch until things seem to be working as expected? There is a CI workflow that builds a Docker image and pushes it to the GitHub Container Registry (ghcr.io). Merging it into main does not seem harmful, except that tests for the container are not part of CI right now.

I have a container branch created. You can delete it as you see fit. I am okay with this PR being merged into main, in any case. I leave it to you and @jtmims.

@wrongkindofdoctor wrongkindofdoctor merged commit 9d4aab8 into NOAA-GFDL:main Sep 16, 2024
4 checks passed
@jtmims jtmims mentioned this pull request Nov 22, 2024
@jtmims jtmims mentioned this pull request Dec 17, 2024
wrongkindofdoctor added a commit that referenced this pull request Dec 19, 2024
* Container (#678)

* Create Dockerfile

works with synthetic example_multicase POD

* Update Dockerfile

* Update Dockerfile

* Create docker-build-and-push.yml

* Update docker-build-and-push.yml

* Update docker-build-and-push.yml

* Update docker-build-and-push.yml

* Update docker-build-and-push.yml

* Container Documentation (#687)

* Create container_config_demo.jsonc

* Create container_cat.csv

* Create container_cat.json

* Update container_config_demo.jsonc

* docs

* Update ref_container.rst

* Update ref_container.rst

* Update ref_container.rst

* Update ref_container.rst

* Update ref_container.rst

* Update dev_start.rst

* Update ref_container.rst

* Update dev_start.rst

* Update ref_container.rst

* Update doc/sphinx/dev_start.rst

Co-authored-by: Jess <[email protected]>

* Update doc/sphinx/ref_container.rst

Co-authored-by: Jess <[email protected]>

* Update doc/sphinx/ref_container.rst

Co-authored-by: Jess <[email protected]>

* Update doc/sphinx/ref_container.rst

Co-authored-by: Jess <[email protected]>

* Update doc/sphinx/dev_start.rst

Co-authored-by: Jess <[email protected]>

---------

Co-authored-by: Jess <[email protected]>

* Fix ci bugs (#688)

* fix unresolved conda_root ref in pod_setup
comment out no_translation setting for matching POD and runtime conventions for testing

* fix coord_name def in translate_coord

* define var_id separately in pp query

* change new_coord definition to obtain ordered dict instead of generator object in translation.create_scalar_name so that deepcopy can pickle it

* change logic in pod_setup to set translation object to no_translation only if translate_data is false in runtime config file

* uncomment more set1 pods that pass initial testing in house

* add checks for no_translation data source and assign query atts using the var object instead of the var.translation object if True to preprocessor

* remove old comment from preprocessor

* change value for hourly data search in datelabel get_timedelta_kwargs to return 1hr instead of hr so that the frequency for hourly data matches the required catalog specification

* comment out some set1 tests, since they are timing out on CI

* rename github actions test config files
split group 1 CI tests into 2 runs to avoid timeout issues

* update mdtf_tests.yml to reference new config file names and clean up deprecated calls

* update mdtf_tests.yml

* update matrix refs in mdtf_tests.yml

* revert changes to datelabel and move hr --> 1hr freq conversion to preprocessor

* delete old test files
just run 1 POD in set1 tests
try adding timeouts mdtf_tests.yml

* fix typo in timeout call in mdtf_tests

* fix GFDL entries in test catalogs

* fix varid entries for wvp in test catalogs

* change atmosphere_mass_content_of_water_vapor id from prw to wvp in gfdl field table

* comment out long_name check in translation.py

* define src_unit for coords if available in preprocessor.ConvertUnitsFunction
redefine dest_unit using var.units.units so that parm is a string instead of a Units.units object in call to units.convert_dataarray

* log warning instead of raising error if attr name doesn't match in xr_parser.compare_attr so that values can be converted later

* fix variable refs in xarray datasets in units.convert_dataarray
add check to convert mb to hPa to convert_dataarray

* fix frequency entries for static vars in test catalogs

* remove duplicate realm entries from stc_eddy_heat_fluxes settings file

* remove non alphanumeric chars from atts in xr_parser check_metadata

* comment out non-working PODs in set 3 tests

* Remove timeout lines and comment unused test tarballs in mdtf_tests.yml

* infer 'start_time' and 'end_time' from 'time_range' due to type issues (#691)

* infer 'start_time' and 'end_time' from 'time_range' due to type issues

* add warning

* fix ci issue

* move line setting date_range in query_catalog() (#693)

* move line setting date_range in query_catalog()

* cleanup print

* Remove modifier entry from areacello in trop_pac_sea_lev POD settings file

* Fix issues in pp query (#692)

* fix hr -> 1hr freq conversion in pp query
try using regex string contains standard_name in query

* add check for parameter type to xr_parser approximate_attribute_value

* remove regex from pp query standard_name

* add check that bounds is populated in cf.assessor, then check coord attrs and only run coord bounds check if bounds are not None in xr_parser

* add escape brackets to command-line commands (#694)

* Fix convective_transition_diag POD (#695)

* fix ctd file formatting and typos

* more formatting and typo fixes in ctd POD

* uncomment convective transition diag POD in 1a CI test config files

* try moving convective_transition_pod to ubuntu suite 2 tests

* add wkdir cleanup between each test run step and separate obs data fetching for set 1 tests in ci config file

* move convective_transition_diag POD to set 1b tests

* just run 1 POD in set 1a and 2 PODs in set 1b to avoid runner timeouts

* reorganize 1b tests

* add ua200-850 and va200-850 to gfdl-cmor-tables (#696)

* add ice/ocean precip entries to GFDL fieldlist (#697)

* Add alternate standard names entry to fieldlists and varlistEntry objects (#699)

* add alternate_standard_names entries to precipitation_flux vars in CMIP and GFDL fieldlists
add list of applicable realms to precipitation flux

* add alternate_standard_names attributes and property setters to DMDependentvariable class that is VarlistEntry parent class
define realm parm as string or list

* extend realm search in fieldlist lookup tables to use a realm list in the translation
add list to realm type hints in translation module

* extend standard_name query to list that includes alternate_standard_names if present in the translation object

* break up rainfall_flux and precipitation_flux entries in CMIP and GFDL field tables since translator can't parse realm list correctly

* revert realm type hints defined as string or list and casting realm strings to lists in translation module

* change assertion to log error if translation is None in varlist_util

* define new standard_name for pp xarray vars using the translation standard_name if the query standard name is a list with alternates instead of a string

* add function check_multichunk to fix issue with chunk_freqs (#701)

* add function check_multichunk to fix issue with chunk_freqs

* fix function comment

grammar grammar grammar

* move log warning

* add plots link to pod_error_snippet.html (#705)

* add plots link to pod_error_snippet.html

* remove empty line

* add variable table tool and put output into docs (#706)

* add variable table script to docs

* move file

* Delete tools/get_POD_varname/MDTF_Variable_Lists.html

* rework ref_vartable.rst to link directly to html file of the table (#707)

* rework ref_vartable.rst to link directly to html file of the table

* Delete doc/sphinx/MDTF_Variable_Lists.html

* Update MDTF_Variable_Lists.html

* remove example_pp_script.py from user_pp_scripts list in multirun_config_template.jsonc

* remove .nc files found in OUTPUT_DIR depending on config file (#710)

* fix formatting issues in output reference documentation (#711)

* fix forcing_feedback settings.jsonc formatting and remove extra freq entries

* Add check for user_pp_scripts attribute in config object to DaskMultifilePP init method

* add check for user_pp-scripts attr to execute_pp_functions

* update 'standard_name' for each var in write_pp_catalog (#713)

* Update docs about --env_dir flag (#715)

* Update README.md

* Update start_install.rst

* fix logic when defining log messages in pod_setup

* Fix dummy translation method in NoTranslationFieldlist (#717)

* define missing entries in dummy translation object returned by NoTranslationFieldlist.translate
add logic to determine alternate_standard_names attribute to NoTranslationFieldlist.translate

* set translate_data to false for testing

* edit logging message for no translation setting in pod_setup

* add todo to translation translate_coord and cleanup comments

* remove checks for no_translation from preprocessor

* define TranslatedVarlistEntry name attribute using data convention field table variable id

* revert debugging changes from test config file

* update docs for translate_data flag in the runtime config file

* fix variable_id and var_id refs in dummy translate method

* Reimplement crop date range capability (#718)

* add placeholder functions for date range cropping

* refine crop_date_range function. Need to figure out how to pass calendar from subset df

* continue reworking crop_date_range

* revert changes to check_group_daterange, and add check that input files overlap start and end times
add option aggregate=false to to_dataset_dict call
look into replacing check_time_bounds with crop date range call before the xarray merge

* reorder crop_date_range call
add calls to parse xr time coord and define start and end times for dataset

* finalize logic in crop_date_range

* remove start_time and end_time from, and add time_range column to catalog generated by define_pp_catalog_assets

* replace start_time and end_time entries with time_range entries populated from information in processed xarray dataset in write_pp_catalog

* remove unused dask import from preprocessor

* replace hard coded time dimension name with var.T.name in call to xarray concatenate

* add check_time_bounds call back to query and fix definitions for modified start and end points so that they use the dataset information

* fix hour, min, sec defs in crop_date_range for new start and end times

* strip non-numeric chars from strings passed to _coerce_to_datetime

* add logic to define start and end points for situation where desired date range is contained by xarray dataset to crop_date_range

* Create drop attributes func (#720)

* fix forcing_feedback settings formatting

* add check for user_pp_scripts attribute before looping through list to multifilepreprocessor add_user_pp_scripts method

* add snakeviz to env_dev.yml

* move drop_atts loop to a separate function that is called by crop_date_range and before merging datasets in query_catalog in the preprocessor

* Update mdtf dev env file (#722)

* add snakeviz, gprof2dot, and intake-esgf packages to env_dev file

* add viztracer to dev environment file

* add kerchunk package to dev environment

* Fix various pp issues related to running seaice_suite (#721)

* fix pp issues for seaice_suite

* fix arg issue

* rename functions

* add default return for conversion function

---------

Co-authored-by: Aparna Radhakrishnan <[email protected]>
Co-authored-by: Jess <[email protected]>
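
One entry in the "Fix ci bugs (#688)" group above replaces a generator object with an ordered dict "so that deepcopy can pickle it". The sketch below shows the general Python behavior behind that fix. It is not the MDTF code: `ScalarCoord`, `axes`, and `copy_error` are hypothetical names made up for illustration, and only the standard library is assumed.

```python
import copy
from collections import OrderedDict

# Hypothetical stand-in data; not taken from the MDTF code base.
axes = {"X": "lon", "Y": "lat", "T": "time"}


class ScalarCoord:
    """Hypothetical object that stores coordinate attributes."""

    def __init__(self, attrs):
        self.attrs = attrs


def copy_error(obj):
    """Return the deepcopy error message, or None if the copy succeeds."""
    try:
        copy.deepcopy(obj)
    except TypeError as exc:
        return str(exc)
    return None


# A generator expression is lazy and cannot be pickled, so deepcopy fails
# on any object that holds one.
lazy = ScalarCoord((key, value) for key, value in axes.items())

# Materializing the same pairs into an OrderedDict makes the object copyable.
eager = ScalarCoord(OrderedDict((key, value) for key, value in axes.items()))

print(copy_error(lazy))   # a TypeError message that mentions the generator
print(copy_error(eager))  # None
```

The trade-off is that the ordered dict is built eagerly, which costs a little memory but gives a plain container that `copy.deepcopy` handles without special cases.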
Labels
containers Podman, Singularity, and Docker containers feature-request New feature or request