CDAT Migration Phase 2: Refactor lat_lon set #658

Closed
30 tasks done
tomvothecoder opened this issue Feb 21, 2023 · 1 comment · Fixed by #677
Labels
cdat-migration-fy24 CDAT Migration FY24 Task Hard A subjectively hard task.

Comments

@tomvothecoder
Collaborator

tomvothecoder commented Feb 21, 2023

Overview
The components to refactor include the driver, the plotter, and the viewer. Each file has a set of functions to refactor and module references that might need to be refactored as well.

We will use the "bubble context" refactoring method for this task.

File 1 - lat_lon_driver.py

Order of function calls: run_diag() -> create_metrics() -> create_and_save_data_and_metrics()

  • Refactor utils.dataset.Dataset class methods to operate on xr.Dataset/xr.DataArray objects
    a. Create skeleton methods for methods that need to be refactored (append _new to name)
    b. Add failing unit tests for these methods
    c. Implement new methods
    d. Update references from old methods to new methods
  • Refactor general utilities that are called by the driver to operate on xarray objects (e.g., climo.py)
    a. Create a utils_new.py
    b. Add skeleton function definitions for new general utilities
    c. Add failing unit tests for these functions
    d. Implement new general utilities
    e. Update references from old utilities to new utilities
  • Refactor create_and_save_data_and_metrics() and create_metrics()
    a. These functions call e3sm_diags.metrics functions
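
Steps a-d for the Dataset class could be sketched roughly as below. This is only an illustration of the workflow: the class body, the return values, and the helper `skeleton_is_unimplemented` are hypothetical stand-ins, not the actual e3sm_diags code. Only the `_new` suffix convention comes from the list above.

```python
class Dataset:
    """Hypothetical stand-in for utils.dataset.Dataset (not the real class)."""

    def get_climatology_variable(self, var, season):
        # Old method. In the real code base this path would go through cdms2;
        # here it just returns a placeholder record.
        return {"var": var, "season": season, "backend": "cdms2"}

    def get_climatology_variable_new(self, var, season):
        # Step (a): skeleton with the `_new` suffix, intentionally unimplemented.
        # Step (c) would replace this with an xarray-based implementation.
        raise NotImplementedError("get_climatology_variable_new")


def skeleton_is_unimplemented(ds):
    """Step (b): the condition a deliberately failing unit test exercises."""
    try:
        ds.get_climatology_variable_new("PRECT", "ANN")
    except NotImplementedError:
        return True
    return False
```

Once the new method is implemented (step c), the test flips to passing and call sites can be switched from the old name to the `_new` name (step d).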

Module References - tree diagram

These are the other modules referenced in lat_lon_driver.py that need to be refactored to operate on xr.DataArray/xr.Dataset objects.

  • e3sm_diags.utils
    • dataset_new.py (class Dataset) - Refactor methods that call cdms2 (e.g., cdms2.open() )
      • get_static_variable()
      • get_attr_from_climo()
      • get_climatology_variable()
        • _get_climo_var()
        • get_timeseries_var()
        • _get_var_from_timeseries_file()
        • e3sm_diags/driver/utils/climo.py -> create climo_xr.py
    • io.py (replaces I/O functions in general.py)
      • get_output_dir()
      • save_ncfiles()
      • get_name_and_yrs() -- added as a method to Dataset class
    • regrid.py (replaces regridding functions in general.py)
      • (NEW) has_z_axis (replaces cdms2.axis.getLevel())
      • regrid_to_lower_res() - 100% identical results
      • convert_to_pressure_levels() -- Really close (1e-6 max abs and 1e-8 max rel diffs)
      • select_region() -- Really close (1e-7 max abs and rel diffs without land/sea region masking lower limit)
        • Opened an xcdat GitHub discussion post to figure out why the xesmf and cdms2 esmf regridders produce different regridded land/sea masks. The cause was cdms2 not importing esmf properly, which has since been patched
        • Replaced by _apply_land_sea_mask() and _subset_on_region()
  • e3sm_diags.metrics_xr.py (replaces metrics/__init__.py)
    • corr - virtually identical (1e-14 max abs and rel diffs)
    • mean - virtually identical (1e-12 max abs and 1e-14 max rel diffs)
    • rmse - virtually identical (1e-13 max abs and 1e-14 max rel diffs)
    • std - virtually identical (1e-14 max abs and 1e-15 max rel diffs)
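
The "max abs" and "max rel" figures quoted in the list above can be reproduced with a small helper along these lines. This is a sketch under assumptions: the function name `max_abs_and_rel_diff` is hypothetical, and the relative difference is taken against the reference (old) values, which may not match how the numbers above were actually computed.

```python
import numpy as np


def max_abs_and_rel_diff(actual, desired):
    """Return (max absolute diff, max relative diff) between two arrays.

    `desired` is treated as the reference (e.g., the old cdms2-based result).
    Relative diffs skip points where the reference is exactly zero.
    """
    actual = np.asarray(actual, dtype=np.float64)
    desired = np.asarray(desired, dtype=np.float64)

    abs_diff = np.abs(actual - desired)
    max_abs = float(abs_diff.max())

    nonzero = desired != 0
    if nonzero.any():
        max_rel = float((abs_diff[nonzero] / np.abs(desired[nonzero])).max())
    else:
        max_rel = 0.0

    return max_abs, max_rel
```

For example, comparing `[1.0, 2.5, 4.0]` against the reference `[1.0, 2.0, 4.0]` gives a max absolute diff of 0.5 and a max relative diff of 0.25.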

File 2 - lat_lon_plot.py

Functions to Refactor

  • e3sm_diags.plot.plot()
    • _get_plot_fnc()
      • lat_lon_plot.plot() -- moved sub-functions below to plot/utils.py
        • plot_panel()
        • determine_tick_step()
        • get_ax_size()
        • add_cyclic()
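
For readers unfamiliar with what a helper like add_cyclic() typically does, here is a minimal NumPy sketch of appending a wrap-around longitude point so a global map has no seam. The function name `add_cyclic_point`, its signature, and the assumptions that longitude is the last axis and is given in degrees are all illustrative. This is not the e3sm_diags implementation.

```python
import numpy as np


def add_cyclic_point(data, lon):
    """Append the first longitude column to the end of the field.

    Assumes longitude is the last axis of `data` and `lon` is in degrees
    on a global grid (hypothetical helper for illustration only).
    """
    data = np.asarray(data)
    lon = np.asarray(lon, dtype=np.float64)

    # Repeat the first longitude column at the end of the data.
    cyclic_data = np.concatenate([data, data[..., :1]], axis=-1)
    # The repeated point sits 360 degrees past the first longitude.
    cyclic_lon = np.append(lon, lon[0] + 360.0)

    return cyclic_data, cyclic_lon
```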

Module References

  • e3sm_diags.derivations.default_regions - region_specs
  • e3sm_diags.driver.utils.general - get_output_dir()
  • e3sm_diags.plot - get_color_map()

File 3 - default_viewer.py (backlogged for a later time, once all sets have been refactored).

Functions to Refactor

  • create_viewer() -> seasons_used(), _get_description(), _add_to_lat_lon_metrics_table(), create_metadata(), and _add_information_to_viewer()

Module References

  • e3sm_diags.parser - SET_TO_PARSER
  • e3sm_diags.viewer.utils - add_header(), h1_to_h3()
  • e3sm_diags.viewer.lat_lon_viewer - generate_lat_lon_metrics_table(), generate_lat_lon_taylor_diag(), generate_lat_lon_cmip6_comparison()
@tomvothecoder
Collaborator Author

tomvothecoder commented Apr 24, 2023

Notes from 4/11/23 Meeting:

Validating the ocean fraction climatology calculations between the old and new climo functions. Next steps:

  1. Try to figure out why climatology outputs have bigger than expected diffs (floating point error? use of np.sum vs. np.einsum?)
  2. Try to get results as close as possible
  3. Check if the slice_flag implementation is needed to add an extra coordinate point in the new Dataset method for subsetting time series variables (does it help improve results significantly?)
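
One way to probe the np.sum vs. np.einsum question from step 1 is to compute the same weighted time average both ways and look at the largest difference. The sketch below uses random data and days-per-month weights purely as an assumption. The function names are hypothetical and this does not reproduce the actual old or new climo functions.

```python
import numpy as np


def weighted_mean_sum(data, weights):
    # Broadcast normalized weights over (time, lat, lon) and reduce with sum.
    w = weights / weights.sum()
    return (data * w[:, None, None]).sum(axis=0)


def weighted_mean_einsum(data, weights):
    # Same contraction over the time axis, expressed with einsum.
    w = weights / weights.sum()
    return np.einsum("t,tyx->yx", w, data)


rng = np.random.default_rng(0)
data = rng.random((12, 4, 8))  # (time, lat, lon), synthetic values
weights = np.array(
    [31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31], dtype=np.float64
)

a = weighted_mean_sum(data, weights)
b = weighted_mean_einsum(data, weights)
max_abs_diff = float(np.abs(a - b).max())
print(f"max abs diff between sum and einsum: {max_abs_diff:.3e}")
```

The two reductions can accumulate terms in a different order, so tiny floating point differences are possible. If the observed diffs between the old and new climatologies are much larger than what this kind of comparison shows, the cause is more likely in the inputs (e.g., the time subsetting in step 3) than in the summation method.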
