[DOC] Add CDAT API mapping table and gallery examples #239

Merged
merged 29 commits into from
Jun 10, 2022

@tomvothecoder tomvothecoder commented May 24, 2022

Description

Preview of API Reference Page: https://xcdat.readthedocs.io/en/docs-238-tutorials/api.html#cdat-mapping-table
Preview of Gallery: https://xcdat.readthedocs.io/en/docs-238-tutorials/gallery.html

Summary of Changes

Checklist

  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • My changes generate no new warnings
  • Any dependent changes have been merged and published in downstream modules

If applicable:

  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass with my changes (locally and CI/CD build)
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have noted that this is a breaking change for a major release (fix or feature that would cause existing functionality to not work as expected)

@tomvothecoder tomvothecoder added the type: docs Updates to documentation label May 24, 2022
@tomvothecoder tomvothecoder self-assigned this May 24, 2022
@tomvothecoder tomvothecoder changed the title Add CDAT mapping table and xCDAT examples Add CDAT API mapping table and user guide notebooks May 24, 2022
codecov-commenter commented May 24, 2022

Codecov Report

Merging #239 (40ec05e) into main (1fdc8a9) will not change coverage.
The diff coverage is 100.00%.

@@            Coverage Diff            @@
##              main      #239   +/-   ##
=========================================
  Coverage   100.00%   100.00%           
=========================================
  Files            9         8    -1     
  Lines          742       721   -21     
=========================================
- Hits           742       721   -21     
| Impacted Files | Coverage Δ |
|---|---|
| xcdat/utils.py | 100.00% <ø> (ø) |
| xcdat/__init__.py | 100.00% <100.00%> (ø) |
| xcdat/axis.py | 100.00% <100.00%> (ø) |
| xcdat/bounds.py | 100.00% <100.00%> (ø) |
| xcdat/dataset.py | 100.00% <100.00%> (ø) |
| xcdat/spatial.py | 100.00% <100.00%> (ø) |
| xcdat/temporal.py | 100.00% <100.00%> (ø) |

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 68d8ae1...40ec05e. Read the comment docs.

docs/api.rst Show resolved Hide resolved
@tomvothecoder tomvothecoder changed the title Add CDAT API mapping table and user guide notebooks [DOCS] Add CDAT API mapping table and user guide notebooks May 25, 2022
@tomvothecoder tomvothecoder changed the title [DOCS] Add CDAT API mapping table and user guide notebooks [DOC] Add CDAT API mapping table and user guide notebooks May 25, 2022
docs/api.rst Outdated
Comment on lines 173 to 235
``cf_xarray`` for interpreting CF-compliant attributes
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

``xcdat`` leverages ``cf_xarray`` for interpreting CF-compliant attributes (``.attrs``) present on ``xarray.DataArray`` or ``xarray.Dataset`` objects.

.. pull-quote::
    One powerful feature of ``cf_xarray`` is the ability to refer to named dimensions by standard ``axis`` or ``coordinate`` names in Dataset or DataArray methods.

    -- https://cf-xarray.readthedocs.io/en/latest/coord_axes.html#axes-and-coordinates:

.. code-block:: python

    from cf_xarray.datasets import airds

    airds.cf

    Coordinates:
    - CF Axes: * X: ['lon']
               * Y: ['lat']
               * T: ['time']
                 Z: n/a

    - CF Coordinates: * longitude: ['lon']
                      * latitude: ['lat']
                      * time: ['time']
                        vertical: n/a

    - Cell Measures: area: ['cell_area']
                     volume: n/a

    - Standard Names: * latitude: ['lat']
                      * longitude: ['lon']
                      * time: ['time']

    - Bounds: n/a

    Data Variables:
    - Cell Measures: area, volume: n/a

    - Standard Names: air_temperature: ['air']

    - Bounds: n/a


.. list-table::
    :widths: 20 40 40
    :header-rows: 1

    * - How do I...
      - cf_xarray
      - CDAT
    * - Get latitude coordinates (Y axis)?
      - ``Dataset.cf["Y"]``, ``Dataset.cf["lat"]``, or ``Dataset.cf["latitude"]``
      - ``TransientVariable.getLatitude()``
    * - Get longitude coordinates (X axis)?
      - ``Dataset.cf["X"]``, ``Dataset.cf["lon"]``, or ``Dataset.cf["longitude"]``
      - ``TransientVariable.getLongitude()``
    * - Get time coordinates (T axis)?
      - ``Dataset.cf["T"]`` or ``Dataset.cf["time"]``
      - ``TransientVariable.getTime()``
    * - Get vertical coordinates (Z axis)?
      - ``Dataset.cf["Z"]``
      - ``TransientVariable.getLevel()``

Hi @durack1, let me know how this looks based on your suggestion from a previous comment.


On second thought, this information might be best placed in another repo (e.g., xcdat_utils) since the APIs here aren't from xCDAT.

@tomvothecoder tomvothecoder force-pushed the docs/238-tutorials branch 2 times, most recently from b3b4289 to 6f674b7 Compare June 1, 2022 16:53
@tomvothecoder tomvothecoder changed the title [DOC] Add CDAT API mapping table and user guide notebooks [DOC] Add CDAT API mapping table and gallery examples Jun 1, 2022

@tomvothecoder tomvothecoder requested review from pochedls and lee1043 June 1, 2022 18:48
@tomvothecoder tomvothecoder marked this pull request as ready for review June 1, 2022 18:52

lee1043 commented Jun 1, 2022

@tomvothecoder this is really great, thank you for adding this suite of documents.

  • I think it is great that notebooks use input data from ESGF node directly.

  • Temporal average notebook

    • I like that you used xmovie for the movie GIF and provided the few lines of code to generate it, which I think is very useful.
    • I wonder if you can use higher frequency input. While I think it is fine to use the monthly input for freq=season or year, using daily input for freq=month may demonstrate the functionality better. Likewise using 3-hourly input for freq=day may demonstrate better. Or you can leave a note if using higher frequency input data slows down too much because of its bigger file size.
  • Spatial average notebook

    • In its cell [9] for defining Nino 3.4 region, could you update the lat lon region as follows:
      • "Niño 3.4 (5N-5S, 170W-120W): The Niño 3.4 anomalies may be thought of as representing the average equatorial SSTs across the Pacific from about the dateline to the South American coast. The Niño 3.4 index typically uses a 5-month running mean, and El Niño or La Niña events are defined when the Niño 3.4 SSTs exceed +/- 0.4C for a period of six months or more." (from NCAR)

@tomvothecoder

Thanks for the feedback @lee1043!

> I wonder if you can use higher frequency input. While I think it is fine to use the monthly input for freq=season or year, using daily input for freq=month may demonstrate the functionality better. Likewise using 3-hourly input for freq=day may demonstrate better. Or you can leave a note if using higher frequency input data slows down too much because of its bigger file size.

Your suggestion makes sense to me. I'll try using daily and 3-hr inputs for those freq args. I realized file size isn't an issue if we commit notebooks with cells that are already executed so our documentation build doesn't have to do it at runtime.

> "Niño 3.4 (5N-5S, 170W-120W): The Niño 3.4 anomalies may be thought of as representing the average equatorial SSTs across the Pacific from about the dateline to the South American coast. The Niño 3.4 index typically uses a 5-month running mean, and El Niño or La Niña events are defined when the Niño 3.4 SSTs exceed +/- 0.4C for a period of six months or more." (from NCAR)

I noticed that sub-setting for Niño 3.4 was done with lat_bounds=(-5, 5) and lon_bounds=(190, 240) in our older notebooks and other examples online, rather than lat_bounds=(-5, 5) and lon_bounds=(170, 120).

Do you know why there might be a discrepancy with those region values and the ones defined by NCAR?


lee1043 commented Jun 3, 2022

@tomvothecoder thanks for your reply.

> I noticed that sub-setting for Niño 3.4 was done with lat_bounds=(-5, 5) and lon_bounds=(190, 240) in our older notebooks and other examples online, rather than lat_bounds=(-5, 5) and lon_bounds=(170, 120).
>
> Do you know why there might be a discrepancy with those region values and the ones defined by NCAR?

170W-120W is actually equivalent to 190-240 in the 0-360 longitude range. 170W indicates 170 degrees westward from 0 (=360), thus 360-170=190. Likewise, 120W = 360-120 = 240.
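The arithmetic above can be checked with a short sketch. This is only an illustration of the conversion, not xcdat code; the helper name `to_0_360` is made up for this example, and western longitudes are assumed to be written as negative degrees east.

```python
def to_0_360(lon_deg: float) -> float:
    """Map a longitude in degrees east (negative = west) onto [0, 360)."""
    return lon_deg % 360


# 170W and 120W written as negative degrees east.
west, east = -170, -120
print(to_0_360(west), to_0_360(east))  # 190 240
```

Values that are already in the 0-360 range pass through unchanged, so the same helper can be applied to either convention.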

In the current notebook, cell [9] is:

ds_nino_avg = ds.spatial.average("tas", lat_bounds=(-25, 25), lon_bounds=(-190, 240))

which needs to be replaced by

ds_nino_avg = ds.spatial.average("tas", lat_bounds=(-5, 5), lon_bounds=(190, 240))

In the above:

  • lat_bounds is changed from (-25, 25) to (-5, 5)
  • -190 in lon_bounds is changed to 190.


pochedls commented Jun 3, 2022

This looks great - thanks Tom!

general-utilities

  • At cell 6: Since the data is already oriented from 0 to 360° E, I think we should update this to to=(-180, 180)
    • As a side note, there is some weird behavior here: the longitude axis goes from size 360 to 361 (and one set of lon_bnds goes from 0 to 0). I'm not sure if this is specific to converting from one longitude coordinate system to the same system (something people wouldn't normally do) or a more generic issue. This doesn't happen when converting to (-180, 180).
  • It could be useful to add a very brief summary under each header
    • Open a dataset: Datasets can be opened and read using open_dataset or open_mfdataset
    • Reorient the longitude axis: Longitude can be represented from 0 to 360 E or as 180 W to 180 E. xcdat allows you to convert between these axis systems.
    • Center the time coordinates: A given point of time often represents some time period (e.g., a monthly average). In this situation, data providers sometimes record the time as the beginning, middle, or end of the period. center_times() places the time coordinate in the center of the time interval (using time bounds to determine the center of the period).
    • Add bounds: Bounds are critical to many xcdat operations. For example, they are used in determining the weights in spatial or temporal averages and in regridding operations. add_bounds() will attempt to produce bounds if they do not exist in the original dataset.
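To make the last two summaries concrete, here is a rough sketch of the ideas behind centering a time coordinate and inferring bounds. It is plain Python, not the xcdat implementation: the function names are hypothetical, and the assumption that bounds sit at the midpoints between neighboring points may not match what add_bounds() actually does in every case.

```python
from datetime import datetime


def center_of_interval(lower: datetime, upper: datetime) -> datetime:
    """Return the midpoint of a time interval given its lower and upper bounds."""
    return lower + (upper - lower) / 2


def midpoint_bounds(points: list[float]) -> list[tuple[float, float]]:
    """Build (lower, upper) bounds whose edges are midpoints between neighbors.

    The two outer edges are extrapolated by half the spacing to the
    nearest neighbor (an assumption made for this sketch).
    """
    if len(points) < 2:
        raise ValueError("need at least two points to infer bounds")
    edges = [points[0] - (points[1] - points[0]) / 2]
    edges += [(a + b) / 2 for a, b in zip(points, points[1:])]
    edges.append(points[-1] + (points[-1] - points[-2]) / 2)
    return list(zip(edges, edges[1:]))


# A January monthly mean stamped at the start of the month moves to mid-month.
print(center_of_interval(datetime(2000, 1, 1), datetime(2000, 2, 1)))
# Latitude points every 10 degrees get 10-degree-wide cells.
print(midpoint_bounds([-10.0, 0.0, 10.0]))
```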

Calculate Geospatial Weighted Averages from Monthly Time Series

  • I'm working on some suggestions for the overview. Do you think it is necessary to try to document the details of the algorithm in the code gallery? I think some details are useful, but we might want to leave out some of the mechanics.
  • Just a comment on the Nino 3.4 conversation. spatial_average should give the same result for longitude domains of [190, 240] and [-170, -120]!
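The equivalence of the two longitude domains can be illustrated without xcdat. The sketch below selects grid cells by their center longitude after normalizing everything to 0-360; `in_lon_domain` is a hypothetical helper, and selecting whole cells by center is a simplification (it ignores partial-cell weighting).

```python
def in_lon_domain(lon: float, bounds: tuple[float, float]) -> bool:
    """Check whether a longitude lies in a (west, east) domain, in either convention."""
    west, east = (b % 360 for b in bounds)
    lon = lon % 360
    if west <= east:
        return west <= lon <= east
    # Domain wraps across the prime meridian (e.g. 350 to 10).
    # A full-circle domain such as (0, 360) would need special handling.
    return lon >= west or lon <= east


centers = [i + 0.5 for i in range(360)]  # 1-degree cell centers, 0.5 ... 359.5
a = [lon for lon in centers if in_lon_domain(lon, (190, 240))]
b = [lon for lon in centers if in_lon_domain(lon, (-170, -120))]
print(a == b, len(a))  # True 50
```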

@tomvothecoder

Thanks @lee1043 and @pochedls, the feedback from both of you is helpful. I'll apply your fixes and suggestions!

> At cell 6: Since the data is already oriented from 0 to 360 E, I think we should update this to to=(-180, 180)
>
>   • As a side note, there is some weird behavior here: the longitude axis goes from size 360 to 361 (and one set of lon_bnds goes from 0 to 0). I'm not sure if this is specific to converting from one longitude coordinate system to the same system (something people wouldn't normally do) or a more generic issue. This doesn't happen when converting to (-180, 180).

Nice catch, and I'll need to dig deeper into this issue. Maybe we need to add a catch for cases where the user attempts to convert their longitude axis to the same system (e.g., pass statement, logger statement).
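One possible shape for such a check, as a purely hypothetical sketch: the helper name and the range test below are assumptions for illustration, not how xcdat handles (or ended up handling) this case.

```python
def needs_reorientation(lons: list[float], to: tuple[float, float]) -> bool:
    """Return True if any longitude lies outside the target range [to[0], to[1]]."""
    lower, upper = to
    return any(lon < lower or lon > upper for lon in lons)


lons_0_360 = [0.5, 90.5, 180.5, 359.5]
print(needs_reorientation(lons_0_360, (0, 360)))     # False
print(needs_reorientation(lons_0_360, (-180, 180)))  # True
```

A caller could log a message and return the dataset unchanged when the check is False, which would sidestep the 360-to-361 size change described above.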

> It could be useful to add a very brief summary under each header

That's a great suggestion, thanks for providing the summaries!

> I'm working on some suggestions for the overview. Do you think it is necessary to try to document the details of the algorithm in the code gallery? I think some details are useful, but we might want to leave out some of the mechanics.

I agree, we can have general details about how the spatial averaging works. We can leave out some of the lower-level details (e.g., how weights are generated for each axis) since the user can go directly to the method docstrings for that information if they are interested. I can see the verbosity of the lower-level info possibly being overwhelming for users.

@pochedls

Some suggested text for Overview in the Calculate Geospatial Weighted Averages from Monthly Time Series Gallery entry:

A common data reduction in geophysical sciences is to produce spatial averages. Spatial averaging functionality in xcdat allows users to quickly produce area-weighted spatial averages for selected regions (or full dataset domains).

In the example below, we demonstrate the opening of a (remote) dataset and spatial averaging over the global, tropical, and Niño 3.4 domains.

Let me know if you think this has too little detail.
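For readers new to the term, "area-weighted" in the suggested overview can be sketched as follows. This is a simplified, assumption-laden illustration in plain Python (latitude bands only, weights taken as the difference in sin(latitude) across each band), not the xcdat implementation; the function names and the temperature values are made up.

```python
import math


def lat_weights(lat_bounds: list[tuple[float, float]]) -> list[float]:
    """Weight each latitude band by sin(upper) - sin(lower), which is
    proportional to the band's surface area on a sphere."""
    return [
        math.sin(math.radians(upper)) - math.sin(math.radians(lower))
        for lower, upper in lat_bounds
    ]


def weighted_mean(values: list[float], weights: list[float]) -> float:
    """Return the weighted average of values."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


# Three made-up zonal-mean temperatures (K) on bands of equal latitude width.
bands = [(-90, -30), (-30, 30), (30, 90)]
temps = [250.0, 300.0, 250.0]
print(weighted_mean(temps, lat_weights(bands)))  # close to 275
print(sum(temps) / len(temps))                   # unweighted mean is lower
```

The tropical band covers half the sphere's surface, so it counts twice as much as either polar band, which is why the weighted result exceeds the simple mean.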


@tomvothecoder tomvothecoder left a comment

Additional doc updates

README.rst Outdated Show resolved Hide resolved
docs/index.rst Outdated Show resolved Hide resolved
@tomvothecoder

> Some suggested text for Overview in the Calculate Geospatial Weighted Averages from Monthly Time Series Gallery entry:
>
> A common data reduction in geophysical sciences is to produce spatial averages. Spatial averaging functionality in xcdat allows users to quickly produce area-weighted spatial averages for selected regions (or full dataset domains).
> In the example below, we demonstrate the opening of a (remote) dataset and spatial averaging over the global, tropical, and Niño 3.4 domains.
>
> Let me know if you think this has too little detail.

Looks good to me. I implemented your changes. I will do a final review and merge.

@tomvothecoder tomvothecoder merged commit 092854a into main Jun 10, 2022
@tomvothecoder tomvothecoder deleted the docs/238-tutorials branch June 10, 2022 17:54
Labels
type: docs Updates to documentation
5 participants