Updated doc about fixes and added type hints to fix functions #2160

Merged
merged 14 commits into from
Aug 9, 2023
5 changes: 4 additions & 1 deletion doc/conf.py
@@ -77,6 +77,9 @@
'autosummary': True,
}

# Show type hints in function signature AND docstring
autodoc_typehints = 'both'

# Add any paths that contain templates here, relative to this directory.
templates_path = ['_templates']
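
With ``autodoc_typehints = 'both'``, the types come from the annotations, so the docstrings no longer need to repeat them. A minimal sketch of that convention follows; the function ``pick_short_name`` is hypothetical and exists only for illustration, it is not part of ESMValCore.

```python
import inspect
from typing import Optional


def pick_short_name(short_name: Optional[str] = None) -> str:
    """Return the short name to search for.

    Parameters
    ----------
    short_name:
        Variable's short name. If `None`, a placeholder default is used.

    Returns
    -------
    str
        Name to search for.

    """
    # The docstring above carries no parameter types: they live only in
    # the annotations, which is the information autodoc reads.
    return 'default' if short_name is None else short_name


# The annotations remain available through introspection.
signature = inspect.signature(pick_short_name)
rendered = str(signature)
```

Keeping the types in one place (the signature) avoids the drift between annotation and docstring that this pull request cleans up.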

@@ -385,7 +388,7 @@
# The format is a list of tuples containing the path and title.
# epub_pre_files = []

# HTML files shat should be inserted after the pages created by sphinx.
# HTML files that should be inserted after the pages created by sphinx.
# The format is a list of tuples containing the path and title.
# epub_post_files = []

14 changes: 8 additions & 6 deletions doc/develop/fixing_data.rst
@@ -126,12 +126,12 @@ Then we have to create the class for the fix deriving from
Next we must choose the method to use between the ones offered by the
Fix class:

- ``fix_file`` : should be used only to fix errors that prevent data loading.
- ``fix_file``: should be used only to fix errors that prevent data loading.
As a rule of thumb, you should only use it if the execution halts before
reaching the checks.

- ``fix_metadata`` : you want to change something in the cube that is not
the data (e.g variable or coordinate names, data units).
- ``fix_metadata``: you want to change something in the cube that is not
the data (e.g., variable or coordinate names, data units).

- ``fix_data``: you need to fix the data. Beware: coordinates data values are
part of the metadata.
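
As a rough illustration of how the three hooks divide the work, the sketch below uses simplified stand-ins rather than the real ESMValCore and Iris classes. ``FakeCube``, the reduced ``Fix`` base class, and the ``Ps`` fix with its made-up dataset problems are all assumptions for demonstration only.

```python
from dataclasses import dataclass, field
from pathlib import Path


@dataclass
class FakeCube:
    """Hypothetical stand-in for iris.cube.Cube (illustration only)."""

    var_name: str
    units: str
    data: list = field(default_factory=list)


class Fix:
    """Reduced stand-in for the Fix base class: every hook is a no-op."""

    def fix_file(self, filepath, output_dir, add_unique_suffix=False):
        return filepath

    def fix_metadata(self, cubes):
        return cubes

    def fix_data(self, cube):
        return cube


class Ps(Fix):
    """Hypothetical fix: wrong variable name, values stored in hPa."""

    def fix_metadata(self, cubes):
        # Metadata only: rename the variable, no data values are touched.
        for cube in cubes:
            if cube.var_name == 'PS':
                cube.var_name = 'ps'
        return cubes

    def fix_data(self, cube):
        # Data values: convert hPa to Pa so they match the declared units.
        cube.data = [value * 100.0 for value in cube.data]
        return cube


fix = Ps()
# fix_file is not overridden, so the original path comes back unchanged.
path = fix.fix_file(Path('ps.nc'), Path('fixed'))
cubes = fix.fix_metadata([FakeCube('PS', 'Pa', [1000.0, 850.0])])
cube = fix.fix_data(cubes[0])
```

Only the hooks that a dataset actually needs are overridden; the inherited defaults return their input untouched.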
@@ -253,6 +253,7 @@ with the actual data units.
.. code-block:: python

    def fix_metadata(self, cubes):
        cube = self.get_cube_from_list(cubes)
        cube.units = 'real_units'


@@ -267,6 +268,7 @@ For example:
.. code-block:: python

    def fix_metadata(self, cubes):
        cube = self.get_cube_from_list(cubes)
        cube.units = self.vardef.units

To learn more about what is available in these definitions, see:
@@ -289,7 +291,7 @@ missing coordinate you can create a fix for this model:
.. code-block:: python

def fix_metadata(self, cubes):
coord_cube = cubes.extract_strict('COORDINATE_NAME')
coord_cube = cubes.extract_cube('COORDINATE_NAME')
# Usually this will correspond to an auxiliary coordinate
# because the most common error is to forget adding it to the
# coordinates attribute
@@ -302,9 +304,9 @@ missing coordinate you can create a fix for this model:
}

# It may also have bounds as another cube
coord.bounds = cubes.extract_strict('BOUNDS_NAME').data
coord.bounds = cubes.extract_cube('BOUNDS_NAME').data

data_cube = cubes.extract_strict('VAR_NAME')
data_cube = cubes.extract_cube('VAR_NAME')
data_cube.add_aux_coord(coord, DIMENSIONS_INDEX_TUPLE)
return [data_cube]

26 changes: 14 additions & 12 deletions doc/recipe/preprocessor.rst
@@ -136,18 +136,20 @@ ESMValCore deals with those issues by applying specific fixes for those
datasets that require them. Fixes are applied at three different preprocessor
steps:

- fix_file: apply fixes directly to a copy of the file. Copying the files
is costly, so only errors that prevent Iris to load the file are fixed
here. See :func:`esmvalcore.preprocessor.fix_file`

- fix_metadata: metadata fixes are done just before concatenating the cubes
loaded from different files in the final one. Automatic metadata fixes
are also applied at this step. See
:func:`esmvalcore.preprocessor.fix_metadata`

- fix_data: data fixes are applied before starting any operation that will
alter the data itself. Automatic data fixes are also applied at this step.
See :func:`esmvalcore.preprocessor.fix_data`
- ``fix_file``: apply fixes directly to a copy of the file.
Copying the files is costly, so only errors that prevent Iris from loading
the file are fixed here.
See :func:`esmvalcore.preprocessor.fix_file`.

- ``fix_metadata``: metadata fixes are done just before concatenating the
cubes loaded from different files into the final one.
Automatic metadata fixes are also applied at this step.
See :func:`esmvalcore.preprocessor.fix_metadata`.

- ``fix_data``: data fixes are applied before starting any operation that
will alter the data itself.
Automatic data fixes are also applied at this step.
See :func:`esmvalcore.preprocessor.fix_data`.

To get an overview of data fixes and how to implement new ones, please go to
:ref:`fixing_data`.
68 changes: 37 additions & 31 deletions esmvalcore/cmor/_fixes/fix.py
@@ -5,7 +5,9 @@
import inspect
import tempfile
from pathlib import Path
from typing import TYPE_CHECKING, Optional
from typing import TYPE_CHECKING, Any, Optional

from iris.cube import Cube, CubeList

from ..table import CMOR_TABLES

@@ -22,17 +24,17 @@ def __init__(
vardef: VariableInfo,
extra_facets: Optional[dict] = None,
session: Optional[Session] = None,
):
) -> None:
"""Initialize fix object.

Parameters
----------
vardef: VariableInfo
vardef:
CMOR table entry.
extra_facets: dict, optional
extra_facets:
Extra facets are mainly used for data outside of the big projects
like CMIP, CORDEX, obs4MIPs. For details, see :ref:`extra_facets`.
session: Session, optional
session:
Current session which includes configuration and directory
information.

@@ -48,7 +50,7 @@ def fix_file(
filepath: Path,
output_dir: Path,
add_unique_suffix: bool = False,
) -> Path:
) -> str | Path:
"""Apply fixes to the files prior to creating the cube.

Should be used only to fix errors that prevent loading or cannot be
@@ -57,24 +59,24 @@ def fix_file(

Parameters
----------
filepath: Path
filepath:
File to fix.
output_dir: Path
output_dir:
Output directory for fixed files.
add_unique_suffix: bool, optional (default: False)
add_unique_suffix:
Adds a unique suffix to `output_dir` for thread safety.

Returns
-------
Path
str or pathlib.Path
Path to the corrected file. It can be different from the original
filepath if a fix has been applied, but if not it should be the
original filepath.

"""
return filepath

def fix_metadata(self, cubes):
def fix_metadata(self, cubes: CubeList) -> CubeList:
"""Apply fixes to the metadata of the cube.

Changes applied here must not require data loading.
@@ -83,7 +85,7 @@ def fix_metadata(self, cubes):

Parameters
----------
cubes: iris.cube.CubeList
cubes:
Cubes to fix.

Returns
@@ -94,26 +96,30 @@ def fix_metadata(self, cubes):
"""
return cubes

def get_cube_from_list(self, cubes, short_name=None):
def get_cube_from_list(
self,
cubes: CubeList,
short_name: Optional[str] = None,
) -> Cube:
"""Get a cube from the list with a given short name.

Parameters
----------
cubes : iris.cube.CubeList
cubes:
List of cubes to search.
short_name : str or None
short_name:
Cube's variable short name. If `None`, `short name` is the class
name.

Raises
------
Exception
If no cube is found.
No cube is found.

Returns
-------
iris.Cube
Variable's cube
iris.cube.Cube
Variable's cube.

"""
if short_name is None:
@@ -123,14 +129,14 @@ def get_cube_from_list(self, cubes, short_name=None):
return cube
raise Exception(f'Cube for variable "{short_name}" not found')

def fix_data(self, cube):
def fix_data(self, cube: Cube) -> Cube:
"""Apply fixes to the data of the cube.

These fixes should be applied before checking the data.

Parameters
----------
cube: iris.cube.Cube
cube:
Cube to fix.

Returns
@@ -141,11 +147,11 @@ def fix_data(self, cube):
"""
return cube

def __eq__(self, other):
def __eq__(self, other: Any) -> bool:
"""Fix equality."""
return isinstance(self, other.__class__)

def __ne__(self, other):
def __ne__(self, other: Any) -> bool:
"""Fix inequality."""
return not self.__eq__(other)

@@ -173,18 +179,18 @@ def get_fixes(

Parameters
----------
project: str
project:
Project of the dataset.
dataset: str
dataset:
Name of the dataset.
mip: str
mip:
Variable's MIP.
short_name: str
short_name:
Variable's short name.
extra_facets: dict, optional
extra_facets:
Extra facets are mainly used for data outside of the big projects
like CMIP, CORDEX, obs4MIPs. For details, see :ref:`extra_facets`.
session: Session, optional
session:
Current session which includes configuration and directory
information.

@@ -248,12 +254,12 @@ def get_fixed_filepath(

Parameters
----------
output_dir: Path
output_dir:
Output directory for fixed files. Will be created if it does not
exist yet.
filepath: str or Path
filepath:
Original path.
add_unique_suffix: bool, optional (default: False)
add_unique_suffix:
Adds a unique suffix to `output_dir` for thread safety.

Returns
23 changes: 12 additions & 11 deletions esmvalcore/cmor/fix.py
@@ -32,7 +32,7 @@ def fix_file(
add_unique_suffix: bool = False,
session: Optional[Session] = None,
**extra_facets,
) -> Path:
) -> str | Path:
"""Fix files before ESMValTool can load them.

These fixes are only for issues that prevent iris from loading the cube or
@@ -42,30 +42,31 @@ def fix_file(

Parameters
----------
file: Path
file:
Path to the original file.
short_name: str
short_name:
Variable's short name.
project: str
project:
Project of the dataset.
dataset: str
dataset:
Name of the dataset.
mip: str
mip:
Variable's MIP.
output_dir: Path
output_dir:
Output directory for fixed files.
add_unique_suffix: bool, optional (default: False)
add_unique_suffix:
Adds a unique suffix to `output_dir` for thread safety.
session: Session, optional
session:
Current session which includes configuration and directory information.
**extra_facets: dict, optional
**extra_facets:
Extra facets are mainly used for data outside of the big projects like
CMIP, CORDEX, obs4MIPs. For details, see :ref:`extra_facets`.

Returns
-------
Path:
str or pathlib.Path
Path to the fixed file.

"""
# Update extra_facets with variable information given as regular arguments
# to this function