From 8107cc8739c2a99ce0cb1c9010287f96da1684ab Mon Sep 17 00:00:00 2001 From: Matthew Middlehurst Date: Wed, 6 Nov 2024 23:36:08 +0200 Subject: [PATCH] [DOC] Update developer documentation (#2297) * write local documentation dev guide * dev docs * dev docs * fixes * fixes * fix forecasting testing (they still fail though) --------- Co-authored-by: aadya940 --- docs/about.md | 69 +++-- docs/contributing.md | 12 +- .../{reporting_bugs.md => issues.md} | 2 +- docs/developer_guide.md | 47 +--- docs/developer_guide/aep.md | 12 +- docs/developer_guide/coding_standards.md | 8 +- .../developer_guide/continuous_integration.md | 63 ----- docs/developer_guide/dependencies.md | 83 +++--- docs/developer_guide/deprecation.md | 51 +++- docs/developer_guide/documentation.md | 202 +++++++++----- docs/developer_guide/release.md | 48 +++- docs/developer_guide/testing.md | 247 +++++++++++++++++- pyproject.toml | 3 + 13 files changed, 569 insertions(+), 278 deletions(-) rename docs/contributing/{reporting_bugs.md => issues.md} (98%) delete mode 100644 docs/developer_guide/continuous_integration.md diff --git a/docs/about.md b/docs/about.md index 6e9c39f07a..09c051957e 100644 --- a/docs/about.md +++ b/docs/about.md @@ -49,12 +49,20 @@ The core developers push forward `aeon`'s development and maintain the package. ```{include} about/core_developers.md ``` +## Affiliation + +`aeon` is an affiliated project of [NumFOCUS](https://numfocus.org/). + +![https://numfocus.org/](images/other_logos/numfocus-logo.png){w=300px} + ## History `aeon` was started in January 2023 as a fork of the `sktime` project by 8 core developers using [v0.16.0](https://github.com/aeon-toolkit/aeon/releases/tag/sktime-v0.16.0) -as a base. In following year, the project grew to include an additional 4 core -developers and was accepted as a NumFOCUS affiliated project in December 2023. +as a base. 
In the following years, the project has grown to include many more core +developers, had a complete revamp of governance structure, and relaunched numerous +modules for time series learning tasks. `aeon` was accepted as a NumFOCUS affiliated +project in December 2023. ## Artwork @@ -70,7 +78,7 @@ The logo was designed by [Reni Rahayu](https://www.instagram.com/kojodesignandco `aeon` is a community-driven project. However, institutional and private grants help to ensure its sustainability. -The project developers would like to thank the following funders. +The project developers would like to thank the following funders: ```{list-table} :widths: 50 50 @@ -78,26 +86,55 @@ The project developers would like to thank the following funders. * - - -* - The [UKRI Engineering and Physical Sciences Research Council (EPSRC)](https://gow.epsrc.ukri.org/NGBOViewGrant.aspx?GrantRef=EP/W030756/1) funds Matthew Middlehurst ({user}`matthewmiddlehurst`) and Tony Bagnall ({user}`TonyBagnall`) since 2022 +* - The [UKRI Engineering and Physical Sciences Research Council (EPSRC)](https://gow.epsrc.ukri.org/NGBOViewGrant.aspx?GrantRef=EP/W030756/1) funds Matthew Middlehurst ({user}`matthewmiddlehurst`) and Tony Bagnall ({user}`TonyBagnall`) between 2022-2025 - ![https://epsrc.ukri.org](images/funder_logos/ukri-epsrc-logo.png) ``` +Short-term funding (<6 months) for internships has been provided by the following +organisations: + +```{list-table} +:header-rows: 1 + +* - Name + - GitHub ID + - Organization + - Year +* - Divya Tiwari + - {user}`itsdivya1309` + - [Google Summer of Code](https://summerofcode.withgoogle.com) + - 2024 +* - Aadya Chinubhai + - {user}`aadya940` + - [Google Summer of Code](https://summerofcode.withgoogle.com) + - 2024 +* - Gabriel Riegner + - {user}`griegner` + - [Google Summer of Code](https://summerofcode.withgoogle.com) + - 2024 +* - Ivan Knyazev + - {user}`IRKnyazev` + - [EPSRC](https://gow.epsrc.ukri.org/NGBOViewGrant.aspx?GrantRef=EP/W030756/1) + - 2024 +* - 
Daniele Carli + - {user}`Moonzyyy` + - [EPSRC](https://gow.epsrc.ukri.org/NGBOViewGrant.aspx?GrantRef=EP/W030756/1) + - 2024 +``` + +Google Summer of Code (GSoC) sponsored internships applied and contributed to `aeon` +projects via the shared NumFOCUS application for the program. + ## Infrastructure We would also like to thank [GitHub Actions](https://github.com/features/actions) and [ReadtheDocs](https://readthedocs.org) for the free compute time on their servers and documentation hosting. -## Affiliation - -`aeon` is an affiliated project of [NumFOCUS](https://numfocus.org/). - -![https://numfocus.org/](images/other_logos/numfocus-logo.png){w=300px} - ## Pre-fork Acknowledgements -
sktime v0.16.0 core developers +
`sktime` v0.16.0 core developers

The following listed contributors were part of the `sktime` core developer team at some @@ -127,7 +164,7 @@ point prior to the split of the project.

-
sktime v0.16.0 funders +
`sktime` v0.16.0 funders

As a fork of the `sktime` project, `aeon` has benefited from funding given to `sktime` @@ -143,13 +180,13 @@ prior to the projects split. We would like to thank the funders from before the - ![https://turing.ac.uk/](images/funder_logos/ati-logo.png) * - Markus Löning’s ({user}`mloning`) contributions between 2019 and 2021 were supported by the [UKRI Economic and Social Research Council (ESRC)](https://esrc.ukri.org), the [Consumer Data Research Centre (CDRC)](https://www.cdrc.ac.uk), the Enrichment Scheme at the [The Alan Turing Institute](https://turing.ac.uk), and the JROST Rapid Response Fund, a community effort of [Invest in Open Infrastructure](https://investinopen.org). - ![https://esrc.ukri.org](images/funder_logos/ukri-esrc-logo.png) ![https://www.cdrc.ac.uk](images/funder_logos/cdrc-logo.png) ![https://turing.ac.uk/](images/funder_logos/ati-logo.png) -* - Mercedes-Benz AG/Daimler AG donated 2500 EUR to support the maintenance and development of sktime in 2021, as part of their [FOSS program](https://opensource.mercedes-benz.com). +* - Mercedes-Benz AG/Daimler AG donated 2500 EUR to support the maintenance and development of `sktime` in 2021, as part of their [FOSS program](https://opensource.mercedes-benz.com). - ![https://opensource.mercedes-benz.com](images/funder_logos/mercedes-benz-logo.png) ``` __Sprints__ -The 2019 joint sktime/MLJ development sprint was kindly hosted by +The 2019 joint `sktime`/`MLJ` development sprint was hosted by [UCL](https://www.ucl.ac.uk) and [The Alan Turing Institute](https://turing.ac.uk). Some participants could attend thanks to the initial funding of the [The Alan Turing Institute](https://turing.ac.uk). @@ -158,9 +195,9 @@ __Internships__ [Google Summer of Code (GSoC)](https://summerofcode.withgoogle.com), [Major League Hacking](https://mlh.io) and [Outreachy](https://www.outreachy.org) -have all sponsored sktime internships. +have all sponsored `sktime` internships. 
-The [Wellcome Trust](https://wellcome.org) sponsored one sktime internship as part of +The [Wellcome Trust](https://wellcome.org) sponsored one `sktime` internship as part of Outreachy. ```{list-table} diff --git a/docs/contributing.md b/docs/contributing.md index a7aba434f0..009714f0f8 100644 --- a/docs/contributing.md +++ b/docs/contributing.md @@ -40,7 +40,9 @@ address. If you are unsure about any feedback, please ask for clarification. making a contribution! Make sure you are included in the [list of contributors](contributors.md). Further guidance for contributing to `aeon` via GitHub can be found on the -[developer guide](developer_guide.md). +[developer guide](developer_guide.md). It is not necessary to read everything here prior to +contributing, but if your issue is related to a specific topic, e.g. documentation or +testing, you may find it useful. If your intended method of contribution does not fit into the above steps, please reach out to us on [Slack](https://join.slack.com/t/aeon-toolkit/shared_invite/zt-22vwvut29-HDpCu~7VBUozyfL_8j3dLA) @@ -89,11 +91,11 @@ Developer Guide :::{grid-item-card} :text-align: center -Reporting Bugs +Opening Issues ^^^ -Guidance for reporting bugs in `aeon`. +Guidance for issues and reporting bugs in `aeon`. +++ @@ -102,7 +104,7 @@ Guidance for reporting bugs in `aeon`. 
:click-parent: :expand: -Reporting Bugs +Opening Issues ``` ::: @@ -132,5 +134,5 @@ Mentoring and Projects ```{toctree} :hidden: -contributing/reporting_bugs.md +contributing/issues.md ``` diff --git a/docs/contributing/reporting_bugs.md b/docs/contributing/issues.md similarity index 98% rename from docs/contributing/reporting_bugs.md rename to docs/contributing/issues.md index d57aee6bcd..c8717b4eb4 100644 --- a/docs/contributing/reporting_bugs.md +++ b/docs/contributing/issues.md @@ -1,4 +1,4 @@ -# Reporting Bugs and Opening Issues +# Opening Issues and Reporting Bugs We use [GitHub issues](https://github.com/aeon-toolkit/aeon/issues) to track all bugs and feature requests; feel free to open an issue if you have found a bug or wish to see diff --git a/docs/developer_guide.md b/docs/developer_guide.md index cd6cfda9ab..54bd15659f 100644 --- a/docs/developer_guide.md +++ b/docs/developer_guide.md @@ -1,8 +1,8 @@ # Developer Guide Welcome to the `aeon` developer guide. This guide is intended for new developers and -current developers who want to learn about specific topics for code and non-code -developments. +current developers who want to learn about specific topics for both code and non-code +project development. For a step-by-step guide for setting up a development version of `aeon` and creating a pull request, see the [contributing guide](contributing.md). At any point @@ -20,24 +20,6 @@ their [developer's guide](https://scikit-learn.org/stable/developers/index.html) :::{grid-item-card} :text-align: center -Adding Estimators - -^^^ - -A guide to creating new `aeon` estimators. - -+++ - -```{button-ref} developer_guide/add_estimators -:color: primary -:click-parent: -:expand: - -Adding Estimators -``` - -::: - :::{grid-item-card} :text-align: center @@ -83,27 +65,6 @@ Coding Standards :::{grid-item-card} :text-align: center -CI/CD - -^^^ - -A description of the `aeon` CI/CD pipeline. 
- -+++ - -```{button-ref} developer_guide/continuous_integration -:color: primary -:click-parent: -:expand: - -CI/CD -``` - -::: - -:::{grid-item-card} -:text-align: center - Dependencies ^^^ @@ -232,14 +193,12 @@ Testing ```{toctree} :hidden: -developer_guide/add_estimators.md developer_guide/aep.md developer_guide/coding_standards.md -developer_guide/continuous_integration.md developer_guide/dependencies.md developer_guide/deprecation.md developer_guide/dev_installation.md developer_guide/documentation.md developer_guide/release.md -developer_guide/testing_framework.md +developer_guide/testing.md ``` diff --git a/docs/developer_guide/aep.md b/docs/developer_guide/aep.md index 82aa98815b..2266cb4288 100644 --- a/docs/developer_guide/aep.md +++ b/docs/developer_guide/aep.md @@ -1,12 +1,12 @@ # `aeon` Enhancement Proposals -## Description - An `aeon` enhancement proposal (AEP) is a software design document providing information to the aeon community. The proposal should provide a rationale and concise technical specification of the proposed design. -We collect and discuss proposals in the `aeon` AEP [repository](https://github.com/aeon-toolkit/aeon-admin/tree/main/aep). +We collect and discuss proposals in the `aeon` AEP [repository](https://github.com/aeon-toolkit/aeon-admin/). +In-progress AEPs can be found in the [pull requests](https://github.com/aeon-toolkit/aeon-admin/pulls) +section. Completed AEPs can be found [here](https://github.com/aeon-toolkit/aeon-admin/tree/main/aep). We intend AEPs to be the primary mechanisms for proposing major changes such as new modules and collecting community input on large or controversial issues. Smaller @@ -24,3 +24,9 @@ consolidated document, including: * a concise problem statement, * a clear description of the proposed solution, * a comparison with alternative solutions. + +The AEP will remain open until the proposal is accepted or rejected. 
After being opened +the proposal then can be implemented (possibly through multiple steps) through pull +requests in the main repository. It is up to the core developers to determine when an +AEP is considered completed, in-progress or rejected. For items such as new modules, +the AEP will be considered complete when the module is no longer experimental. diff --git a/docs/developer_guide/coding_standards.md b/docs/developer_guide/coding_standards.md index 2a0f431eac..e5f38d6abd 100644 --- a/docs/developer_guide/coding_standards.md +++ b/docs/developer_guide/coding_standards.md @@ -37,11 +37,9 @@ Additional configurations for some hooks can be found in the [pyproject.toml](ht ### `aeon` specific code formatting conventions -- Check out our [glossary](glossary.md) for -preferred terminology - Use underscores to separate words in non-class names i.e.`n_cases` rather than -`n_cases`. -- Exceptionally, capital letters `X`, `Y`, `Z`, are permissible as variable names or +`ncases`, `nCases` or similar. +- Exceptionally, capital letters i.e. `X` are permissible as variable names or part of variable names such as `X_train` if referring to data sets. - Use absolute imports for references inside `aeon`. - Don’t use `import *` in the source code. It is considered harmful by the official @@ -59,7 +57,7 @@ clone: includes `pre-commit`: ```{code-block} powershell -pip install -e .[dev] +pip install --editable .[dev] ``` 2. Set up pre-commit: diff --git a/docs/developer_guide/continuous_integration.md b/docs/developer_guide/continuous_integration.md deleted file mode 100644 index 6e9798e050..0000000000 --- a/docs/developer_guide/continuous_integration.md +++ /dev/null @@ -1,63 +0,0 @@ -# Continuous integration - -We use continuous integration services on GitHub to automatically check -if new pull requests do not break anything and meet code quality -standards such as a common [coding standards](developer_guide/coding_standards.md). 
-Before setting up Continuous Integration, be sure that you have set -up your developer environment, and installed a [development version](developer_guide/dev_installation.md) -of aeon. - -## Code quality checks - -We use [pre-commit](https://pre-commit.com) for code quality checks (a process we also refer to as "linting" checks). - -We recommend that you also set this up locally as it will ensure that you never run into code quality errors when you make your first PR! -These checks run automatically before you make a new commit. -To setup, simply navigate to the aeon folder and install our pre-commit configuration: - -```{code-block} powershell -pre-commit install -``` - -`pre-commit` should now automatically run anything you make a commit! Please let us know if you encounter any issues getting this setup. - -For a detailed guide on code quality and linting for developers, see [coding_standards](developer_guide/coding_standards.md). - -## Unit testing - -We use [pytest](https://docs.pytest.org/en/latest/) for unit testing. - -To check if your code passes all tests locally, you need to install the development version of `aeon` and all extra dependencies. - -1. Install the development version of `aeon` with developer dependencies: - -```{code-block} powershell -pip install -e .[dev] -``` - - This installs an editable [development version](https://pip.pypa.io/en/stable/topics/local-project-installs/#editable-installs) - of aeon which will include the changes you make. - - For trouble shooting on different operating systems, please see our detailed - [installation instructions](installation.md). - -2. To run all unit tests, run: - -```{code-block} powershell -pytest ./aeon -``` - -## Test coverage - -We use [coverage](https://coverage.readthedocs.io/), the [pytest-cov](https://github.com/pytest-dev/pytest-cov) plugin, and [codecov](https://codecov.io) for test coverage. 
- -## Infrastructure - -This section gives an overview of the infrastructure and continuous -integration services we use. - -| Platform | Operation | Configuration | -| -------------- | -------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------ | -| GitHub Actions | Build/test/distribute on Linux, MacOS and Windows, run code quality checks | [.github/workflows/](https://github.com/aeon-toolkit/aeon/tree/main/.github/workflows) | -| Read the Docs | Build/deploy documentation | [.readthedocs.yml](https://github.com/aeon-toolkit/aeon/blob/main/.readthedocs.yml) | -| Codecov | Test coverage | [.codecov.yml](https://github.com/aeon-toolkit/aeon/blob/main/.codecov.yml), [.coveragerc](https://github.com/aeon-toolkit/aeon/blob/main/.coveragerc) | diff --git a/docs/developer_guide/dependencies.md b/docs/developer_guide/dependencies.md index 785f48b7fe..53c0f326fd 100644 --- a/docs/developer_guide/dependencies.md +++ b/docs/developer_guide/dependencies.md @@ -1,42 +1,59 @@ # Dependencies -## Types of dependencies - There are three types of dependencies in `aeon`: **core**, **soft**, or **developer**. +- **Core** dependencies are required to install and run `aeon` and are automatically +installed with `aeon` i.e. `scikit-learn` and `numpy` +- **Soft** dependencies are only required to import certain modules, but not necessary +to use most functionalities. A soft dependency is not installed automatically with the +package unless an extra dependency set i.e. `all_extras` is used. +- **Developer** dependencies are required for `aeon` developers, but not for typical +users of `aeon` i.e. `pytest` and `pre-commit`. Documentation dependencies are also +included in this category. 
- * **Core** dependencies are required to install and run `aeon` and are automatically installed with `aeon`, *e.g.* `pandas`; - * **Soft** dependencies are only required to import certain modules, but not necessary to use most functionalities. A soft dependency is not installed automatically with the package. Instead, users need to install it manually if they want to use a module that requires a soft dependency, *e.g.* `pmdarima`; - * **Developer** dependencies are required for `aeon` developers, but not for typical users of `aeon`, *e.g.* `pytest`. - +We are unlikely to add new core dependencies without a strong reason. Soft dependencies +should be the first choice for new dependencies, but ideally the code should be written +in `aeon` itself if possible. -We try to keep the number of core dependencies to a minimum and rely on other packages as soft dependencies when feasible. +All dependencies are managed in the [`pyproject.toml`](https://github.com/aeon-toolkit/aeon/blob/main/pyproject.toml) +file following the [PEP 621](https://www.python.org/dev/peps/pep-0621/) convention. +Core dependencies are listed in the `dependencies` dependency set and +developer dependencies are listed in the `dev` and `docs` dependency sets. ## Adding a soft dependency -Soft dependencies in `aeon` should usually be restricted to estimators. - -When adding a new soft dependency or changing the version of an existing one, the following files need to be updated: - -- [pyproject.toml](https://github.com/aeon-toolkit/aeon/blob/main/pyproject.toml), adding the dependency or version bounds in the `all_extras` dependency set. Following the [PEP 621](https://www.python.org/dev/peps/pep-0621/) convention, all dependencies including build time dependencies and optional dependencies are specified in this file. - -Informative warnings or error messages for missing soft dependencies should be raised, in a situation where a user would need them. 
This is handled through our [`_check_soft_dependencies` utility](https://github.com/aeon-toolkit/aeon/blob/main/aeon/utils/validation/_dependencies.py). - -There are specific conventions to add such warnings in estimators, as below. To add an estimator with a soft dependency, ensure the following: - -- imports of the soft dependency only happen inside the estimator, e.g., in `_fit` or `__init__` methods of the estimator. In `__init__`, imports should happen only after calls to `super(cls).__init__`. -- the `python_dependencies` tag of the estimator is populated with a `str`, or a `list` of `str`, of import dependencies. Exceptions will automatically be raised when constructing the estimator in an environment without the required packages. -- In a case where the package import differs from the package name, i.e., `import package_string` is different from `pip install different-package-string` (usually the case for packages containing a dash in the name), the `_check_soft_dependencies` utility should be used in `__init__`. Both the warning and constructor call should use the `package_import_alias` argument for this. -- If the soft dependencies require specific python versions, the `python_version` tag should also be populated, with a PEP 440 compliant version specification `str` such as `"<3.10"` or `">3.6,~=3.8"`. -- If including docstring examples that use soft dependencies, ensure to skip doctest. To do this add a `# doctest: +SKIP` to the end of each line in the doctest to skip. Check out the arima estimator as an example. If concerned that skipping the test will reduce test coverage, consider exposing the doctest example as a pytest test function instead, see below how to handle soft dependencies in pytest functions. -- Decorate all pytest tests that import soft dependencies with a `@pytest.mark.skipif(...)` conditional on a check to `_check_soft_dependencies` for your new soft dependency. 
Be sure that all soft dependencies which are imported for testing are imported within the test function itself, rather than for the whole module! This decorator will then skip your test unless the system has the required packages installed. Doing this is helpful for any users running `check_estimator` on all estimators, or a full local `pytest` run without the required soft dependency. - -## Adding a core or developer dependency - -Core or developer dependencies can be added only by core developers after discussion and consensus. When adding a new core dependency or changing the version of an existing one, the following files need to be updated: - -- [`pyproject.toml`](https://github.com/aeon-toolkit/aeon/blob/main/pyproject.toml), adding the dependency or version bounds in the `dependencies` dependency set. - -When adding a new developer dependency or changing the version of an existing one, the following files need to be updated: - -- [`pyproject.toml`](https://github.com/aeon-toolkit/aeon/blob/main/pyproject.toml), adding the dependency or version bounds in the `dev` dependency set. +Soft dependencies in `aeon` should usually be restricted to the classes, functions +and/or modules which require them. Using any other part of the package should not +require the soft dependency. + +Any new soft dependency needs to be added to the `all_extras` dependency set. +`unstable_extras` should be used instead if the dependency is unstable to install for +whatever reason i.e. it requires extra compilers to be installed or is only available +for specific operating systems. The vast majority of users should be able to install +`all_extras` without any issues. + +Informative warnings or error messages for missing soft dependencies should be raised, +in a situation where a user would need them. This is handled through our +[`_check_soft_dependencies` utility](https://github.com/aeon-toolkit/aeon/blob/main/aeon/utils/validation/_dependencies.py). 
+ +There are specific conventions to add such warnings in estimators. +To add an estimator with a soft dependency, ensure the following: + +- Imports of the soft dependency only happen inside the estimator, e.g., in `_fit` or +`__init__` methods of the estimator. In `__init__`, imports should happen only after +calls to `super(cls).__init__`. +- The `python_dependencies` tag of the estimator is populated with a `str`, or a `list` +of `str` for each dependency. Exceptions will automatically be raised when constructing +the estimator in an environment without the required packages. +- In a case where the package import differs from the package name, i.e., +`import package_string` is different from `pip install different-package-string` +(usually the case for packages containing a dash in the name), the +`_check_soft_dependencies` utility should be used in `__init__`. Both the warning and +constructor call should use the `package_import_alias` argument for this. +- If the soft dependencies require specific python versions, the `python_version` tag +should also be populated, with a PEP 440 compliant version specification `str` such as +`"<3.10"` or `">3.6,~=3.8"`. +- Decorate all pytest tests that import soft dependencies with a +`@pytest.mark.skipif(...)` conditional on a check to `_check_soft_dependencies` for your +new soft dependency. This decorator will then skip your test unless the system has the +required packages installed. diff --git a/docs/developer_guide/deprecation.md b/docs/developer_guide/deprecation.md index 6e6bfca537..a4b294b86a 100644 --- a/docs/developer_guide/deprecation.md +++ b/docs/developer_guide/deprecation.md @@ -1,37 +1,57 @@ # Deprecation Policy -`aeon` [releases](https://github.com/aeon-toolkit/aeon/releases) follow [semantic versioning](https://semver.org). A release number denotes `..` versions. +`aeon` [releases](https://github.com/aeon-toolkit/aeon/releases) follow [semantic versioning](https://semver.org). 
A release number +denotes `..` versions. -Broadly, if a change could unexpectedly cause code using `aeon` to crash when updating to the next version, then it should be deprecated to give the user a chance to prepare. +Broadly, if a change could unexpectedly cause code using `aeon` to crash when updating +to the next version, then it should be deprecated to give the user a chance to prepare. When to deprecate: - Removal or renaming of public classes or functions - Removal or renaming of public class parameters or function arguments - Addition of positional arguments without default values -Deprecation warnings should be included for at least one full minor version cycle before change or removal. If an item is deprecated on the release of v0.6.0, it can be removed in v0.7.0. If an item is deprecated between v0.6.0 and v0.7.0 (i.e. v0.6.1), it can be removed in v0.8.0. +Deprecation warnings should be included for at least one full minor version cycle before +change or removal. If an item is deprecated on the release of `v0.6.0`, it can be +removed in `v0.7.0`. If an item is deprecated between `v0.6.0` and `v0.7.0` +(i.e. `v0.6.1`), it can be removed in `v0.8.0`. -Note that the deprecation policy does not necessarily apply to modules we class as still experimental. Currently experimental modules are: +Note that the deprecation policy does not necessarily apply to modules we class as still +experimental. Currently experimental modules are: - `anomaly_detection` - `benchmarking` +- `forecasting` - `segmentation` - `similarity_search` -- `testing` -- `transformations/series` - `visualisation` -When we introduce a new module, we may classify it as experimental until the API is stable. We will try to not make drastic changes to experimental modules, but we need to retain the freedom to be more agile with the design in these cases. +When we introduce a new module, we may classify it as experimental until the API is +stable. 
We will try to not make drastic changes to experimental modules, but we need +to retain the freedom to be more agile with the design in these cases. ## Deprecation Process -To deprecate functions and classes, write a "TODO" comment stating the version the code should be removed in and raise a warning using the [deprecated package](https://deprecated.readthedocs.io/en/latest/index.html). This raises a `FutureWarning` saying that the functionality has been deprecated. Import from `deprecated.sphinx` so the deprecation message is automatically added to the documentation. +To deprecate functions and classes, write a "TODO" comment stating the version the code +should be removed in and raise a warning using the [deprecated package](https://deprecated.readthedocs.io/en/latest/index.html). This +raises a `FutureWarning` saying that the functionality has been deprecated. Import +from `deprecated.sphinx` so the deprecation message is automatically added to the +documentation. -When renaming items, the functionality should ideally already be available with the new name when the deprecation warning is added. For example, including both the old and new name for a positional argument, or both functions/classes with the old and new names. This is not always possible, but it is good practice to do so. +When renaming items, the functionality should ideally already be available with the new +name when the deprecation warning is added. For example, including both the old and new +name for a positional argument, or both functions/classes with the old and new names. +This is not always possible, but it is good practice to do so. -In most cases not necessary to use the `deprecated` package when renaming or removing function and class keyword arguments. The default value of the argument can be set to `"deprecated"`. If this value is changed, a `FutureWarning` can be raised. This isolates the deprecation warning to the argument, rather than the whole function or class. 
If renaming, the new keyword argument can be added alongside this, with the warning directing users to use the new keyword argument. +In most cases it is not necessary to use the `deprecated` package when renaming or removing +function and class keyword arguments. The default value of the argument can be set to +`"deprecated"`. If this value is changed, a `FutureWarning` can be raised. This +isolates the deprecation warning to the argument, rather than the whole function or +class. If renaming, the new keyword argument can be added alongside this, with the +warning directing users to use the new keyword argument. -If the next version number has not been decided, use the next minor version number for the deprecated package `version` parameter. +If the next version number has not been decided, use the next minor version number +for the deprecated package `version` parameter. ## Examples @@ -53,7 +73,8 @@ def my_function(x, y): return x + y ``` -Deprecate a function to add a new positional argument. In certain cases, you can add a keyword argument with a default value, to ease the transition. +Deprecate a function to add a new positional argument. In certain cases, you can add a +keyword argument with a default value, to ease the transition. ```python from deprecated.sphinx import deprecated @@ -92,7 +113,8 @@ class MyClass: Deprecate a class. -Since this example is deprecated on a patch release, it cannot be removed from the next minor release. +Since this example is deprecated on a patch release, it cannot be removed in the next +minor release. ```python from deprecated.sphinx import deprecated @@ -108,7 +130,8 @@ class MyClass: pass ``` -Deprecate a public class attribute. If we are renaming, we could add the new name and direct users to use that instead while updating both. +Deprecate a public class attribute. If we are renaming, we could add the new name and +direct users to use that instead while updating both. 
```python from deprecated.sphinx import deprecated diff --git a/docs/developer_guide/documentation.md b/docs/developer_guide/documentation.md index 90f8e19e2a..1d22363cb6 100644 --- a/docs/developer_guide/documentation.md +++ b/docs/developer_guide/documentation.md @@ -2,130 +2,200 @@ `aeon`'s documentation standards include: -* Documenting code using NumPy docstrings and `aeon` conventions -* Following ``aeon``'s docstring convention for public code artifacts and modules -* Adding new public functionality to the [api_reference](https://www.aeon-toolkit.org/en/stable/api_reference.html) and [user_guide](https://www.aeon-toolkit.org/en/stable/getting_started.html). +- Documenting code using `numpydoc` docstring conventions +- Adding new public functionality to the [api_reference](https://www.aeon-toolkit.org/en/stable/api_reference.html). -More detailed information on ``aeon``'s documentation format is provided below. +More detailed information on `aeon`'s documentation format is provided below. ## Docstring conventions -`aeon` uses the numpydoc Sphinx extension and follows [NumPy docstring format](https://numpydoc.readthedocs.io/en/latest/format.html). +`aeon` uses the `numpydoc` Sphinx extension and follows [NumPy docstring format](https://numpydoc.readthedocs.io/en/latest/format.html). -To ensure docstrings meet expectations, `aeon` uses a combination of validations built into `numpydoc`, `pydocstyle pre-commit` checks (set to the NumPy convention) and automated testing of docstring examples to ensure the code runs without error. However, the automated docstring validation in pydocstyle only covers basic formatting Passing these tests is necessary to meet the `aeon` docstring conventions, but is not sufficient for doing so. 
+To ensure docstrings meet expectations, `aeon` uses a combination of validations built +into `numpydoc` and `pydocstyle` `pre-commit` checks (set to the NumPy convention) and +automated testing of docstring examples to ensure the code runs without error. -To ensure docstrings meet aeon's conventions, developers are expected to check their docstrings against numpydoc and `aeon` conventions and [reviewer's guide](https://www.aeon-toolkit.org/en/stable/contributing/reviewer_guide.html) are expected to also focus feedback on docstring quality. +Beyond basic NumPy docstring formatting conventions, developers should aim to: -## ``aeon`` specific conventions +- Ensure all parameters (classes, functions, methods) and attributes (classes) are +documented completely and consistently +- Add a `See Also` section that references related `aeon` code as applicable +- Include citations to relevant sources in a `References` section +- Include an `Examples` section that demonstrates at least basic functionality in all +public code +- Write docstrings with `.rst` rendering in mind, as they are rendered into `.rst` +files. For example, two ` characters are required for a code block instead of the +one used in Markdown. -Beyond basic NumPy docstring formatting conventions, developers should focus on: +In many cases, a parameter, attribute, return object, or error may be described in many +docstrings across aeon. To avoid confusion, developers should try to make sure their +docstrings are as consistent as possible with existing docstring descriptions.
-- Ensuring all parameters (classes, functions, methods) and attributes (classes) are documented completely and consistently -- Including links to the relevant topics in the :ref:`glossary` or :ref:`user_guide` in the extended summary -- Including an `Examples` section that demonstrates at least basic functionality in all public code artifacts -- Adding a `See Also` section that references related `aeon` code artifacts as applicable -- Including citations to relevant sources in a `References` section - -In many cases a parameter, attribute return object, or error may be described in many docstrings across aeon. To avoid confusion, developers should make sure their docstrings are as similar as possible to existing docstring descriptions of the the same parameter, attribute, return object or error. - -Accordingly, `aeon` estimators and most other public code artifcations should generally include the following NumPy docstring convention sections: +`aeon` docstrings should generally include `numpydoc` sections in the following order as applicable: 1. Summary 2. Extended Summary 3. Parameters 4. Attributes (classes only) -5. Returns or Yields (as applicable) -6. Raises (as applicable) -7. See Also (as applicable) -8. Notes (as applicable) -9. References (as applicable) +5. Returns/Yields (functions/methods only) +6. Raises (functions/methods only) +7. See Also +8. Notes +9. References 10. Examples -## Summary and extended summary +### Summary and Extended Summary + +The summary should be a single line, followed by an extended summary. The extended +summary should include a user-friendly explanation of the code functionality, +i.e. a short, user-friendly synopsis of the algorithm being implemented or a +high-level summary of the estimator components. + +### Parameters and Attributes + +All parameters and fitted attributes (any public attribute i.e. those set in fit and +ending with `_`) should be listed in the docstring.
Each parameter and attribute +should include a description, type, and default value (if applicable). For example: -The summary should be a single line, followed by a (properly formatted) extended summary. The extended summary should include a user friendly explanation of the code artifacts functionality. +```clean +n_jobs : int, default=1 + The number of jobs to run in parallel for both ``fit`` and ``predict``. + ``-1`` means using all processors. +``` -For all `aeon` estimators and other code artifacts that implement an algorithm, the extended summary should include a short, user-friendly synopsis of the algorithm being implemented. When the algorithm is implemented using multiple `aeon` estimators, the synopsis should first provide a high-level summary of the estimator components (e.g. transformer1 is applied then a classifier). Additional user-friendly details of the algorithm should follow (e.g. describe how the transformation and classifier work). +Parameters without default values or attributes do not need to include a default value: -A developer can link to a particular area of the user guide by including an explicit cross-reference and following the steps for referencing in Sphinx (see the helpful description on [Sphinx cross-references](https://docs.readthedocs.io/en/stable/guides/cross-referencing-with-sphinx.html) posted by Read the Docs). Again developers are encouraged to add important content to the user guide and link to it if it does not already exist. +```clean +n_cases_ : int + Number of train instances in data passed to ``fit``. +``` ### See Also -This section should reference other `aeon` code artifcats related to the code artifact being documented by the docstring. Developers should use judgement in determining related code artifcats. For example, rather than listin all other performance metrics, a percentage error based performance metric -might only list other percentage error based performance metrics. 
Likewise, a distance based classifier might list other distance based classifiers but -not include other types of time series classifiers. +This section should reference other `aeon` code related to the code being documented. +For example, the Catch22 pipeline classifier may reference the Catch22 feature +transformation and the pipeline regressor. + +```clean +ContractableBOSS + Variant of the BOSS classifier. +WEASEL + SFA based pipeline extending from BOSS. +SFA + The Symbolic Fourier Approximation feature transformation used in BOSS. +``` ### Notes -The notes section can include several types of information, including: +The notes section can include information which is useful but does not fit into the other +sections or extended summary, at the discretion of developers. Some examples are: -- Mathematical details of a code object or other important implementation details (using `..math` or `:math:` functionality) -- Links to alternative implementations of the code artifact that are external to `aeon` (e.g. the Java implementation of an `aeon` time series classifier) -- state changing methods (`aeon` estimator classes) +- Links to alternative implementations of the code that are external to `aeon` +- Links to code used or taken inspiration from (sometimes this is better in the +extended summary) +- Explanations of quirks or limitations of the code ### References -`aeon` estimators that implement a concrete algorithm should generally include citations to the original research article, textbook or other resource -that describes the algorithm. Other code artifacts can include references as warranted (for example, references to relevant papers are included in -aeon's performance metrics). +`aeon` estimators that implement a published algorithm should generally include +citations to the original article (including arXiv etc.). Other papers relevant to the +code such as evaluations or extensions can also be included.
-This should be done by adding references into the references section of the docstring, and then typically linking to these in other parts of the docstring. +References must be included in the following format: -The references you intend to link to within the docstring should follow a very specific format to ensure they render correctly. See the example below. Note the space between the ".." and opening bracket, the space after the closing bracket, and how all the lines after the first line are aligned immediately with the opening bracket. Additional references should be added in exactly the same way, but the number enclosed in the bracket should be incremented. - -```{code-block} powershell +```clean .. [1] Some research article, link or other type of citation. Long references wrap onto multiple lines, but you need to indent them so they start aligned with opening bracket on first line. ``` -To link to the reference labeled as `[1]`, you use `[1]_`. For multiple contiguous references please follow the format `[1]_, [2]_`. This only works within the same docstring. Sometimes this is not rendered correctly if the "[1]_" link is preceded or followed by certain characters. If you run into this issue, try putting a space before and following the `[1]_` link. +The `.. [*]` must be included at the start of the reference for it to render correctly. +Include whitespace for other lines as shown. The reference label should be incremented +by 1 for each new reference. + +To link to the reference labelled as `[1]`, you use `[1]_` elsewhere in the docstring. +For multiple contiguous references follow the format `[1]_, [2]_`. This only works +within the same docstring. Include whitespace between the reference label and other +text in the docstring. ### Examples -Most code artifacts in `aeon` should include an examples section. At a minimum this should include a single example that illustrates basic functionality.
The examples should use either a built-in `aeon` dataset or other simple data (e.g. randomly generated data, etc) generated using an `aeon` dependency (e.g. NumPy, pandas, etc) and whereever possible only depend on `aeon` or its core dependencies. Examples should also be designed to run quickly where possible. For quick running code artifacts, additional examples can be included to illustrate the affect of different parameter settings. +Most public code in `aeon` should include an examples section. At a minimum, this should +include a single example that illustrates basic functionality. The examples should use +either a built-in `aeon` dataset or other simple data (e.g. randomly generated data) +where possible. Examples should also be designed to run quickly where possible. + +```python +>>> import numpy as np +>>> from aeon.distances import dtw_distance +>>> x = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10]) +>>> y = np.array([11, 12, 13, 14, 15, 16, 17, 18, 19, 20]) +>>> dtw_distance(x, y) # 1D series +768.0 + +>>> x = np.array([[1, 2, 3, 4, 5], [6, 7, 8, 9, 10], [0, 1, 0, 2, 0]]) +>>> y = np.array([[11, 12, 13, 14],[7, 8, 9, 20],[1, 3, 4, 5]] ) +>>> dtw_distance(x, y) # 2D series with 3 channels, unequal length +564.0 +``` -### Examples of Good `aeon` Docstrings +`>>>` is used to indicate a line of code. Lines can be continued with `...`, for +example, if you have a long import you may want to do: -Here are a few examples of `aeon` code artifacts with good documentation. +```python +>>> from aeon.classification.dictionary_based import ( +... BOSSEnsemble +... ) +``` -#### Estimators +## Examples of good `aeon` docstrings -[BOSSEnsemble](https://www.aeon-toolkit.org/en/latest/api_reference/auto_generated/aeon.classification.dictionary_based.BOSSEnsemble.html#aeon.classification.dictionary_based.BOSSEnsemble) +Here are a few examples of `aeon` code with good documentation. 
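As a complement to the linked examples, the following is a minimal sketch of a function docstring laid out in the section order described above. The `scale_series` function is hypothetical and is not part of `aeon`; it only illustrates the layout and shows one way the `Examples` section could be executed with the standard library `doctest` module, under the assumption that this roughly mirrors what automated docstring testing does.

```python
import doctest


def scale_series(x, factor=2.0):
    """Multiply every value in a series by a constant factor.

    Hypothetical helper used only to illustrate docstring layout. It is not
    part of ``aeon``.

    Parameters
    ----------
    x : list of float
        The input series.
    factor : float, default=2.0
        The value each element of ``x`` is multiplied by.

    Returns
    -------
    scaled : list of float
        A new list with the same length as ``x``.

    Examples
    --------
    >>> scale_series([1.0, 2.0, 3.0])
    [2.0, 4.0, 6.0]
    >>> scale_series([1.0, 2.0], factor=0.5)
    [0.5, 1.0]
    """
    return [value * factor for value in x]


# Collect and run the docstring examples for this one function.
# module=False stops doctest from trying to locate the defining module.
finder = doctest.DocTestFinder()
runner = doctest.DocTestRunner(verbose=False)
for test in finder.find(
    scale_series, "scale_series", module=False, globs={"scale_series": scale_series}
):
    runner.run(test)
results = runner.summarize(verbose=False)
```

Here `results.failed` should be zero if the outputs written in the `Examples` section match what the code returns.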
-#### Functions -[dtw_distance](https://www.aeon-toolkit.org/en/stable/api_reference/auto_generated/aeon.distances.dtw_distance.html) +### Estimators -[numpydoc](https://numpydoc.readthedocs.io/en/latest/index.html) +[BOSSEnsemble](https://www.aeon-toolkit.org/en/latest/api_reference/auto_generated/aeon.classification.dictionary_based.BOSSEnsemble.html#aeon.classification.dictionary_based.BOSSEnsemble) -[pydocstyle](http://www.pydocstyle.org/en/stable/) +### Functions -[ContractableBOSS](https://www.aeon-toolkit.org/en/latest/api_reference/auto_generated/aeon.classification.dictionary_based.ContractableBOSS.html#aeon.classification.dictionary_based.ContractableBOSS) +[dtw_distance](https://www.aeon-toolkit.org/en/stable/api_reference/auto_generated/aeon.distances.dtw_distance.html) -[MeanAbsoluteScaledError](https://www.aeon-toolkit.org/en/stable/api_reference/auto_generated/aeon.performance_metrics.forecasting.MeanAbsoluteScaledError.html) +## Documentation build -[sphinx](https://www.sphinx-doc.org/) +We use [sphinx](https://www.sphinx-doc.org/) to build our documentation and +[readthedocs](https://readthedocs.org/projects/aeon-toolkit/) to host it. You can find +our latest documentation [here](https://www.aeon-toolkit.org/en/latest/). -[readthedocs](https://readthedocs.org/projects/aeon-toolkit/) +The source files can be found in [`docs/`](https://github.com/aeon-toolkit/aeon/tree/main/docs/). +The main configuration file for sphinx is [`conf.py`](https://github.com/aeon-toolkit/aeon/blob/main/docs/conf.py) +and the main page is [`index.md`](https://github.com/aeon-toolkit/aeon/blob/main/docs/index.md). +To add new pages, you need to add a new `.md` (or `.rst`, but preferably Markdown) +file and include it in a `toctree` to include it in the sidebar. -## Documentation Build +To build the documentation locally, you need to install a few extra dependencies +listed in [pyproject.toml](https://github.com/aeon-toolkit/aeon/blob/main/pyproject.toml). 
-We use [sphinx](https://www.sphinx-doc.org/) to build our documentation and [readthedocs](https://readthedocs.org/projects/aeon-toolkit/) to host it. You can find our latest documentation [here](https://www.aeon-toolkit.org/en/latest/). +1. To install documentation dependencies from the root directory, run: -The source files can be found in [docs/](https://github.com/aeon-toolkit/aeon/tree/main/docs/). The main configuration file for sphinx is [conf.py](https://github.com/aeon-toolkit/aeon/blob/main/docs/conf.py) and the main page is [index.md](https://github.com/aeon-toolkit/aeon/blob/main/docs/index.md). To add new pages, you need to add a new `.rst` file and include it in the `index.md` file. +```powershell +pip install --editable .[docs] +``` -To build the documentation locally, you need to install a few extra dependencies listed in [pyproject.toml](https://github.com/aeon-toolkit/aeon/blob/main/pyproject.toml). -1. To install extra dependencies from the root directory, run: +2. Swap to the documentation directory: -```{code-block} powershell -pip install .[docs] +```powershell +cd docs ``` -2. To build the website locally, run: +3. To build the website locally, run: -```{code-block} powershell -cd docs +```powershell make html ``` +For Windows, instead use: -You may need to install pandoc to build the documentation locally. +```powershell +make.bat html +``` +This will generate HTML documentation in `docs/_build/html`. Repeat step 3 to +regenerate the files if you make any changes. diff --git a/docs/developer_guide/release.md b/docs/developer_guide/release.md index ac56a41756..13f0e41577 100644 --- a/docs/developer_guide/release.md +++ b/docs/developer_guide/release.md @@ -13,23 +13,25 @@ The release process is as follows, on high-level: 1. **Ensure deprecation actions are carried out.** Deprecation actions for a version should be marked by "version number" annotated - comments in the code. 
E.g., for the release 0.10.0, search for the string 0.10.0 in - the code and carry out described deprecation actions. Collect list of deprecation - actions, as they should go in the release notes. + comments in the code. E.g., for the release `0.10.0`, search for the string `0.10.0` + in the code and carry out described deprecation actions. PRs performing deprecation + actions should start with [DEP] and use the `deprecation` label so that they are put + under the "Deprecations" section in the release notes. -1. **Create a "release" pull request.** +2. **Create a "release" pull request.** Create a branch from main and PR named after the release version. This should make changes to the version numbers (root `__init__.py`, `README.md` and `pyproject.toml`) - and have complete release notes in the changelog webpage. + and have complete release notes in the [changelog](https://www.aeon-toolkit.org/en/latest/changelog.html) + webpage. See the [release notes](#release-notes) section for more details. 3. **Merge the "release" pull request.** This PR should ideally be the final PR made before the release with the exception of any necessary troubleshooting PRs. The PR and release notes should optimally be - reviewed by the core developers, then merged once tests pass. + reviewed by multiple core developers, then merged once tests pass. 4. **Create the GitHub release.** This release should create a new tag following the syntax v[MAJOR].[MINOR].[PATCH], - e.g., the string `v0.10.0` for version 0.10.0. The release name should similarly be + e.g., the string `v0.10.0` for version `0.10.0`. The release name should similarly be `aeon v0.10.0`. The GitHub release notes should contain only "hightlights", "new contributors" and "all contributors" sections, and otherwise link to the release notes in the changelog, following the pattern of current GitHub release notes. The @@ -41,7 +43,11 @@ Creation of the GitHub release trigger the `pypi` release workflow. 5.
**Wait for the ``pypi`` release CI/CD to finish.** If tests fail due to sporadic unrelated failure, restart. If tests fail genuinely, - something went wrong in the above steps, investigate, fix, and repeat. + something went wrong in the above steps, investigate, fix, and repeat. If the bug + is known and sporadic (i.e. failure to read data from an external source), the release + workflow can be restarted. It is not necessary to create a new GitHub release, and + the workflow can be manually run from the GitHub Actions tab if more PRs are + required. 6. **Release workflow completion tasks.** Once the release workflow has passed, check `aeon` version on `pypi`, this should be @@ -59,9 +65,23 @@ Creation of the GitHub release trigger the `pypi` release workflow. ## Release notes -Generally, [release notes](https://www.aeon-toolkit.org/en/latest/changelog.html) should follow the general pattern of previous release notes, with sections: - -- Highlights -- Dependency changes, if any -- Deprecations/removals, if any. -- Auto generated PR and contributions sections. +Generally, [release notes](#changelog) should follow the general pattern of previous +release notes. The initial change notes can be generated by running [.github/utilities/changelog_generator.py](https://github.com/aeon-toolkit/aeon/blob/main/.github/utilities/changelog_generator.py). +Pull requests will be put into sections in the release notes based on the labels, +so ensure that the PRs are labelled correctly. + +After generating the initial release notes, make sure to: +- Add the release version and month/year of the release at the top of the release notes +- Add any important announcements at the top of the release notes if applicable +- Add release highlights section +- Tidy the auto-generated PR list, moving PRs as necessary so they are correctly +categorised. It may be easier to re-label PRs and regenerate the release notes. 
+- Ensure the contributors section is correct + +## Emergency release workflow + +If a release is required urgently, the release testing process can be expedited by +running the "Fast release" workflow. **This workflow should not be used under normal +circumstances**. Any issues with release testing should be addressed in the normal +release workflow if possible. Consult the core developers on Slack before running this +in any circumstance. diff --git a/docs/developer_guide/testing.md b/docs/developer_guide/testing.md index ea1a34e83a..94ccf07f89 100644 --- a/docs/developer_guide/testing.md +++ b/docs/developer_guide/testing.md @@ -1,20 +1,239 @@ # Testing framework -`aeon` uses `pytest` for testing interface compliance of estimators, and correctness of -code. This page gives an overview of the test frames, and introductions on how to add -tests, or how to extend the testing framework. +`aeon` uses `pytest` for testing interface compliance of estimators and correctness of +code. This page gives an overview of the test framework. -## Test module architecture +Unit tests should cover as much code as possible. This includes differing parameters +for functions and estimators, error handling, and edge cases. It is not enough to just +run the code; tests should check that it behaves as expected through output and state +checks afterwards. -`aeon` testing happens on three layers, roughly corresponding to the inheritance layers -of estimators. +## Writing `aeon` tests -* **package level**: testing interface compliance with the `BaseAeonEstimator` -specifications, in `tests/test_all_estimators.py` -* **module level**: testing interface compliance of concrete estimators with their base -class, for instance `classification/tests/test_all_classifiers.py` -* **low level**: testing individual functionality of estimators or other code, in -individual files in `tests` folders. +There are two main ways to test code in `aeon`: through test files and general +estimator testing.
-The `aeon` testing framework is under redesign. If you have questions, please ask -a developer or in Slack. +### Test files + +All files, functions, and classes can have test files in a corresponding +`tests` directory of the package. If `module.py` is the file to be tested, the test +file for it should be placed in `tests/test_module.py` in the same package as the +original file. + +``` +aeon/ + └── package/ + ├── __init__.py + ├── module.py + └── tests/ + ├── __init__.py + └── test_module.py +``` + +All unit tests should be placed in a `tests` directory with a filename starting with +`test_`. All test functions should start with `test_` to be discovered by `pytest` i.e. + +```python +def test_function(): + assert function() == expected_output +``` + +For estimators, testing of base class functionality should be avoided to prevent +duplication with the general testing. Avoid testing: +- Basic runs of `fit` and `predict` methods using simple/testing parameters +- Base class functionality such as the output of `get_params` and `set_params` or +converting input data to the correct format. +- Error handling for basic errors such as wrong input types or output shapes and types + +Do test: +- Specific functionality of the estimator such as parameters not seen in general testing +- Whether internal attributes are set correctly after fitting or output is as expected +- Edge cases and error handling for parameter values and more complex errors + +Test functions which require soft dependencies should be skipped if the dependencies +are not installed. This can be done using the `pytest.mark.skipif` decorator. See the +[dependencies page](#developer_guide/dependencies). + +### General estimator testing + +The [`testing` module](https://github.com/aeon-toolkit/aeon/tree/main/aeon/testing/) +contains generalised testing for any class which extends +[BaseAeonEstimator](base.BaseAeonEstimator).
This will test the estimator against +a set of general checks to ensure it is compliant with the `aeon` API. This includes +checking that inherited methods perform as expected, such as that the estimator can be +fitted on dummy data without issue. We will also perform a variety of checks on the +estimator i.e. for pickling, whether it is non-deterministic and the state of parameters, +inputs and attributes after method calls. + +Estimators which inherit from other base classes i.e. `BaseClassifier` or +`BaseAnomalyDetector` will have additional tests run on them to ensure they +comply with the API of the learning task. The tags of the estimator will also impact +what is tested, such as `X_input_type` affecting the test data used and `cant_pickle` +skipping tests which require pickling the estimator. + +There is no list of all tests that are run on an estimator, but the code for all checks +can be found in the [`estimator_checking` subpackage](https://github.com/aeon-toolkit/aeon/tree/main/aeon/testing/estimator_checking). +The general tests can be run using functions found in the [`testing` API page](#api_reference/testing), +with the main function being `check_estimator`. This function will collect all +applicable tests from the various `_yield_*_checks.py` files and run them on the +estimator. + +Estimators which require soft dependencies will have all but a few tests checking +behaviour surrounding the soft dependency skipped if the dependency is not installed. + +#### Adding new checks to the general testing + +To add a new check to the general testing, you must place it in the correct file. +For example, if you want your check to be run on all estimators, it should be in the +`_yield_estimator_checks.py` file. Tests relating to soft dependency checking are in +`_yield_soft_dependency_checks.py`. Tests for all classification estimators are in +`_yield_classification_checks.py` and so on.
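To make the shape of these files more concrete, here is a minimal, self-contained sketch of how a yield function might bind checks to estimators with `functools.partial`. Every name in it (`DummyEstimator`, `check_fit_sets_flag`, `_yield_dummy_checks`) is hypothetical and chosen for illustration; the real `aeon` check functions and their signatures may differ.

```python
from functools import partial


class DummyEstimator:
    """Hypothetical stand-in for an estimator; not part of aeon."""

    def __init__(self):
        self.is_fitted = False

    def fit(self, X):
        self.is_fitted = True
        return self


def check_fit_sets_flag(estimator, datatype):
    """Hypothetical check: fitting should mark the estimator as fitted."""
    # Build tiny test data in the requested (made-up) datatype.
    data = [[1.0, 2.0, 3.0]] if datatype == "list" else ((1.0, 2.0, 3.0),)
    estimator.fit(data)
    assert estimator.is_fitted


def _yield_dummy_checks(estimator_instances, datatypes):
    """Yield checks with all arguments bound, so each can be called as check()."""
    for i, estimator in enumerate(estimator_instances):
        yield partial(
            check_fit_sets_flag,
            estimator=estimator,
            datatype=datatypes[i][0],  # first available datatype for estimator i
        )


# Collect the checks, then run each one without passing any arguments.
checks = list(_yield_dummy_checks([DummyEstimator()], [["list", "tuple"]]))
for check in checks:
    check()
```

Binding the arguments up front means whatever runs the checks does not need to know which estimator or datatype a given check uses.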
+ +There are multiple types of checks which can be added to the general testing. Any +new check should be one of the following: +- Check for the estimator class +- Check for the estimator class where a datatype is required +- Check for the estimator instances +- Check for the estimator instances where a datatype is required +- Check for the estimator instances where all acceptable datatypes should be tested + +In most cases (including the `aeon` GitHub CI) the general testing will be run on all +parameter sets returned by the `_create_test_instance` method of the estimator, +which can contain multiple objects with different parameter sets. Each estimator +will have a variety of datatypes for testing data available. This includes types such +as 3D numpy arrays and lists, as well as capabilities such as multivariate and unequal +length data. Running a check multiple times for each datatype can be expensive, so +decide whether it is necessary or if any datatype will suffice. + +After writing a new check, it then must be added to the `_yield_*_checks` function +at the top of the file. Place the check in the correct section of the function for +the type of check it is. `yield` the check using `partial`, making sure to include +any parameters for the check so that the output check does not require any input, i.e. + +```python +yield partial( + check_non_state_changing_method, # check function + estimator=estimator, # estimator i from estimator_instances + datatype=datatypes[i][0], # estimator i, test datatype 0 +) +``` +See the [`_yield_classification_checks`](https://github.com/aeon-toolkit/aeon/blob/main/aeon/testing/estimator_checking/_yield_classification_checks.py) +function for an example of this. + +#### Adding a new module to the general testing + +If you have a new module which requires general testing, you must add a new file +to the `estimator_checking` directory. This file should be called `_yield_*_checks.py` +where `*` is the name of the module.
This file should follow the same structure as +the other files in the directory. + +Ensure the base class for the new module is in the [register](https://github.com/aeon-toolkit/aeon/blob/main/aeon/utils/base/_register.py), +and that it is included as a valid option in any relevant [tags](https://github.com/aeon-toolkit/aeon/blob/main/aeon/utils/tags/_tags.py). + +You will have to make sure that valid testing data and labels are available for your +estimator type in [`testing/testing_data.py`](https://github.com/aeon-toolkit/aeon/blob/main/aeon/testing/testing_data.py). +Add to the `FULL_TEST_DATA_DICT` dictionary and edit the functions at the +bottom as necessary to accommodate your module. + +Some testing utilities may also need to be edited depending on the module structure, +i.e. [`_run_estimator_method`](https://github.com/aeon-toolkit/aeon/blob/main/aeon/testing/utils/estimator_checks.py) +may need to be edited to accommodate different method parameter names. + +#### Excluding tests and estimators + +Tests and estimators can be completely excluded from the general testing by adding them +to the `EXCLUDED_ESTIMATORS` and `EXCLUDED_TESTS` lists in the +[`testing/testing_config.py`](https://github.com/aeon-toolkit/aeon/blob/main/aeon/testing/testing_config.py) +file. `EXCLUDED_ESTIMATORS` only requires the estimator class name to skip all tests, +while `EXCLUDED_TESTS` requires the class name and test names in a list. + +These skips are only intended as a temporary measure, the issue causing the skip should +be fixed eventually and the item removed. If the issue cannot be resolved in the +estimators themselves, (i.e. `predict` must update the state for the task, which +is not normally allowed) the testing itself should be updated and the items removed +from the exclusion lists. + +## Testing in `aeon` CI + +`aeon` uses GitHub Actions for continuous integration testing. 
+ +The `aeon` periodic test workflow runs once every day (except for manual runs) and will +run all possible tests. This includes running all test files and general estimator +testing on all estimators. The CI will also check for code formatting and linting, +as well as test coverage. + +The `aeon` PR testing workflow runs on every PR to the main branch. By default, this +will run a constrained set of tests excluding some tests such as those which +are noticeably expensive or prone to failure (i.e. I/O from external sources). +The estimators run will also be split into smaller subsets to spread them over +different Python version and operating system combinations. This is controlled by the +`PR_TESTING` flag in [`testing/testing_config.py`](https://github.com/aeon-toolkit/aeon/blob/main/aeon/testing/testing_config.py). + +A large portion of testing time is spent compiling `numba` functions. By default, +pull request workflows will use a cached set of functions generated from the +periodic test. The cached functions for any changed files will be invalidated. + +There are a number of labels which can be added to a PR to control the testing. These +are: +- `codecov actions` - Run the codecov action to update the test coverage +- `full examples run` - Run the full examples in the documentation +- `full pre-commit` - Run the pre-commit checks on all files +- `full pytest actions` - Run all tests in the CI, disable PR_TESTING +- `no numba cache` - Disable the GitHub `numba` cache for tests +- `run typecheck test` - Run the `mypy` typecheck workflow + +The periodic tests will run all of the above. + +## Running unit tests locally using `pytest` + + +To check if your code passes all tests locally, you need to install the development +version of `aeon` and all extra dependencies. See the [developer installation guide](#developer_guide/dev_installation) +for more information. 
+ +To run all unit tests, run: + +```{code-block} powershell +pytest aeon/ +``` + +All regular `pytest` configuration is applicable here. See their [documentation](https://docs.pytest.org/en/stable/index.html) +for more information. + +The `-k` option can be used to run tests with a specific keyword i.e. to run all tests +containing `DummyClassifier`: + +```{code-block} powershell +pytest aeon/ -k DummyClassifier +``` + +All general tests will contain the estimator name in the test name, so this can be +used to run tests for a specific estimator. This will also work for test/check names, +or a combination to run a specific check on a specific estimator. + +The `pytest-xdist` dependency allows for parallel testing. To run tests in parallel +on all available cores, run: + +```{code-block} powershell +pytest aeon/ -n auto +``` + +Alternatively, input a number to run on that many cores i.e. `-n 4` to run on 4 cores. + +`aeon` also has some custom configuration options in its [`conftest.py` file](https://github.com/aeon-toolkit/aeon/blob/main/conftest.py). +These are: +- `--nonumba` - Disable `numba` compilation if true +- `--enablethreading` - Skip setting various threading options to 1 prior to tests if true +- `--prtesting` - Set the PR_TESTING flag + +## Tracking test coverage + +We use [coverage](https://coverage.readthedocs.io/), the [pytest-cov](https://github.com/pytest-dev/pytest-cov) +plugin, and [codecov](https://codecov.io) for test coverage. Test coverage can be found +on the [`aeon` codecov page](https://app.codecov.io/gh/aeon-toolkit/aeon). + +Workflows which generate coverage reports will have `numba` `njit` functions disabled. +This is mainly because the coverage of these functions cannot be accurately measured. +`numba` functions are also prone to accidental errors such as out-of-bounds array +access, which will not raise an error. As such, we use these workflows as an additional +check for bugs in the codebase.
diff --git a/pyproject.toml b/pyproject.toml index eeffc7a4f6..7a843440bf 100644 --- a/pyproject.toml +++ b/pyproject.toml @@ -56,6 +56,7 @@ dependencies = [ ] [project.optional-dependencies] +# soft dependencies all_extras = [ "esig>=0.9.7; platform_system != 'Darwin' and python_version < '3.11'", "imbalanced-learn", @@ -83,6 +84,8 @@ unstable_extras = [ "mrsqm>=0.0.7,<0.1.0; platform_system != 'Windows' and python_version < '3.12'", # requires gcc and fftw to be installed for Windows and some other OS (see http://www.fftw.org/index.html) "mrseql>=0.0.4,<0.1.0; platform_system != 'Windows' and python_version < '3.12'", # requires gcc and fftw to be installed for Windows and some other OS (see http://www.fftw.org/index.html) ] + +# development dependencies dev = [ "backoff", "httpx",