
[DOC] Rework getting started guide and single problem forecasting loaders #2248

Merged
35 commits merged into main from ajb/getting_started on Nov 4, 2024

Conversation

TonyBagnall
Contributor

@TonyBagnall TonyBagnall commented Oct 25, 2024

Fixes #2246, part of #1518.

This has expanded to tidying up the single problem data loaders for forecasting, so it is in two related parts.

datasets._single_problem_loaders

There are seven baked-in forecasting datasets and there were eight loaders.

  1. I have removed load_macroeconomic because it was just a wrapper for the statsmodels loader.
  2. For the five univariate series loaders, I have added a return_array boolean argument that defaults to True. When True the data is loaded as an np.ndarray; when False a pd.Series is returned.
  3. There are two multivariate loaders, uschange and longley. These previously adopted a structure like this:
y, X = load_longley(y_series="Consumption")

Firstly, returning y, X is the opposite order to the collections loaders, and secondly there seems to be no need to split the data in the loader; the user can do that themselves. Changed to:

data = load_longley()

which returns a numpy array with axis == 0, i.e. shape (n_channels, n_timepoints), and

data = load_longley(return_array=False)

which returns a DataFrame with axis == 1 and all the column names set as before.
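
As a rough usage sketch of the reworked loaders (the loader names and the return_array behaviour are taken from the description above; exact signatures may differ in the merged code):

```python
import numpy as np
import pandas as pd

from aeon.datasets import load_airline, load_longley

# Univariate loader: np.ndarray by default, pd.Series with return_array=False
y = load_airline()
assert isinstance(y, np.ndarray)
y = load_airline(return_array=False)
assert isinstance(y, pd.Series)

# Multivariate loader: a single 2D np.ndarray, shape (n_channels, n_timepoints)
data = load_longley()
print(data.shape)

# Or keep the pandas structure with the original channel/column names,
# then split y and X yourself if needed
df = load_longley(return_array=False)
print(type(df))
```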

Read me

Split along series/collection estimators, adding an example for each module, including experimental ones. It makes the guide longer, and maybe we don't want that, but it is a good top-level intro IMO; it will link out for further details.

First version done. This highlighted that there is no anomaly detection notebook (see #1960) and that the transformer notebooks need an overhaul, but that is future work. The main goal is to get things ready for the new forecasting base class.

@TonyBagnall TonyBagnall added the documentation Improvements or additions to documentation label Oct 25, 2024
@aeon-actions-bot
Contributor

aeon-actions-bot bot commented Oct 25, 2024

Thank you for contributing to aeon

I did not find any labels to add that did not already exist. If the content of your PR changes, make sure to update the labels accordingly.

The Checks tab will show the status of our automated tests. You can click on individual test runs in the tab or "Details" in the panel below to see more information if there is a failure.

If our pre-commit code quality check fails, any trivial fixes will automatically be pushed to your PR unless it is a draft.

Don't hesitate to ask questions on the aeon Slack channel if you have any.

PR CI actions

These checkboxes will add labels to enable/disable CI functionality for this PR. This may not take effect immediately, and a new commit may be required to run the new configuration.

  • Run pre-commit checks for all files
  • Run mypy typecheck tests
  • Run all pytest tests and configurations
  • Run all notebook example tests
  • Run numba-disabled codecov tests
  • Stop automatic pre-commit fixes (always disabled for drafts)
  • Disable numba cache loading
  • Push an empty commit to re-run CI checks


@TonyBagnall TonyBagnall changed the title [DOC] Rework getting started guide [DOC] Rework getting started guide and single problem forecasting loaders Oct 29, 2024
@TonyBagnall TonyBagnall mentioned this pull request Oct 29, 2024
@aeon-actions-bot aeon-actions-bot bot added the full examples run Run all examples on a PR label Oct 29, 2024
@TonyBagnall TonyBagnall marked this pull request as ready for review October 29, 2024 17:34
@MatthewMiddlehurst
Member

> we are defaulting, but it is not universal, see anomaly detection.

Where in anomaly detection? I am referring just to inputs, I don't really care what format they want the data internally.

> You can't load into a dataframe with n_channels, n_timepoints and keep the column names. I have left the pandas stuff for legacy reasons really. I think I would rather remove these loaders completely than process to have series in rows in a dataframe.

Can't you just transpose the dataframe? It seems very odd if that removes the indices.

@TonyBagnall
Contributor Author

> > we are defaulting, but it is not universal, see anomaly detection.
>
> Where in anomaly detection? I am referring just to inputs, I don't really care what format they want the data internally.
>
> > You can't load into a dataframe with n_channels, n_timepoints and keep the column names. I have left the pandas stuff for legacy reasons really. I think I would rather remove these loaders completely than process to have series in rows in a dataframe.
>
> Can't you just transpose the dataframe? It seems very odd if that removes the indices.

I guess I can; it seems odd to do so and I thought the examples made it clear, but sure. Need to rewrite the tests that use col_names.

@MatthewMiddlehurst
Member

I remember some previous changes where we tried to remove n_timepoints, n_channels as much as we could. Think it was df-list and the other dataframe one for collections. May have been some other changes removing references to it as well.

Seems better to completely follow one format and leave changing that to the axis stuff, if that's how we want it.

@TonyBagnall
Contributor Author

> > we are defaulting, but it is not universal, see anomaly detection.
>
> Where in anomaly detection? I am referring just to inputs, I don't really care what format they want the data internally.
>
> > You can't load into a dataframe with n_channels, n_timepoints and keep the column names. I have left the pandas stuff for legacy reasons really. I think I would rather remove these loaders completely than process to have series in rows in a dataframe.
>
> Can't you just transpose the dataframe? It seems very odd if that removes the indices.

Yes, but you cannot then extract a series with X["channel_name"]. Anyway, I have done it and removed the test for column names.

@MatthewMiddlehurst
Member

You would just use X.loc["channel_name"], I'm pretty sure.

https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.loc.html
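
For illustration only (not code from the PR), with channels as rows the channel names sit on the DataFrame index, so a single channel can still be pulled out; the frame below is made up:

```python
import pandas as pd

# Hypothetical (n_channels, n_timepoints) frame with channel names on the index
X = pd.DataFrame(
    [[320.0, 331.0, 339.0], [83.0, 88.5, 88.2]],
    index=["Consumption", "Income"],
)

consumption = X.loc["Consumption"]  # extract one channel as a pd.Series
same_thing = X.T["Consumption"]     # equivalent: transpose, then column access
```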

@TonyBagnall
Contributor Author

> You would just use X.loc["channel_name"], I'm pretty sure.
>
> https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.loc.html

It was more that I didn't want to rewrite the notebooks that used plot_series, but I've ditched that now.

Member

@hadifawaz1999 hadifawaz1999 left a comment


Just small questions; if you want to keep them for later I don't mind.

docs/getting_started.md — two outdated review comments, resolved

review-notebook-app bot commented Nov 3, 2024

View / edit / reply to this conversation on ReviewNB

hadifawaz1999 commented on 2024-11-03T12:51:53Z
----------------------------------------------------------------

I would think it's better to add a section per task using the load-any-dataset functions, load_classification, load_regression etc. What do you think?


TonyBagnall commented on 2024-11-03T16:46:53Z
----------------------------------------------------------------

Yes, I agree, but maybe not in this PR? I really only wanted to do getting_started.md, then will work through the notebooks module by module, starting with datasets.


Member

@hadifawaz1999 hadifawaz1999 left a comment


lgtm

@TonyBagnall TonyBagnall merged commit 3992bc7 into main Nov 4, 2024
15 checks passed
@TonyBagnall TonyBagnall deleted the ajb/getting_started branch November 4, 2024 17:34
Labels
documentation (Improvements or additions to documentation), full examples run (Run all examples on a PR)

Successfully merging this pull request may close these issues:

[DOC] update getting started guide

5 participants