StatsForecast models producing NotImplementedError: tiny datasets in 0.4.0 #238
Comments
Hey. Without an example it's hard to tell. Are you using aggregate? #189 was fixed in 0.4.0, so you may previously have been getting leading zeros that gave your series more samples, which is no longer the case.
Hi @jmoralez, I inspected the train_agg dataframe (produced using the aggregate function) for 0.3.0 vs 0.4.0. I'm inspecting the result of this line:
0.3.0
This allows the script to fit the StatsForecast AutoETS model and execute reconciliation for train_agg.
0.4.0
Should I aim to add back the interpolated 'y'=0 values for the missing 'ds' values to replicate the 0.3.0 behavior for model.fit()? I just want to ensure this is the intended behavior of the aggregate function before I implement a post-hoc fix.
The problem with aggregate was leading zeros, e.g. if one of your series started at 2018-01-01 and another one at 2019-01-01, the aggregate function would add all of 2018 as 0 for the second one. The fact that you have gaps in your series is a different problem, and you should address it first (before running aggregate). You can use the fill_gaps function for that.
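To make the distinction concrete, here is a minimal standard-library sketch of how one might check a monthly series for internal gaps before running aggregate. It is not the fill_gaps implementation; the helper names `month_starts` and `missing_month_starts` and the sample dates are hypothetical, and it assumes every timestamp is a month start (freq='MS').

```python
from datetime import date


def month_starts(start, end):
    """Yield the first day of every month from start to end, inclusive."""
    year, month = start.year, start.month
    while (year, month) <= (end.year, end.month):
        yield date(year, month, 1)
        month += 1
        if month > 12:
            year, month = year + 1, 1


def missing_month_starts(observed):
    """Return month starts absent between a series' own first and last dates.

    Only internal gaps are reported. Months before the series' first
    observation are not counted, since those are the "leading" periods
    discussed above rather than gaps.
    """
    present = set(observed)
    first, last = min(present), max(present)
    return [d for d in month_starts(first, last) if d not in present]


# Hypothetical series that skips 2019-03-01 and 2019-04-01.
series = [date(2019, 1, 1), date(2019, 2, 1), date(2019, 5, 1)]
gaps = missing_month_starts(series)
print(gaps)
```

A series that reports any missing dates here has gaps in the sense described above, and those are the series fill_gaps is meant to repair.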
Thanks! The fill_gaps function resolved this issue, and the full script now runs successfully. However, I had to set fill_gaps(df, freq='MS', start='global'), which reintroduces the leading-zeros problem you described for late-start series. I tried leaving the start param at its default (start='per_serie'), but that still raised NotImplementedError: tiny datasets. Looking at statsforecast/ets.py, where the error is raised, I believe the problem is specific to my dataset: some sub-series in the hierarchy have too few data points once leading zeros are no longer added. Since I am fitting AutoETS(model='AAA') to all series, n = len(y) is less than or equal to npars + 4 for those series, which raises the "tiny datasets" error. Therefore, I believe this issue can be closed, since it stems from the modeling approach rather than a bug in the code. Thanks for your help!
Incidentally, are there any plans to implement a MinTraceSparse(nonnegative=True) method in the future? I can handle negative values post-reconciliation; I'm just curious about the roadmap.
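For anyone hitting the same error, the length check can be sketched as a pre-filter that runs before model.fit. Only the condition n <= npars + 4 comes from the traceback in this issue. The helper names, the sample rows, and the value ASSUMED_NPARS = 6 are hypothetical; the real npars for a given model should be read from statsforecast/ets.py rather than taken from this sketch.

```python
from collections import Counter


def is_tiny(n_obs, npars):
    """Mirror the guard shown in the traceback: n <= npars + 4."""
    return n_obs <= npars + 4


def split_series_by_length(rows, npars):
    """Split series ids into (fittable, too_short) by observation count.

    rows is an iterable of (unique_id, ds, y) tuples.
    """
    counts = Counter(uid for uid, _ds, _y in rows)
    fittable = sorted(uid for uid, n in counts.items() if not is_tiny(n, npars))
    too_short = sorted(uid for uid, n in counts.items() if is_tiny(n, npars))
    return fittable, too_short


# Assumption for illustration only: the true parameter count depends on the
# model specification and on how ets.py counts parameters.
ASSUMED_NPARS = 6

# Hypothetical data: one series with 24 observations, one with only 8.
rows = [("total", m, 1.0) for m in range(24)]
rows += [("late_start", m, 1.0) for m in range(8)]

fittable, too_short = split_series_by_length(rows, ASSUMED_NPARS)
print("fittable:", fittable)
print("too short:", too_short)
```

Series flagged as too short could then be given a simpler model or excluded, instead of padding them with zeros.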
Thanks. Can you please open a new issue requesting the nonnegative sparse MinTrace?
What happened + What you expected to happen
Carryover from: #234
I upgraded my installation to 0.4.0. However, upon running my script (no code changed), I now get the error below when fitting StatsForecast AutoETS models. I also tried StatsForecast HoltWinters and received the same error.
This error was not raised with the same dataset and script on 0.3.0. Any ideas what might have changed the behavior from 0.3.0 to 0.4.0?
NotImplementedError Traceback (most recent call last)
Cell In[8], line 109
98 #valid_agg_reset = valid_agg.reset_index()
100 model = StatsForecast(models=[
102 AutoETS(season_length=12,model='AAA',alias='AutoETS_AAA')
(...)
107 ],
108 freq='MS', n_jobs=1, verbose=True)
--> 109 model.fit(train_agg)
111 p = model.forecast(h=h_months, fitted=True)
112 p_fitted = model.forecast_fitted_values()
File ~/lib/python3.10/site-packages/statsforecast/core.py:880, in StatsForecast.fit(self, df, sort_df, prediction_intervals)
878 self.prepare_fit(df, sort_df)
879 if self.n_jobs == 1:
--> 880 self.fitted = self.ga.fit(models=self.models)
881 else:
882 self.fitted = self._fit_parallel()
File ~/lib/python3.10/site-packages/statsforecast/core.py:77, in GroupedArray.fit(self, models)
75 for i_model, model in enumerate(models):
76 new_model = model.new()
---> 77 fm[i, i_model] = new_model.fit(y=y, X=X)
78 return fm
File ~/lib/python3.10/site-packages/statsforecast/models.py:650, in AutoETS.fit(self, y, X)
628 def fit(
629 self,
630 y: np.ndarray,
631 X: Optional[np.ndarray] = None,
632 ):
633 """Fit the Exponential Smoothing model.
634
635 Fit an Exponential Smoothing model to a time series (numpy array) `y`
(...)
648 Exponential Smoothing fitted model.
649 """
--> 650 self.model_ = ets_f(
651 y, m=self.season_length, model=self.model, damped=self.damped
652 )
653 self.model_["actual_residuals"] = y - self.model_["fitted"]
654 self._store_cs(y=y, X=X)
File ~/lib/python3.10/site-packages/statsforecast/ets.py:1241, in ets_f(y, m, model, damped, alpha, beta, gamma, phi, additive_only, blambda, biasadj, lower, upper, opt_crit, nmse, bounds, ic, restrict, allow_multiplicative_trend, use_initial_values, maxit)
1238 # ses for non-optimized tiny datasets
1239 if n <= npars + 4:
1240 # we need HoltWintersZZ function
-> 1241 raise NotImplementedError("tiny datasets")
1242 # fit model (assuming only one nonseasonal model)
1243 if errortype == "Z":
NotImplementedError: tiny datasets
Versions / Dependencies
dateutil 2.8.2
hierarchicalforecast 0.4.0
matplotlib 3.7.1
numpy 1.23.5
pandas 2.0.2
session_info 1.0.0
statsforecast 1.6.0
IPython 8.14.0
jupyter_client 8.2.0
jupyter_core 5.3.0
notebook 6.5.4
Python 3.10.11 (main, Apr 20 2023, 19:02:41) [GCC 11.2.0]
Linux-4.18.0-372.16.1.0.1.el8_6.x86_64-x86_64-with-glibc2.35
Reproduction script
from statsforecast import StatsForecast
from statsforecast.models import AutoETS
model = StatsForecast(models=[AutoETS(season_length=12, model='AAA', alias='AutoETS_AAA')], freq='MS', n_jobs=1, verbose=True)
model.fit(train_agg)
The call to model.fit generates the NotImplementedError: tiny datasets from statsforecast/ets.py
The same code executes successfully when running version 0.3.0, instead of 0.4.0
Issue Severity
High: It blocks me from completing my task.