The `aggregate` function pads series with leading zeros when the time series in a dataset start on different dates. Here's a minimal example:

    import pandas as pd
    import statsforecast.models as sfm
    import hierarchicalforecast.methods as hfm
    from statsforecast.utils import generate_series
    from statsforecast import StatsForecast
    from hierarchicalforecast.utils import aggregate
    from hierarchicalforecast.core import HierarchicalReconciliation

    max_tenure = 24
    dates = pd.date_range(start='2019-01-31', freq='M', periods=max_tenure)
    cohort_tenure = [24, 23, 22, 21]

    ts_list = []
    # Create ts for each cohort
    for i in range(len(cohort_tenure)):
        ts_list.append(
            generate_series(n_series=1, freq='M', min_length=cohort_tenure[i], max_length=cohort_tenure[i]).reset_index() \
                .assign(ult=i) \
                .assign(ds=dates[-cohort_tenure[i]:]) \
                .drop(columns=['unique_id'])
        )
    df = pd.concat(ts_list, ignore_index=True)

    # Create categories
    df.loc[df['ult'] < 2, 'pen'] = 'a'
    df.loc[df['ult'] >= 2, 'pen'] = 'b'
    # Note that unique id requires strings
    df['ult'] = df['ult'].astype(str)

    hier_levels = [
        ['pen'],
        ['pen', 'ult'],
    ]
    hier_df, S_df, tags = aggregate(df=df, spec=hier_levels)
    hier_df = hier_df.reset_index()  # .query("unique_id.str.split('/').str[0] <= ds.dt.strftime('%Y-%m')")

    print('S_df.shape', S_df.shape)
    print('hier_df.shape', hier_df.shape)

If you query the 3rd cohort, you should see dates starting at 2019-03-31:

    df.query("ult == '2'")

But if you query `hier_df`, the output of `aggregate`, the dates for the same cohort start at 2019-01-31, the earliest date in the dataset:

    hier_df.query("unique_id.str.split('/').str[-1] == '2'")

If you remove the leading zeros, `reconcile` fails because `forecast_fitted_values` cannot be reshaped to the length of `S_df`.
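For anyone reproducing this, here is a minimal sketch of what "removing the leading zeros" could look like. The helper name `drop_leading_zeros` and the toy frame are hypothetical, not part of `hierarchicalforecast`, and the sketch assumes that genuine observations are never exactly zero at the start of a series. It only covers the trimming step and does not work around the `reconcile` failure described above.

```python
import pandas as pd


def drop_leading_zeros(frame, id_col="unique_id", time_col="ds", value_col="y"):
    """Drop the rows that precede the first nonzero value of each series.

    Hypothetical helper; assumes real observations are nonzero at the
    start of a series, so a leading run of zeros is padding.
    """
    frame = frame.sort_values([id_col, time_col])
    # Running count of nonzero values within each series:
    # it stays at 0 for as long as we are still inside the leading-zero run.
    nonzero_so_far = (
        frame[value_col].ne(0).astype(int).groupby(frame[id_col]).cumsum()
    )
    return frame[nonzero_so_far > 0].reset_index(drop=True)


# Toy stand-in for hier_df: series "b/2" is padded with two leading zeros,
# series "a/0" has a zero in the middle that must be kept.
month_ends = pd.to_datetime(["2019-01-31", "2019-02-28", "2019-03-31", "2019-04-30"])
toy = pd.DataFrame({
    "unique_id": ["b/2"] * 4 + ["a/0"] * 4,
    "ds": list(month_ends) * 2,
    "y": [0.0, 0.0, 3.0, 4.0, 1.0, 2.0, 0.0, 5.0],
})

trimmed = drop_leading_zeros(toy)
print(trimmed)
```

Only the leading run is dropped; zeros that occur after the first nonzero value are left in place, since those may be real observations.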
AzulGarza