Function Reference
hierTS is a lightweight package that exposes a compact set of functions for hierarchical forecast reconciliation. These functions can be imported from hierts.reconciliation. For example:
from hierts.reconciliation import hierarchy_cross_sectional
- hierts.reconciliation.aggregate_bottom_up_forecasts(forecasts: DataFrame, df_S: DataFrame, name_bottom_timeseries: str = 'bottom_timeseries') → DataFrame
Aggregate a set of bottom-level forecasts according to a specified summing matrix df_S
- Parameters:
forecasts (pd.DataFrame) – dataframe containing bottom-level forecasts
df_S (pd.DataFrame) – Dataframe containing the summing matrix for all aggregations in the hierarchy.
name_bottom_timeseries (str) – name for the bottom level time series in the hierarchy, defaults to ‘bottom_timeseries’.
- Returns:
dataframe containing the bottom-up forecasts for all aggregations in the hierarchy described by df_S
- Return type:
pd.DataFrame
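As a rough illustration of what bottom-up aggregation computes, the sketch below multiplies a summing matrix by bottom-level forecasts using plain NumPy. It does not call hierTS and is not its implementation; the hierarchy, the row order, and all variable names are hypothetical.

```python
import numpy as np

# Hypothetical hierarchy over three bottom-level series:
# rows are [total, group_a (= b0 + b1), group_b (= b2), b0, b1, b2].
S = np.array([
    [1.0, 1.0, 1.0],
    [1.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
])

# Bottom-level forecasts of shape [n_bottom_timeseries x n_timesteps].
yhat_bottom = np.array([
    [1.0, 2.0],
    [3.0, 4.0],
    [5.0, 6.0],
])

# Bottom-up aggregation: every series in the hierarchy is a sum of bottom forecasts.
yhat_all = S @ yhat_bottom
print(yhat_all)
```

The result has one row per series in the hierarchy and is coherent by construction, because each aggregate row is exactly the sum of its bottom-level rows.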
- hierts.reconciliation.apply_reconciliation_methods(forecasts: DataFrame, df_S: DataFrame, y_train: DataFrame, yhat_train: DataFrame, methods: Optional[List[str]] = None, positive: bool = False, return_timing: bool = False) → DataFrame
Apply all hierarchical forecasting reconciliation methods to a set of forecasts.
- Parameters:
forecasts (pd.DataFrame) – dataframe containing forecasts for all aggregations
df_S (pd.DataFrame) – Dataframe containing the summing matrix for all aggregations in the hierarchy.
y_train (pd.DataFrame) – dataframe containing the ground truth on the training set for all timeseries.
yhat_train (pd.DataFrame) – dataframe containing the forecasts on the training set for all timeseries.
methods (List[str], optional) – list of reconciliation methods to apply, defaults to None. Choose from: ‘ols’, ‘wls_var’, ‘wls_struct’, ‘mint_cov’, ‘mint_shrink’, ‘erm’, ‘erm_reg’, ‘erm_bu’. None means all methods will be applied.
positive (bool, optional) – whether to enforce that reconciled forecasts are >= zero, defaults to False.
return_timing (bool, optional) – flag to return the execution time of each reconciliation method, defaults to False.
- Returns:
forecasts_methods, dataframe containing forecasts for all reconciliation methods
- Return type:
pd.DataFrame
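To make the idea of applying several methods to the same forecasts concrete, here is a minimal NumPy sketch that reconciles one set of incoherent forecasts with two of the listed methods. It uses the textbook weighted least squares projection with a diagonal weight matrix: all ones for ‘ols’ and the number of bottom series under each node for ‘wls_struct’. It is an assumption that this matches hierTS internally; the helper name reconcile_with_weights and the data are hypothetical.

```python
import numpy as np

def reconcile_with_weights(yhat, S, w):
    """Project forecasts onto the coherent subspace using diagonal weights w."""
    SW = S.T / w                      # S' W^-1 for diagonal W, shape [n_bottom x n_series]
    P = np.linalg.solve(SW @ S, SW)   # (S' W^-1 S)^-1 S' W^-1
    return S @ P @ yhat

# Hypothetical hierarchy: total = b0 + b1.
S = np.array([[1.0, 1.0], [1.0, 0.0], [0.0, 1.0]])
# Incoherent base forecasts: 3 + 5 != 10.
yhat = np.array([[10.0], [3.0], [5.0]])

weights = {
    "ols": np.ones(S.shape[0]),     # identity weights
    "wls_struct": S.sum(axis=1),    # number of bottom series under each node
}
results = {method: reconcile_with_weights(yhat, S, w) for method, w in weights.items()}

for method, ytilde in results.items():
    print(method, ytilde.ravel())
```

Both outputs are coherent (the total equals the sum of the bottom series), but they distribute the original 2-unit discrepancy differently across the levels.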
- hierts.reconciliation.calc_level_method_error(forecasts_methods: DataFrame, actuals: DataFrame, metric: str = 'RMSE') → DataFrame
Calculate the chosen error metric (RMSE or MAE) for each level, for each method for a set of forecasts.
- Parameters:
forecasts_methods (pd.DataFrame) – dataframe containing forecasts for all reconciliation methods
actuals (pd.DataFrame) – Dataframe containing the ground truth for all time series
metric (str) – metric to compute. Options are: [‘RMSE’, ‘MAE’]
- Returns:
Error for all methods, across all levels.
- Return type:
pd.DataFrame
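The sketch below shows one plausible way to compute a per-level error with either metric, pooling all series that belong to the same level. How hierTS groups series into levels is not described here, so the pooling rule, the level labels, and the helper name level_error are assumptions for illustration only.

```python
import numpy as np

def level_error(forecasts, actuals, levels, metric="RMSE"):
    """Return {level: error}, pooling every series and timestep in a level."""
    err = forecasts - actuals
    out = {}
    for level in np.unique(levels):
        e = err[levels == level]
        if metric == "RMSE":
            out[str(level)] = float(np.sqrt(np.mean(e ** 2)))
        elif metric == "MAE":
            out[str(level)] = float(np.mean(np.abs(e)))
        else:
            raise ValueError(f"Unknown metric: {metric}")
    return out

# Hypothetical data: 3 series x 4 timesteps; series 0 is the total, 1 and 2 are bottom level.
levels = np.array(["total", "bottom", "bottom"])
actuals = np.array([[10.0, 12.0, 11.0, 13.0],
                    [4.0, 5.0, 5.0, 6.0],
                    [6.0, 7.0, 6.0, 7.0]])
forecasts = np.array([[11.0, 12.0, 10.0, 13.0],
                      [4.0, 6.0, 5.0, 5.0],
                      [6.0, 7.0, 7.0, 7.0]])

rmse = level_error(forecasts, actuals, levels, metric="RMSE")
mae = level_error(forecasts, actuals, levels, metric="MAE")
print(rmse, mae)
```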
- hierts.reconciliation.calc_level_method_rmse(forecasts_methods: DataFrame, actuals: DataFrame, base: str = 'base') → Tuple[DataFrame, DataFrame]
Calculate RMSE for each level, for each method for a set of forecasts.
- Parameters:
forecasts_methods (pd.DataFrame) – dataframe containing forecasts for all reconciliation methods
actuals (pd.DataFrame) – Dataframe containing the ground truth for all time series
base (str) – name of the method whose RMSE serves as the baseline for the relative RMSE output, defaults to ‘base’.
- Returns:
tuple containing (i) rmse for all methods, across all levels, and (ii) relative rmse for all methods, across all levels.
- Return type:
Tuple[pd.DataFrame, pd.DataFrame]
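A relative RMSE is commonly the ratio of each method's RMSE to the RMSE of a baseline method at the same level. Assuming that is what the second output contains, the calculation can be sketched as follows; the method names, level names, and numbers are made up for illustration.

```python
import numpy as np

methods = ["base", "ols", "wls_struct"]
levels = ["total", "bottom"]

# Hypothetical RMSE values of shape [n_methods x n_levels].
rmse = np.array([
    [2.0, 1.0],   # base
    [1.5, 1.0],   # ols
    [1.0, 0.8],   # wls_struct
])

base = "base"
# Divide every method's RMSE by the baseline RMSE of the same level.
rel_rmse = rmse / rmse[methods.index(base)]
print(rel_rmse)
```

Values below 1 mean the method improved on the baseline at that level; the baseline row is 1 everywhere by definition.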
- hierts.reconciliation.calc_summing_matrix(df: DataFrame, aggregation_cols: List[str], aggregations: Optional[List[List[str]]] = None, sparse: bool = False, name_bottom_timeseries: str = 'bottom_timeseries') → DataFrame
Given a dataframe of timeseries and columns indicating their groupings, this function calculates a cross-sectional hierarchy according to a set of specified aggregations for the time series. This function is deprecated; use ‘hierarchy_cross_sectional’ instead.
- Parameters:
df (pd.DataFrame) – DataFrame containing information about time series and their groupings
aggregation_cols (List[str]) – List containing all the columns that contain categorization of the timeseries.
aggregations (List[List[str]], optional) – List of Lists containing the aggregations required, defaults to None. In case of None, the summing matrix will only contain (i) the summation vector for the total series (i.e. a row vector of ones of length n_bottom_series), and (ii) the summation matrix for the bottom level series (i.e. the identity matrix of size n_bottom_series). Hence, in the case of None, the output df_S will have shape [n_bottom_series + 1, n_bottom_series].
sparse (bool) – Boolean to indicate whether the returned summing matrix should be backed by a SparseArray (True) or a regular Numpy array (False), defaults to False.
name_bottom_timeseries (str) – name for the bottom level time series in the hierarchy, defaults to ‘bottom_timeseries’.
- Returns:
df_S, output dataframe containing the summing matrix of shape [(n_bottom_timeseries + n_aggregate_timeseries) x n_bottom_timeseries]. The number of aggregate time series is the result of applying all the required aggregations.
- Return type:
pd.DataFrame filled with np.float32
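The default case described for aggregations=None (a row of ones stacked on an identity matrix) can be written out directly in NumPy. This is only a sketch of the resulting matrix values, not of the function itself; the real function returns a labelled DataFrame.

```python
import numpy as np

n_bottom_series = 4  # hypothetical number of bottom-level series

S_default = np.vstack([
    np.ones((1, n_bottom_series)),   # total series: sums every bottom series
    np.eye(n_bottom_series),         # bottom level: each series maps to itself
]).astype(np.float32)

print(S_default.shape)
```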
- hierts.reconciliation.hierarchy_cross_sectional(df: DataFrame, aggregations: List[List[str]], sparse: bool = False, name_bottom_timeseries: str = 'bottom_timeseries') → DataFrame
Given a dataframe of timeseries and columns indicating their groupings, this function calculates a cross-sectional hierarchy according to a set of specified aggregations for the time series.
- Parameters:
df (pd.DataFrame) – DataFrame containing information about time series and their groupings
aggregations (List[List[str]]) – List of Lists containing the aggregations required.
sparse (bool) – Boolean to indicate whether the returned summing matrix should be backed by a SparseArray (True) or a regular Numpy array (False), defaults to False.
name_bottom_timeseries (str) – name for the bottom level time series in the hierarchy, defaults to ‘bottom_timeseries’.
- Returns:
df_S, output dataframe containing the summing matrix of shape [(n_bottom_timeseries + n_aggregate_timeseries) x n_bottom_timeseries]. The number of aggregate time series is the result of applying all the required aggregations.
- Return type:
pd.DataFrame filled with np.float32
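To show how a list of aggregations turns into rows of a summing matrix, the following sketch builds one by hand for three bottom-level series. It assumes a total row and an identity block are always included and picks its own row order and labels; hierTS may order and name the rows differently. The column names region and product are hypothetical.

```python
import numpy as np

# Hypothetical bottom-level series, each tagged with the groups it belongs to.
bottom = [
    {"region": "north", "product": "a"},
    {"region": "north", "product": "b"},
    {"region": "south", "product": "a"},
]
aggregations = [["region"], ["product"]]

rows = [np.ones(len(bottom))]   # total row sums all bottom series
labels = ["total"]

for agg in aggregations:
    # One aggregate series per distinct combination of values in these columns.
    keys = sorted({tuple(s[c] for c in agg) for s in bottom})
    for key in keys:
        indicator = [float(tuple(s[c] for c in agg) == key) for s in bottom]
        rows.append(np.array(indicator))
        labels.append("-".join(key))

# Identity block: one row per bottom-level series.
rows.extend(np.eye(len(bottom)))
labels.extend(f"{s['region']}-{s['product']}" for s in bottom)

S = np.vstack(rows).astype(np.float32)
for label, row in zip(labels, S):
    print(f"{label:>8}", row)
```

The matrix has n_bottom_timeseries columns and one row per aggregate or bottom series, matching the shape stated under Returns.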
- hierts.reconciliation.hierarchy_temporal(df: DataFrame, time_column: str, aggregations: List[List[str]], sparse: bool = False) → DataFrame
Given a dataframe of timeseries and a time_column indicating the timestamp of each series, this function calculates a temporal hierarchy according to a set of specified aggregations for the time series.
- Parameters:
df (pd.DataFrame) – DataFrame containing information about time series and their groupings
time_column (str) – String containing the column name that contains the time column of the timeseries
aggregations (List[List[str]]) – List of Lists containing the aggregations required.
sparse (bool) – Boolean to indicate whether the returned summing matrix should be backed by a SparseArray (True) or a regular Numpy array (False), defaults to False.
- Returns:
df_S, output dataframe containing a summing matrix of shape [n_timesteps x (n_timesteps + n_aggregate_timesteps)]. The number of aggregate timesteps is the result of applying all the required temporal aggregations.
- Return type:
pd.DataFrame filled with np.float32
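A temporal hierarchy groups timesteps rather than series. The sketch below builds a small temporal summing matrix for six daily timesteps aggregated into two weeks, laid out in the [n_timesteps x (n_timesteps + n_aggregate_timesteps)] orientation quoted above. The column order (aggregates first, then the original timesteps) and the week labels are assumptions, and the code does not call hierTS.

```python
import numpy as np

# Hypothetical example: six daily timesteps, each labelled with its week.
weeks = np.array(["w1", "w1", "w1", "w2", "w2", "w2"])
n_timesteps = len(weeks)

# One indicator row per week: 1 where the day belongs to that week.
week_labels = np.unique(weeks)
S_weeks = (weeks[None, :] == week_labels[:, None]).astype(np.float32)

# Stack the aggregate rows on top of the identity (one row per original timestep).
S_rows = np.vstack([S_weeks, np.eye(n_timesteps, dtype=np.float32)])

# Transpose to get one row per timestep and one column per (aggregate or original) timestep.
S_temporal = S_rows.T

# Aggregating a single series: weekly sums followed by the original daily values.
y = np.arange(1.0, 7.0)
print(y @ S_temporal)
```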
- hierts.reconciliation.reconcile_forecasts(yhat: ndarray, S: ndarray, y_train: Optional[ndarray] = None, yhat_train: Optional[ndarray] = None, method: str = 'ols', positive: bool = False) → ndarray
Optimal reconciliation of hierarchical forecasts using various approaches.
Based on approaches from:
[‘ols’, ‘wls_var’, ‘wls_struct’, ‘mint_cov’, ‘mint_shrink’] Wickramasuriya, S. L., Athanasopoulos, G., & Hyndman, R. J. (2019). Optimal forecast reconciliation for hierarchical and grouped time series through trace minimization. Journal of the American Statistical Association, 114(526), 804-819.
[‘erm’, ‘erm_reg’, ‘erm_bu’] Ben Taieb, Souhaib, and Bonsoo Koo. ‘Regularized Regression for Hierarchical Forecasting Without Unbiasedness Conditions’. In Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, 1337–47. Anchorage AK USA: ACM, 2019. https://doi.org/10.1145/3292500.3330976.
- Parameters:
yhat (numpy.ndarray) – out-of-sample forecasts for each time series for each timestep of size [n_timeseries x n_timesteps]. These forecasts will be reconciled according to the hierarchy specified by S.
S (numpy.ndarray) – summing matrix detailing the hierarchical tree of size [n_timeseries x n_bottom_timeseries]
y_train (numpy.ndarray, optional) – ground truth for each time series for a set of historical timesteps of size [n_timeseries x n_timesteps_train]. Required when using ‘wls_var’, ‘mint_cov’, ‘mint_shrink’, ‘erm’, ‘erm_reg’, ‘erm_bu’
yhat_train (numpy.ndarray, optional) – forecasts for each time series for a set of historical timesteps of size [n_timeseries x n_timesteps_residuals]. Required when using ‘wls_var’, ‘mint_cov’, ‘mint_shrink’, ‘erm’, ‘erm_reg’, ‘erm_bu’
method (str, optional) – reconciliation method, defaults to ‘ols’. Options are: ‘ols’, ‘wls_var’, ‘wls_struct’, ‘mint_cov’, ‘mint_shrink’, ‘erm’, ‘erm_reg’, ‘erm_bu’
positive (bool, optional) – whether to enforce that reconciled forecasts are >= zero, defaults to False.
- Returns:
ytilde, reconciled forecasts for each time series for each timestep of size [n_timeseries x n_timesteps]
- Return type:
numpy.ndarray
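Several methods require y_train and yhat_train because their weights are estimated from in-sample residuals. As an illustration, the sketch below implements one common definition of variance-scaled weighted least squares (the idea behind ‘wls_var’), where each series is weighted by the inverse of its mean squared training residual. Whether hierTS estimates the variances in exactly this way is an assumption; the function name reconcile_wls_var and the random data are hypothetical.

```python
import numpy as np

def reconcile_wls_var(yhat, S, y_train, yhat_train):
    """Reconcile forecasts with weights from in-sample residual variances."""
    residuals = y_train - yhat_train                # [n_timeseries x n_timesteps_train]
    w_inv = 1.0 / np.mean(residuals ** 2, axis=1)   # diagonal of W^-1, one entry per series
    SW = S.T * w_inv                                # S' W^-1 without forming the diagonal matrix
    P = np.linalg.solve(SW @ S, SW)                 # (S' W^-1 S)^-1 S' W^-1
    return S @ P @ yhat                             # [n_timeseries x n_timesteps]

# Hypothetical hierarchy: total = b0 + b1.
S = np.array([[1.0, 1.0], [1.0, 0.0], [0.0, 1.0]])

rng = np.random.default_rng(0)
y_bottom = rng.uniform(1.0, 10.0, size=(2, 20))
y_train = S @ y_bottom                                         # coherent ground truth
yhat_train = y_train + rng.normal(0.0, 1.0, size=y_train.shape)  # noisy in-sample forecasts

# Incoherent out-of-sample forecasts for two timesteps.
yhat = np.array([[10.0, 12.0], [3.0, 4.0], [5.0, 9.0]])

ytilde = reconcile_wls_var(yhat, S, y_train, yhat_train)
print(ytilde)
```

Series with small training residuals are adjusted less than series with large ones, and the output satisfies the hierarchy exactly. Reconciling an already coherent input leaves it unchanged.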