Tourism

We are interested in producing hierarchical reconciled forecasts for the Tourism dataset.

This example is based on chapter 11 of:

Hyndman, R.J., & Athanasopoulos, G. (2021) Forecasting: principles and practice, 3rd edition, OTexts: Melbourne, Australia. OTexts.com/fpp3. Accessed on 05/07/2022.

The data was taken from R’s tsibble package:

Wang, E, D Cook, and RJ Hyndman (2020). A new tidy data structure to support exploration and modeling of temporal data, Journal of Computational and Graphical Statistics, 29:3, 466-478, doi:10.1080/10618600.2019.1695624.

https://tsibble.tidyverts.org/

Download the tourism.csv file at: https://github.com/elephaint/hierts/tree/main/examples/data

The code for this example can be found at:

Jupyter Notebook: https://github.com/elephaint/hierts/tree/main/docs/notebooks
Python script: https://github.com/elephaint/hierts/tree/main/examples

Import packages & read data

[4]:

#%% Read packages
import pandas as pd
import numpy as np
from hierts.reconciliation import calc_summing_matrix, apply_reconciliation_methods, \
                                  aggregate_bottom_up_forecasts, calc_level_method_rmse

[5]:

#%% Read data
df = pd.read_csv('../../examples/data/tourism.csv', index_col=0)
# We need to convert the Quarter column to a PeriodIndex in order to guarantee correct order
df['Quarter'] = pd.PeriodIndex(df['Quarter'].str[0:4] + '-' + df['Quarter'].str[5:], freq='q')

[6]:

# Let's look at the data
df.head(10)

[6]:

	Quarter	Region	State	Purpose	Trips
1	1998Q1	Adelaide	SA	Business	135.077690
2	1998Q2	Adelaide	SA	Business	109.987316
3	1998Q3	Adelaide	SA	Business	166.034687
4	1998Q4	Adelaide	SA	Business	127.160464
5	1999Q1	Adelaide	SA	Business	137.448533
6	1999Q2	Adelaide	SA	Business	199.912586
7	1999Q3	Adelaide	SA	Business	169.355090
8	1999Q4	Adelaide	SA	Business	134.357937
9	2000Q1	Adelaide	SA	Business	154.034398
10	2000Q2	Adelaide	SA	Business	168.776364

As you can see, we have a column indicating the time index (‘Quarter’), three columns that provide information on the hierarchy/aggregation of the data (‘State’, ‘Region’, ‘Purpose’), and a target column (‘Trips’).

Set aggregations and calculate summing matrix

For this dataset, the aggregations are in the columns: [‘State’, ‘Region’, ‘Purpose’]. These columns will be in our aggregations.

[7]:

aggregation_cols = ['State', 'Region', 'Purpose']

Next, we define the aggregations that we are interested in. In this case, we are interested in the following aggregations:

[8]:

aggregations = [['State'],
                ['State', 'Region'],
                ['State', 'Purpose'],
                ['Purpose']]

Don’t include the top (total) level and bottom-level: these will be added automatically later on. Now, we can create a summing matrix, which shows how each of the bottom-level series maps to our chosen aggregations:

[9]:

df_S = calc_summing_matrix(df, aggregation_cols, aggregations)

Let’s have a look at df_S:

[10]:

df_S

[10]:

	bottom_timeseries	ACT-Canberra-Business	ACT-Canberra-Holiday	ACT-Canberra-Other	ACT-Canberra-Visiting	NSW-Blue Mountains-Business	NSW-Blue Mountains-Holiday	NSW-Blue Mountains-Other	NSW-Blue Mountains-Visiting	NSW-Capital Country-Business	NSW-Capital Country-Holiday	...	WA-Australia's North West-Other	WA-Australia's North West-Visiting	WA-Australia's South West-Business	WA-Australia's South West-Holiday	WA-Australia's South West-Other	WA-Australia's South West-Visiting	WA-Experience Perth-Business	WA-Experience Perth-Holiday	WA-Experience Perth-Other	WA-Experience Perth-Visiting
Aggregation	Value
Total	Total	1.0	1.0	1.0	1.0	1.0	1.0	1.0	1.0	1.0	1.0	...	1.0	1.0	1.0	1.0	1.0	1.0	1.0	1.0	1.0	1.0
State	ACT	1.0	1.0	1.0	1.0	0.0	0.0	0.0	0.0	0.0	0.0	...	0.0	0.0	0.0	0.0	0.0	0.0	0.0	0.0	0.0	0.0
	NSW	0.0	0.0	0.0	0.0	1.0	1.0	1.0	1.0	1.0	1.0	...	0.0	0.0	0.0	0.0	0.0	0.0	0.0	0.0	0.0	0.0
	NT	0.0	0.0	0.0	0.0	0.0	0.0	0.0	0.0	0.0	0.0	...	0.0	0.0	0.0	0.0	0.0	0.0	0.0	0.0	0.0	0.0
	QLD	0.0	0.0	0.0	0.0	0.0	0.0	0.0	0.0	0.0	0.0	...	0.0	0.0	0.0	0.0	0.0	0.0	0.0	0.0	0.0	0.0
...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...
bottom_timeseries	WA-Australia's South West-Visiting	0.0	0.0	0.0	0.0	0.0	0.0	0.0	0.0	0.0	0.0	...	0.0	0.0	0.0	0.0	0.0	1.0	0.0	0.0	0.0	0.0
	WA-Experience Perth-Business	0.0	0.0	0.0	0.0	0.0	0.0	0.0	0.0	0.0	0.0	...	0.0	0.0	0.0	0.0	0.0	0.0	1.0	0.0	0.0	0.0
	WA-Experience Perth-Holiday	0.0	0.0	0.0	0.0	0.0	0.0	0.0	0.0	0.0	0.0	...	0.0	0.0	0.0	0.0	0.0	0.0	0.0	1.0	0.0	0.0
	WA-Experience Perth-Other	0.0	0.0	0.0	0.0	0.0	0.0	0.0	0.0	0.0	0.0	...	0.0	0.0	0.0	0.0	0.0	0.0	0.0	0.0	1.0	0.0
	WA-Experience Perth-Visiting	0.0	0.0	0.0	0.0	0.0	0.0	0.0	0.0	0.0	0.0	...	0.0	0.0	0.0	0.0	0.0	0.0	0.0	0.0	0.0	1.0

425 rows × 304 columns

As you can see, df_S is a mapping between the bottom-level series (columns of df_S) and the aggregations specified. Some observations:

‘Total’ contains the total across all bottom-level series. Hence, this is a row consisting of ones - the total is the sum of all the bottom-level series.
‘Bottom level’ contains the bottom level series. This is an identity matrix, as each bottom-level series in the rows only maps to a single bottom level series in the columns
Everything in between ‘Bottom level’ and ‘Total’ denotes how we can construct the aggregations. For example, the ‘State-ACT’ aggregation is formed by summing the first four bottom-level series ACT-Canberra-Business, ACT-Canberra-Holiday, ACT-Canberra-Other and ACT-Canberra-Visiting.

Create a forecasting model for each time series

Now we can create a forecasting model for each time series in the aggregation matrix df_S

[11]:

# Set target, time_index and split of train and test.
target = 'Trips'
time_index = 'Quarter'
end_train = '2015Q4'
start_test = '2016Q1'

[12]:

# Add bottom_timeseries identifier and create actuals dataframe for all aggregations
df['bottom_timeseries'] = df[aggregation_cols].agg('-'.join, axis=1)
actuals_bottom_timeseries = df.set_index(['bottom_timeseries', time_index])[target]\
                              .unstack(1)\
                              .loc[df_S.columns]
actuals = df_S @ actuals_bottom_timeseries

Let’s look at the actuals:

[13]:

actuals

[13]:

	Quarter	1998Q1	1998Q2	1998Q3	1998Q4	1999Q1	1999Q2	1999Q3	1999Q4	2000Q1	2000Q2	...	2015Q3	2015Q4	2016Q1	2016Q2	2016Q3	2016Q4	2017Q1	2017Q2	2017Q3	2017Q4
Aggregation	Value
Total	Total	23182.197269	20323.380067	19826.640511	20830.129891	22087.353380	21458.373285	19914.192508	20027.925640	22339.294779	19941.063482	...	23485.745655	25140.161222	26660.637689	24285.027757	24191.320131	26347.600973	27496.389021	26113.606708	26506.314708	27593.554214
State	ACT	551.001921	416.025623	436.029011	449.798445	378.572817	558.178142	448.901196	594.825442	599.668540	557.135057	...	688.203188	597.245569	625.141634	592.608499	572.437093	667.214141	634.368747	748.290431	631.759904	720.329370
	NSW	8039.794795	7166.013805	6747.935790	7282.082371	7584.776839	7054.038705	6723.677146	7135.766945	7295.986639	6445.220168	...	6961.250203	7760.019022	8287.872080	7660.579621	7326.304671	7995.502192	8320.703462	8285.026279	8298.257417	8542.490607
	NT	181.448823	313.936152	528.436859	247.702817	184.889592	366.092786	501.431284	248.430300	206.053220	359.761925	...	682.551824	423.202516	352.098496	469.136536	713.820750	340.695927	298.166026	621.441392	597.658397	346.061938
	QLD	4041.370159	3967.904653	4593.893991	4202.829141	4332.490850	4824.480452	5018.034504	4349.835877	4413.226234	4344.159277	...	5866.070451	5532.210644	5042.153750	5446.553651	5939.334645	6106.383853	5451.987128	5638.473535	6533.836932	5813.903667
...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...
bottom_timeseries	WA-Australia's South West-Visiting	116.970746	106.044650	102.641667	155.877388	134.385712	161.531084	107.090992	118.579005	110.625423	120.253570	...	119.421196	223.346491	267.934342	167.308435	150.425772	222.621753	254.044094	219.257038	188.483319	209.846009
	WA-Experience Perth-Business	187.985188	118.678907	129.135987	188.048404	164.311015	122.140258	148.933004	129.196928	100.691002	200.474436	...	217.173099	267.437376	220.232305	211.946738	143.052265	217.759483	154.757531	230.309344	282.529913	270.986641
	WA-Experience Perth-Holiday	294.678011	277.422409	250.539302	291.984508	338.295122	292.867993	274.020511	207.565280	344.740832	324.050128	...	283.733539	300.297583	285.605190	330.361251	223.309698	310.973405	365.897856	258.307021	225.382854	288.758614
	WA-Experience Perth-Other	42.414022	35.587768	48.137064	61.052428	53.626864	71.848886	76.416960	53.235766	86.687717	39.319652	...	75.186176	109.968477	72.411716	98.337989	103.825823	89.863976	79.051248	117.422706	124.913847	87.494916
	WA-Experience Perth-Visiting	226.134737	237.021622	235.038378	291.621175	274.072739	267.297383	190.483508	217.585759	224.025084	254.090528	...	439.624378	406.907362	538.265046	319.926218	365.445717	439.699451	356.867038	302.296119	373.442070	455.316702

425 rows × 80 columns

As can be seen, by taking the dot product of our summing matrix df_S with our bottom-level actuals, we have obtained a matrix where the aggregations are the rows and the timesteps the columns.

We can now proceed to create forecasts.

[14]:

# Create forecasts: a simple ETS model per timeseries
from sktime.forecasting.ets import AutoETS
forecasts = pd.DataFrame(index=actuals.index, columns = actuals.columns, dtype=np.float32)
for index, series in actuals.iterrows():
    # Fit model and predict (we need to clip because otherwise there's a convergence error)
    model = AutoETS(auto=True, n_jobs=1, random_state=0)
    model.fit(np.clip(series.loc[:end_train], 1e-3, 1e16))
    forecast = model.predict(series.index)
    # Store to forecasts/actuals array
    forecasts.loc[index] = forecast.values

c:\Users\ospra\miniconda3\envs\phd\lib\site-packages\statsmodels\base\model.py:606: ConvergenceWarning: Maximum Likelihood optimization failed to converge. Check mle_retvals
  ConvergenceWarning)

Let’s look at the forecasts:

[15]:

forecasts

[15]:

	Quarter	1998Q1	1998Q2	1998Q3	1998Q4	1999Q1	1999Q2	1999Q3	1999Q4	2000Q1	2000Q2	...	2015Q3	2015Q4	2016Q1	2016Q2	2016Q3	2016Q4	2017Q1	2017Q2	2017Q3	2017Q4
Aggregation	Value
Total	Total	21574.027344	22083.279297	21525.980469	20987.857422	20937.910156	21301.900391	21351.449219	20896.318359	20621.328125	21165.349609	...	23663.076172	23606.921875	24092.445312	24092.445312	24092.445312	24092.445312	24092.445312	24092.445312	24092.445312	24092.445312
State	ACT	494.419067	500.439117	501.933624	502.476624	502.227631	498.071075	498.810791	496.339600	498.825104	503.382477	...	545.991577	560.809204	574.234619	586.928467	599.622253	612.316101	625.009949	637.703796	650.397583	663.091431
	NSW	7604.467773	7721.035645	7572.416992	7351.645020	7333.018066	7400.431641	7307.677734	7151.299316	7147.140137	7186.997070	...	7237.438477	7163.483398	7323.218262	7323.218262	7323.218262	7323.218262	7323.218262	7323.218262	7323.218262	7323.218262
	NT	345.485168	345.468750	345.465607	345.483917	345.474121	345.458069	345.460144	345.475739	345.466034	345.452087	...	345.443909	345.477600	345.485382	345.485382	345.485382	345.485382	345.485382	345.485382	345.485382	345.485382
	QLD	4156.833496	4131.653809	4095.944336	4204.534668	4204.162598	4232.147949	4361.320312	4504.533203	4470.797363	4458.242676	...	5214.271973	5356.412598	5394.749512	5394.749512	5394.749512	5394.749512	5394.749512	5394.749512	5394.749512	5394.749512
...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...
bottom_timeseries	WA-Australia's South West-Visiting	117.684669	117.533379	115.098701	112.458824	121.660019	124.356834	132.234741	126.906319	125.141609	122.065361	...	193.466293	177.774765	187.432251	187.432251	187.432251	187.432251	187.432251	187.432251	187.432251	187.432251
	WA-Experience Perth-Business	154.717407	160.191299	153.360840	149.374878	155.738220	157.148788	151.388489	150.984451	147.399536	139.714096	...	184.499893	189.875946	202.637924	202.637924	202.637924	202.637924	202.637924	202.637924	202.637924	202.637924
	WA-Experience Perth-Holiday	283.353638	285.672760	283.983185	277.134186	280.175385	292.077759	292.239594	288.508484	271.932098	286.842621	...	270.762085	273.418518	278.923096	278.923096	278.923096	278.923096	278.923096	278.923096	278.923096	278.923096
	WA-Experience Perth-Other	46.703793	45.286228	42.081352	44.082474	49.690231	50.991100	57.883602	64.007988	60.448284	69.119164	...	90.407684	85.377708	93.503777	93.503777	93.503777	93.503777	93.503777	93.503777	93.503777	93.503777
	WA-Experience Perth-Visiting	231.525131	229.379745	232.421234	233.462860	256.609985	263.560211	265.047607	235.370941	228.292404	226.593994	...	382.790314	405.410400	406.006195	406.006195	406.006195	406.006195	406.006195	406.006195	406.006195	406.006195

425 rows × 80 columns

Observations:

The AutoETS in this case isn’t very powerful: it has mostly created constant forecasts for all timesteps in our test set (see after 2015Q4).
We create forecasts for both the train and test set. This is because we need the residuals (forecast errors) on our training set.

Reconciliation

We can now reconcile the forecasts.

[16]:

# Create forecast test set and apply a set of reconciliation methods.
forecasts_test = forecasts.loc[:, start_test:]
forecasts_methods = apply_reconciliation_methods(forecasts_test, df_S,
                                                actuals.loc[:, :end_train], forecasts.loc[:, :end_train],
                                                methods=['ols', 'wls_struct', 'wls_var', 'mint_shrink'])

Method ols, reconciliation time: 0.0021s
Method wls_struct, reconciliation time: 0.0013s
Method wls_var, reconciliation time: 0.0012s
Method mint_shrink, reconciliation time: 4.1298s

We will compare our reconciled forecasts to two cases: (i) the unreconciled forecasts (‘base’), and (ii) the bottom-up forecasts (‘bottom-up’). The latter is obtained by merely summing our bottom-level forecasts for all the aggregations (i.e. we effectively dispense with the separate forecasts for all the higher-level aggregations).

[19]:

# Create the bottom-up forecasts and add them to the forecasts_methods dataframe
forecasts_bu_bottom_level = forecasts.loc['bottom_timeseries']
forecasts_bu = aggregate_bottom_up_forecasts(forecasts_bu_bottom_level, df_S)
forecasts_bu_test = forecasts_bu.loc[:, start_test:]
forecasts_method = pd.concat({'bottom-up': forecasts_bu_test}, names=['Method'])
forecasts_methods = pd.concat((forecasts_method, forecasts_methods), axis=0)

Finally, we can compute the root mean-squared error on all of our methods.

[20]:

# Calculate error for all levels and methods. We set bottom-up as the base method to compare against
# in the relative rmse
rmse, rel_rmse = calc_level_method_rmse(forecasts_methods, actuals, base='base')

Let’s look at the errors:

[21]:

rmse

[21]:

Method	bottom-up	base	ols	wls_struct	wls_var	mint_shrink
Aggregation
Total	3306.986700	2384.445914	2404.999427	2738.143230	2927.180695	1755.004329
Purpose	1087.956677	1070.998394	1060.990947	1029.587893	1025.144238	791.360905
State	654.361105	567.977400	558.927221	591.119925	614.470862	464.545683
State-Purpose	221.070602	219.723817	218.963422	215.304224	215.310404	183.111974
State-Region	93.707037	94.867123	87.337924	88.549785	87.412677	77.612773
bottom_timeseries	36.514784	36.514784	36.118552	35.815110	35.462472	33.346821
All series	226.133054	190.903411	189.715393	200.013370	207.160978	146.971670

We can see the following:

The base forecast, where we create a separate forecast for each time series in each aggregation, has an overall RMSE of 191.
The base forecast has been reconciled using ‘ols’, ‘wls_struct’, ‘wls_var’ and ‘mint_shrink’. Of these 4 methods, ‘mint_shrink’ achieves superior performance; across all levels the forecast is better than the base forecast.
The bottom-up forecast performs rather poorly. Note that (by definition) base == bottom-up for the Bottom level aggregation.

Finally, let’s look at the relative rmse. This provides an easier insight into which method performs best. We compare against the base forecast.

[22]:

rel_rmse

[22]:

Method	bottom-up	base	ols	wls_struct	wls_var	mint_shrink
Aggregation
Total	1.386899	1.0	1.008620	1.148335	1.227615	0.736022
Purpose	1.015834	1.0	0.990656	0.961335	0.957186	0.738900
State	1.152090	1.0	0.984066	1.040746	1.081858	0.817895
State-Purpose	1.006129	1.0	0.996539	0.979886	0.979914	0.833373
State-Region	0.987771	1.0	0.920634	0.933409	0.921422	0.818121
bottom_timeseries	1.000000	1.0	0.989149	0.980839	0.971181	0.913242
All series	1.184542	1.0	0.993777	1.047720	1.085161	0.769875

Clearly, ‘mint_shrink’ provides the best performance, across all series it performs about 25% better than the base forecast.