Tourism

We are interested in producing hierarchical reconciled forecasts for the Tourism dataset.

This example is based on chapter 11 of:

Hyndman, R.J., & Athanasopoulos, G. (2021) Forecasting: principles and practice, 3rd edition, OTexts: Melbourne, Australia. OTexts.com/fpp3. Accessed on 05/07/2022.

The data was taken from R’s tsibble package:

Wang, E, D Cook, and RJ Hyndman (2020). A new tidy data structure to support exploration and modeling of temporal data, Journal of Computational and Graphical Statistics, 29:3, 466-478, doi:10.1080/10618600.2019.1695624.

https://tsibble.tidyverts.org/

Download the tourism.csv file at: https://github.com/elephaint/hierts/tree/main/examples/data

The code for this example can be found at:

Import packages & read data

[4]:
#%% Read packages
import pandas as pd
import numpy as np
from hierts.reconciliation import calc_summing_matrix, apply_reconciliation_methods, \
                                  aggregate_bottom_up_forecasts, calc_level_method_rmse
[5]:
#%% Read data
df = pd.read_csv('../../examples/data/tourism.csv', index_col=0)
# We need to convert the Quarter column to a PeriodIndex in order to guarantee correct order
df['Quarter'] = pd.PeriodIndex(df['Quarter'].str[0:4] + '-' + df['Quarter'].str[5:], freq='q')
[6]:
# Let's look at the data
df.head(10)
[6]:
Quarter Region State Purpose Trips
1 1998Q1 Adelaide SA Business 135.077690
2 1998Q2 Adelaide SA Business 109.987316
3 1998Q3 Adelaide SA Business 166.034687
4 1998Q4 Adelaide SA Business 127.160464
5 1999Q1 Adelaide SA Business 137.448533
6 1999Q2 Adelaide SA Business 199.912586
7 1999Q3 Adelaide SA Business 169.355090
8 1999Q4 Adelaide SA Business 134.357937
9 2000Q1 Adelaide SA Business 154.034398
10 2000Q2 Adelaide SA Business 168.776364

As you can see, we have a column indicating the time index (‘Quarter’), three columns that provide information on the hierarchy/aggregation of the data (‘State’, ‘Region’, ‘Purpose’), and a target column (‘Trips’).

Set aggregations and calculate summing matrix

For this dataset, the aggregations are in the columns: [‘State’, ‘Region’, ‘Purpose’]. These columns will be in our aggregations.

[7]:
aggregation_cols = ['State', 'Region', 'Purpose']

Next, we define the aggregations that we are interested in. In this case, we are interested in the following aggregations:

[8]:
aggregations = [['State'],
                ['State', 'Region'],
                ['State', 'Purpose'],
                ['Purpose']]

Don’t include the top (total) level and bottom-level: these will be added automatically later on. Now, we can create a summing matrix, which shows how each of the bottom-level series maps to our chosen aggregations:

[9]:
df_S = calc_summing_matrix(df, aggregation_cols, aggregations)

Let’s have a look at df_S:

[10]:
df_S
[10]:
bottom_timeseries ACT-Canberra-Business ACT-Canberra-Holiday ACT-Canberra-Other ACT-Canberra-Visiting NSW-Blue Mountains-Business NSW-Blue Mountains-Holiday NSW-Blue Mountains-Other NSW-Blue Mountains-Visiting NSW-Capital Country-Business NSW-Capital Country-Holiday ... WA-Australia's North West-Other WA-Australia's North West-Visiting WA-Australia's South West-Business WA-Australia's South West-Holiday WA-Australia's South West-Other WA-Australia's South West-Visiting WA-Experience Perth-Business WA-Experience Perth-Holiday WA-Experience Perth-Other WA-Experience Perth-Visiting
Aggregation Value
Total Total 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 ... 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0
State ACT 1.0 1.0 1.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 ... 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
NSW 0.0 0.0 0.0 0.0 1.0 1.0 1.0 1.0 1.0 1.0 ... 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
NT 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 ... 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
QLD 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 ... 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
bottom_timeseries WA-Australia's South West-Visiting 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 ... 0.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0
WA-Experience Perth-Business 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 ... 0.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0
WA-Experience Perth-Holiday 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 ... 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0
WA-Experience Perth-Other 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 ... 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0
WA-Experience Perth-Visiting 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 ... 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0

425 rows × 304 columns

As you can see, df_S is a mapping between the bottom-level series (columns of df_S) and the aggregations specified. Some observations:

  • ‘Total’ contains the total across all bottom-level series. Hence, this is a row consisting of ones - the total is the sum of all the bottom-level series.

  • ‘Bottom level’ contains the bottom level series. This is an identity matrix, as each bottom-level series in the rows only maps to a single bottom level series in the columns

  • Everything in between ‘Bottom level’ and ‘Total’ denotes how we can construct the aggregations. For example, the ‘State-ACT’ aggregation is formed by summing the first four bottom-level series ACT-Canberra-Business, ACT-Canberra-Holiday, ACT-Canberra-Other and ACT-Canberra-Visiting.

Create a forecasting model for each time series

Now we can create a forecasting model for each time series in the aggregation matrix df_S

[11]:
# Set target, time_index and split of train and test.
target = 'Trips'
time_index = 'Quarter'
end_train = '2015Q4'
start_test = '2016Q1'
[12]:
# Add bottom_timeseries identifier and create actuals dataframe for all aggregations
df['bottom_timeseries'] = df[aggregation_cols].agg('-'.join, axis=1)
actuals_bottom_timeseries = df.set_index(['bottom_timeseries', time_index])[target]\
                              .unstack(1)\
                              .loc[df_S.columns]
actuals = df_S @ actuals_bottom_timeseries

Let’s look at the actuals:

[13]:
actuals
[13]:
Quarter 1998Q1 1998Q2 1998Q3 1998Q4 1999Q1 1999Q2 1999Q3 1999Q4 2000Q1 2000Q2 ... 2015Q3 2015Q4 2016Q1 2016Q2 2016Q3 2016Q4 2017Q1 2017Q2 2017Q3 2017Q4
Aggregation Value
Total Total 23182.197269 20323.380067 19826.640511 20830.129891 22087.353380 21458.373285 19914.192508 20027.925640 22339.294779 19941.063482 ... 23485.745655 25140.161222 26660.637689 24285.027757 24191.320131 26347.600973 27496.389021 26113.606708 26506.314708 27593.554214
State ACT 551.001921 416.025623 436.029011 449.798445 378.572817 558.178142 448.901196 594.825442 599.668540 557.135057 ... 688.203188 597.245569 625.141634 592.608499 572.437093 667.214141 634.368747 748.290431 631.759904 720.329370
NSW 8039.794795 7166.013805 6747.935790 7282.082371 7584.776839 7054.038705 6723.677146 7135.766945 7295.986639 6445.220168 ... 6961.250203 7760.019022 8287.872080 7660.579621 7326.304671 7995.502192 8320.703462 8285.026279 8298.257417 8542.490607
NT 181.448823 313.936152 528.436859 247.702817 184.889592 366.092786 501.431284 248.430300 206.053220 359.761925 ... 682.551824 423.202516 352.098496 469.136536 713.820750 340.695927 298.166026 621.441392 597.658397 346.061938
QLD 4041.370159 3967.904653 4593.893991 4202.829141 4332.490850 4824.480452 5018.034504 4349.835877 4413.226234 4344.159277 ... 5866.070451 5532.210644 5042.153750 5446.553651 5939.334645 6106.383853 5451.987128 5638.473535 6533.836932 5813.903667
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
bottom_timeseries WA-Australia's South West-Visiting 116.970746 106.044650 102.641667 155.877388 134.385712 161.531084 107.090992 118.579005 110.625423 120.253570 ... 119.421196 223.346491 267.934342 167.308435 150.425772 222.621753 254.044094 219.257038 188.483319 209.846009
WA-Experience Perth-Business 187.985188 118.678907 129.135987 188.048404 164.311015 122.140258 148.933004 129.196928 100.691002 200.474436 ... 217.173099 267.437376 220.232305 211.946738 143.052265 217.759483 154.757531 230.309344 282.529913 270.986641
WA-Experience Perth-Holiday 294.678011 277.422409 250.539302 291.984508 338.295122 292.867993 274.020511 207.565280 344.740832 324.050128 ... 283.733539 300.297583 285.605190 330.361251 223.309698 310.973405 365.897856 258.307021 225.382854 288.758614
WA-Experience Perth-Other 42.414022 35.587768 48.137064 61.052428 53.626864 71.848886 76.416960 53.235766 86.687717 39.319652 ... 75.186176 109.968477 72.411716 98.337989 103.825823 89.863976 79.051248 117.422706 124.913847 87.494916
WA-Experience Perth-Visiting 226.134737 237.021622 235.038378 291.621175 274.072739 267.297383 190.483508 217.585759 224.025084 254.090528 ... 439.624378 406.907362 538.265046 319.926218 365.445717 439.699451 356.867038 302.296119 373.442070 455.316702

425 rows × 80 columns

As can be seen, by taking the dot product of our summing matrix df_S with our bottom-level actuals, we have obtained a matrix where the aggregations are the rows and the timesteps the columns.

We can now proceed to create forecasts.

[14]:
# Create forecasts: a simple ETS model per timeseries
from sktime.forecasting.ets import AutoETS
forecasts = pd.DataFrame(index=actuals.index, columns = actuals.columns, dtype=np.float32)
for index, series in actuals.iterrows():
    # Fit model and predict (we need to clip because otherwise there's a convergence error)
    model = AutoETS(auto=True, n_jobs=1, random_state=0)
    model.fit(np.clip(series.loc[:end_train], 1e-3, 1e16))
    forecast = model.predict(series.index)
    # Store to forecasts/actuals array
    forecasts.loc[index] = forecast.values
c:\Users\ospra\miniconda3\envs\phd\lib\site-packages\statsmodels\base\model.py:606: ConvergenceWarning: Maximum Likelihood optimization failed to converge. Check mle_retvals
  ConvergenceWarning)

Let’s look at the forecasts:

[15]:
forecasts
[15]:
Quarter 1998Q1 1998Q2 1998Q3 1998Q4 1999Q1 1999Q2 1999Q3 1999Q4 2000Q1 2000Q2 ... 2015Q3 2015Q4 2016Q1 2016Q2 2016Q3 2016Q4 2017Q1 2017Q2 2017Q3 2017Q4
Aggregation Value
Total Total 21574.027344 22083.279297 21525.980469 20987.857422 20937.910156 21301.900391 21351.449219 20896.318359 20621.328125 21165.349609 ... 23663.076172 23606.921875 24092.445312 24092.445312 24092.445312 24092.445312 24092.445312 24092.445312 24092.445312 24092.445312
State ACT 494.419067 500.439117 501.933624 502.476624 502.227631 498.071075 498.810791 496.339600 498.825104 503.382477 ... 545.991577 560.809204 574.234619 586.928467 599.622253 612.316101 625.009949 637.703796 650.397583 663.091431
NSW 7604.467773 7721.035645 7572.416992 7351.645020 7333.018066 7400.431641 7307.677734 7151.299316 7147.140137 7186.997070 ... 7237.438477 7163.483398 7323.218262 7323.218262 7323.218262 7323.218262 7323.218262 7323.218262 7323.218262 7323.218262
NT 345.485168 345.468750 345.465607 345.483917 345.474121 345.458069 345.460144 345.475739 345.466034 345.452087 ... 345.443909 345.477600 345.485382 345.485382 345.485382 345.485382 345.485382 345.485382 345.485382 345.485382
QLD 4156.833496 4131.653809 4095.944336 4204.534668 4204.162598 4232.147949 4361.320312 4504.533203 4470.797363 4458.242676 ... 5214.271973 5356.412598 5394.749512 5394.749512 5394.749512 5394.749512 5394.749512 5394.749512 5394.749512 5394.749512
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
bottom_timeseries WA-Australia's South West-Visiting 117.684669 117.533379 115.098701 112.458824 121.660019 124.356834 132.234741 126.906319 125.141609 122.065361 ... 193.466293 177.774765 187.432251 187.432251 187.432251 187.432251 187.432251 187.432251 187.432251 187.432251
WA-Experience Perth-Business 154.717407 160.191299 153.360840 149.374878 155.738220 157.148788 151.388489 150.984451 147.399536 139.714096 ... 184.499893 189.875946 202.637924 202.637924 202.637924 202.637924 202.637924 202.637924 202.637924 202.637924
WA-Experience Perth-Holiday 283.353638 285.672760 283.983185 277.134186 280.175385 292.077759 292.239594 288.508484 271.932098 286.842621 ... 270.762085 273.418518 278.923096 278.923096 278.923096 278.923096 278.923096 278.923096 278.923096 278.923096
WA-Experience Perth-Other 46.703793 45.286228 42.081352 44.082474 49.690231 50.991100 57.883602 64.007988 60.448284 69.119164 ... 90.407684 85.377708 93.503777 93.503777 93.503777 93.503777 93.503777 93.503777 93.503777 93.503777
WA-Experience Perth-Visiting 231.525131 229.379745 232.421234 233.462860 256.609985 263.560211 265.047607 235.370941 228.292404 226.593994 ... 382.790314 405.410400 406.006195 406.006195 406.006195 406.006195 406.006195 406.006195 406.006195 406.006195

425 rows × 80 columns

Observations:

  • The AutoETS in this case isn’t very powerful: it has mostly created constant forecasts for all timesteps in our test set (see after 2015Q4).

  • We create forecasts for both the train and test set. This is because we need the residuals (forecast errors) on our training set.

Reconciliation

We can now reconcile the forecasts.

[16]:
# Create forecast test set and apply a set of reconciliation methods.
forecasts_test = forecasts.loc[:, start_test:]
forecasts_methods = apply_reconciliation_methods(forecasts_test, df_S,
                                                actuals.loc[:, :end_train], forecasts.loc[:, :end_train],
                                                methods=['ols', 'wls_struct', 'wls_var', 'mint_shrink'])
Method ols, reconciliation time: 0.0021s
Method wls_struct, reconciliation time: 0.0013s
Method wls_var, reconciliation time: 0.0012s
Method mint_shrink, reconciliation time: 4.1298s

We will compare our reconciled forecasts to two cases: (i) the unreconciled forecasts (‘base’), and (ii) the bottom-up forecasts (‘bottom-up’). The latter is obtained by merely summing our bottom-level forecasts for all the aggregations (i.e. we effectively dispense with the separate forecasts for all the higher-level aggregations).

[19]:
# Create the bottom-up forecasts and add them to the forecasts_methods dataframe
forecasts_bu_bottom_level = forecasts.loc['bottom_timeseries']
forecasts_bu = aggregate_bottom_up_forecasts(forecasts_bu_bottom_level, df_S)
forecasts_bu_test = forecasts_bu.loc[:, start_test:]
forecasts_method = pd.concat({'bottom-up': forecasts_bu_test}, names=['Method'])
forecasts_methods = pd.concat((forecasts_method, forecasts_methods), axis=0)

Finally, we can compute the root mean-squared error on all of our methods.

[20]:
# Calculate error for all levels and methods. We set bottom-up as the base method to compare against
# in the relative rmse
rmse, rel_rmse = calc_level_method_rmse(forecasts_methods, actuals, base='base')

Let’s look at the errors:

[21]:
rmse
[21]:
Method bottom-up base ols wls_struct wls_var mint_shrink
Aggregation
Total 3306.986700 2384.445914 2404.999427 2738.143230 2927.180695 1755.004329
Purpose 1087.956677 1070.998394 1060.990947 1029.587893 1025.144238 791.360905
State 654.361105 567.977400 558.927221 591.119925 614.470862 464.545683
State-Purpose 221.070602 219.723817 218.963422 215.304224 215.310404 183.111974
State-Region 93.707037 94.867123 87.337924 88.549785 87.412677 77.612773
bottom_timeseries 36.514784 36.514784 36.118552 35.815110 35.462472 33.346821
All series 226.133054 190.903411 189.715393 200.013370 207.160978 146.971670

We can see the following:

  • The base forecast, where we create a separate forecast for each time series in each aggregation, has an overall RMSE of 191.

  • The base forecast has been reconciled using ‘ols’, ‘wls_struct’, ‘wls_var’ and ‘mint_shrink’. Of these 4 methods, ‘mint_shrink’ achieves superior performance; across all levels the forecast is better than the base forecast.

  • The bottom-up forecast performs rather poorly. Note that (by definition) base == bottom-up for the Bottom level aggregation.

Finally, let’s look at the relative rmse. This provides an easier insight into which method performs best. We compare against the base forecast.

[22]:
rel_rmse
[22]:
Method bottom-up base ols wls_struct wls_var mint_shrink
Aggregation
Total 1.386899 1.0 1.008620 1.148335 1.227615 0.736022
Purpose 1.015834 1.0 0.990656 0.961335 0.957186 0.738900
State 1.152090 1.0 0.984066 1.040746 1.081858 0.817895
State-Purpose 1.006129 1.0 0.996539 0.979886 0.979914 0.833373
State-Region 0.987771 1.0 0.920634 0.933409 0.921422 0.818121
bottom_timeseries 1.000000 1.0 0.989149 0.980839 0.971181 0.913242
All series 1.184542 1.0 0.993777 1.047720 1.085161 0.769875

Clearly, ‘mint_shrink’ provides the best performance, across all series it performs about 25% better than the base forecast.