OPEN-SOURCE SCRIPT

Hybrid Triple Exponential Smoothing

🙏🏻 TV, I present you HTES aka Hybrid Triple Exponential Smoothing, designed by Holt & Winters in the US, assembled by me in Saint P. I apply exponential smoothing individually to the data itself, then to residuals from the fitted values, and lastly to one-point forecast (OPF) errors, hence 'hybrid'. At the same time, the method is a closed-form solution and purely online: no need to recalculate or optimize anything, so each update is O(1).

Snapshot
^^ historical OPFs and one-point forecasting interval plotted instead of fitted values and prediction interval
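
A minimal sketch of what one online step could look like, in Python rather than Pine Script. It assumes the additive Holt-Winters recurrences, a level initialized from the first datapoint, zero-initialized trend and seasonals, and smoothed absolute values as the 'typical' residual and error. The class name `HTES` and the parameters `resid_alpha` and `err_alpha` are made up for illustration and are not taken from the script itself.

```python
class HTES:
    """Additive Holt-Winters plus smoothed residuals and OPF errors (sketch)."""

    def __init__(self, period, alpha, beta, gamma, resid_alpha, err_alpha):
        self.period = period
        self.alpha, self.beta, self.gamma = alpha, beta, gamma
        self.resid_alpha, self.err_alpha = resid_alpha, err_alpha
        self.level = None               # taken from the first datapoint (assumed init)
        self.trend = 0.0                # assumed zero init
        self.seasonal = [0.0] * period  # assumed zero init
        self.typ_resid = 0.0            # smoothed |residual|
        self.typ_err = 0.0              # smoothed |OPF error|
        self.opf = None                 # forecast issued one step ago
        self.t = 0

    def update(self, x):
        """Consume one datapoint: fixed amount of work, no history, no refitting."""
        i = self.t % self.period
        if self.level is None:
            self.level = x
        # 1) error of the one-point forecast issued on the previous step
        err = 0.0 if self.opf is None else x - self.opf
        self.typ_err += self.err_alpha * (abs(err) - self.typ_err)
        # 2) data smoothing: additive Holt-Winters recurrences
        prev_level, s_old = self.level, self.seasonal[i]
        self.level = self.alpha * (x - s_old) + (1 - self.alpha) * (prev_level + self.trend)
        self.trend = self.beta * (self.level - prev_level) + (1 - self.beta) * self.trend
        self.seasonal[i] = self.gamma * (x - self.level) + (1 - self.gamma) * s_old
        # 3) residual from the fit, where fit = level + trend + seasonal
        fit = self.level + self.trend + self.seasonal[i]
        resid = x - fit
        self.typ_resid += self.resid_alpha * (abs(resid) - self.typ_resid)
        # 4) one-point forecast for the next datapoint
        self.opf = self.level + self.trend + self.seasonal[(i + 1) % self.period]
        self.t += 1
        return fit, resid, err


model = HTES(period=4, alpha=0.3, beta=0.1, gamma=0.2, resid_alpha=0.1, err_alpha=0.1)
for x in [10.0, 12.0, 11.0, 9.0, 11.0, 13.0, 12.0, 10.0]:
    fit, resid, err = model.update(x)
```

Each call touches the same handful of variables no matter how much history has passed, which is where the O(1) comes from.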


Before the How-to, first let me tell you some non-obvious things about Triple Exponential smoothing (and about Exponential Smoothing in general) that not many catch. Expo smoothing seems very straightforward and obvious, but if you look deeper...

1) The whole point of exponential smoothing is its incremental/online nature, and its O(1) algorithm complexity, making it dope for high-frequency streaming data that is also univariate and has no weights. Consequently:

- Any hybrid models that involve expo smoothing and any type of ML models like gradient boosting applied to residuals rarely make much sense business-wise: if you have resources to boost the residuals, you prolly have resources to use something instead of expo smoothing;
- The same goes for the fashion of using optimizers to pick smoothing parameters; honestly, if you use this approach, you have to retrain on each datapoint, which is crazy in a streaming context. If you're not in a streaming context, why expo smoothing? What makes more sense is either picking smoothing parameters once, guided by exogenous info, or using dynamic ones calculated in a minimalistic and elegant way (more on that in further drops).

2) No matter how 'right' you choose the smoothing parameters, all the resulting components (level, trend, seasonal) are not pure; each of them contains a bit of info from the other components, this is just how non-sequential expo smoothing works. You gotta know this if you wanna use expo smoothing to decompose your time series into separate components. The only pure component there, lol, is the residuals;

3) Given what I've just said, treating the level (that does contain trend and seasonal components partially) as the resulting fit is a mistake. The resulting fit is level (l) + trend (b) + seasonal (s). And from this fit, you calculate residuals;

4) The residuals component is not some kind of bad thing; it is simply the component that contains info you consciously decide not to include in your model for whatever reason;

5) Forecasting Errors and Residuals from fitted values are 2 different things. The former are deltas between the forecasts you've made and actual values you've observed, the latter are simply differences between actual datapoints and in-sample fitted values;

6) Residuals are used for in-sample prediction intervals, errors for out-of-sample forecasting intervals;

7) Choosing between single, double, or triple expo smoothing should not be based exclusively on the nature of your data, but on what you need to do as well. For example:

- If you have trending seasonal data and you wanna do forecasting exclusively within the expo smoothing framework, then yes, you need Triple Exponential Smoothing;
- If you wanna use prediction intervals for generating trend-trading signals and you disregard seasonality, then you need single (simple) expo smoothing, even on trending data. Otherwise, the trend component will be included in your model's fitted values, and therefore in the prediction intervals built around them.

8) Not exactly non-obvious, but when you set a smoothing parameter to zero, you basically disregard that component. E.g., in triple expo smoothing, when you set gamma and beta to zero, you basically end up with single exponential smoothing.
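
To make the residuals vs errors difference from points 5) and 6) concrete, here is a small Python sketch on single expo smoothing (the simplest case, picked just to keep it short). The function name `residuals_and_errors` is hypothetical. The error compares a datapoint with the forecast made before seeing it; the residual compares it with the fit that already includes it.

```python
def residuals_and_errors(xs, alpha):
    """Return in-sample residuals and one-point forecast errors for SES."""
    level = xs[0]
    residuals, errors = [], []
    for x in xs[1:]:
        forecast = level                         # issued before x is observed
        level = alpha * x + (1 - alpha) * level  # fit after x is observed
        errors.append(x - forecast)              # out-of-sample
        residuals.append(x - level)              # in-sample
    return residuals, errors


residuals, errors = residuals_and_errors([1.0, 2.0, 4.0, 3.0], alpha=0.5)
```

For single expo smoothing the two are tied by residual = (1 - alpha) * error, so a band built on residuals is always tighter than one built on errors. That is one way to see why mixing them up gives intervals of the wrong width.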

Snapshot
^^ data smoothing, beta and gamma zeroed out, forecasting steps = 0
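
A quick Python check of point 8), as a sketch: with beta = gamma = 0 the additive Holt-Winters fit collapses to the single expo smoothing fit. This holds under the assumption that trend and seasonals start at zero (with non-zero starting values they would just stay frozen there). Function names `holt_winters_fit` and `simple_fit` are made up for the example.

```python
def holt_winters_fit(xs, period, alpha, beta, gamma):
    """Additive Holt-Winters fit: level + trend + seasonal at each step."""
    level, trend, seasonal = xs[0], 0.0, [0.0] * period
    fits = []
    for t, x in enumerate(xs):
        i = t % period
        prev_level, s_old = level, seasonal[i]
        level = alpha * (x - s_old) + (1 - alpha) * (prev_level + trend)
        trend = beta * (level - prev_level) + (1 - beta) * trend
        seasonal[i] = gamma * (x - level) + (1 - gamma) * s_old
        fits.append(level + trend + seasonal[i])
    return fits


def simple_fit(xs, alpha):
    """Single (simple) exponential smoothing fit."""
    level = xs[0]
    fits = []
    for x in xs:
        level = alpha * x + (1 - alpha) * level
        fits.append(level)
    return fits


data = [3.0, 5.0, 4.0, 6.0, 8.0, 7.0]
triple_zeroed = holt_winters_fit(data, period=3, alpha=0.4, beta=0.0, gamma=0.0)
single = simple_fit(data, alpha=0.4)
same = all(abs(a - b) < 1e-12 for a, b in zip(triple_zeroed, single))
```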


About the implementation

* I use a simple power transform that becomes a log transform at lambda = 0, instead of the mainstream transformers (if you set lambda to 2 in Box-Cox, you get (x^2 - 1) / 2, not a plain power-of-2 transform)
* Separate set of smoothing parameters for data, residuals, and errors smoothing
* Separate band multipliers for residuals and errors
* Both typical error and typical residuals get multiplied by math.sqrt(math.pi / 2) in order to approach standard deviation so you can ~use Z values and get more or less corresponding probabilities
* In script settings → style, you can switch on/off plotting of many things that get calculated internally:
- You can visualize separate components (just remember they are not pure);
- You can switch off fit and switch on OPF plotting;
- You can plot residuals and their exponentially smoothed typical value to pick the smoothing parameters for both data and residuals;
- Or you might plot errors and play with data smoothing parameters to minimize them (consult SAE aka Sum of Absolute Errors plot);
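
On the transform from the first bullet: the description doesn't spell out the formula, so the Python sketch below assumes the plainest reading, x ** lambda with a log fallback at lambda = 0, next to the textbook Box-Cox for comparison. The function names are hypothetical.

```python
import math


def power_transform(x, lam):
    """Plain power transform; log at lam == 0 (assumed form)."""
    return math.log(x) if lam == 0 else x ** lam


def inverse_power_transform(y, lam):
    """Undo power_transform."""
    return math.exp(y) if lam == 0 else y ** (1 / lam)


def box_cox(x, lam):
    """Textbook Box-Cox: (x**lam - 1) / lam, log at lam == 0."""
    return math.log(x) if lam == 0 else (x ** lam - 1) / lam


plain = power_transform(3.0, 2)   # 3 squared
shifted = box_cox(3.0, 2)         # (3 squared - 1) / 2
```

Both agree at lambda = 0, but for any other lambda Box-Cox shifts and rescales the power, which is the difference the bullet points at.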

Snapshot

^^ nuff said
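
About the math.sqrt(math.pi / 2) multiplier from the implementation notes: for normally distributed values, the mean absolute deviation equals the standard deviation times sqrt(2 / pi), so multiplying a typical absolute value by sqrt(pi / 2) gets you back to roughly one standard deviation. The Python sketch below checks this on simulated Gaussian noise. It uses a plain average of absolute values as the 'typical' value (an assumption; the script uses an exponentially smoothed one), and `sigma_from_typical_abs` is a made-up name.

```python
import math
import random


def sigma_from_typical_abs(typical_abs):
    """Scale a typical absolute deviation up to an approximate std dev."""
    return typical_abs * math.sqrt(math.pi / 2)


random.seed(0)
true_sigma = 2.0
sample = [random.gauss(0.0, true_sigma) for _ in range(100_000)]
mean_abs = sum(abs(v) for v in sample) / len(sample)
estimate = sigma_from_typical_abs(mean_abs)  # should land close to true_sigma
```

With real residuals or errors the normality assumption is only approximate, hence the '~' in front of using Z values.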


More ideas on how to use the thing

1) Use Double Exponential Smoothing (data gamma = 0) to detrend your time series for further processing (Fourier likes at least weakly stationary data);
2) Put single expo smoothing on your strategy/subaccount equity chart (data beta = data gamma = 0), set prediction interval deviation multiplier to 1, run your strat live on simulator, start executing on real market when equity on simulator hits upper deviation (prediction interval), stop trading if equity hits lower deviation on simulator. Basically, let the strat always run on simulator, but send real orders to a real market when the strat is successful on your simulator;
3) Set up the model to minimize one-point forecasting errors, put error forecasting steps to 1, now you're doing nowcasting;
4) Forecast noisy trending sine waves for fun.
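
Idea 2) can be sketched in Python as a tiny state machine. Assumptions, all mine: the band is centered on the single expo smoothing fit known before the current bar, its half-width is multiplier * sqrt(pi / 2) * smoothed |residual|, and 'hits' means touching or crossing. The name `gate_real_trading` and the parameter `resid_alpha` are hypothetical.

```python
import math


def gate_real_trading(sim_equity, alpha, resid_alpha, mult=1.0):
    """Per-bar flags: True while real orders should be sent."""
    level = sim_equity[0]      # single expo smoothing: level only
    typ_resid = 0.0            # smoothed |residual|
    live = False
    flags = []
    for x in sim_equity:
        # Band around the fit known before this bar (assumed convention).
        half_width = mult * math.sqrt(math.pi / 2) * typ_resid
        if typ_resid > 0:
            if x >= level + half_width:
                live = True    # simulator equity hit the upper deviation
            elif x <= level - half_width:
                live = False   # simulator equity hit the lower deviation
        flags.append(live)
        # O(1) update of the fit and of the typical residual.
        level = alpha * x + (1 - alpha) * level
        typ_resid += resid_alpha * (abs(x - level) - typ_resid)
    return flags


equity = [100.0, 100.0, 101.0, 103.0, 106.0, 103.0, 100.0, 95.0]
flags = gate_real_trading(equity, alpha=0.3, resid_alpha=0.3)
```

The simulator keeps running the whole time; the flags only decide whether orders also go to the real market.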

Snapshot

^^ nuff said 2

All Good TV ∞
