In this follow-up article, I continue my quest to build Frankenstein’s time series monster by combining ideas from the popular Prophet package¹ and the talk “Winning with Simple, Even Linear Models”².
After remembering what we are doing, we will address the regression model: what it is and why it is special.
We will then move on to hyperparameter tuning using time series cross-validation to obtain an “optimal” parameterization of the model.
Finally, we will validate the model using SHAP before leveraging the model form to allow for custom investigations and manual adjustments.
There is a lot of ground to cover; let’s get to work.
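Before diving in, it helps to picture what time series cross-validation looks like, since it underpins the hyperparameter tuning. A minimal sketch using scikit-learn’s `TimeSeriesSplit` (one possible implementation; the exact setup used for tuning may differ):

```python
import numpy as np
from sklearn.model_selection import TimeSeriesSplit

# Toy series: 12 consecutive observations.
y = np.arange(12)

# Expanding-window splits: each fold trains on the past and
# validates on the block that immediately follows it.
tscv = TimeSeriesSplit(n_splits=3, test_size=3)
for train_idx, test_idx in tscv.split(y):
    # Every training index precedes every validation index.
    assert train_idx.max() < test_idx.min()
    print(f"train: {train_idx.tolist()} -> test: {test_idx.tolist()}")
```

Unlike shuffled k-fold, each fold only ever validates on observations that come after its training window, which is what keeps the evaluation honest for forecasting.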
Aside: we covered the underlying data preparation and feature engineering in a previous article, so let’s jump right into the modeling. Catch up on what happened there:
Let’s remember what we are doing.
The ultimate goal is simple: generate the most accurate forecast of future events over a specific time horizon.
We start from scratch with a time series containing only a date variable and the quantity of interest. From this, we derive additional features that help us model future outcomes accurately; these were strongly “inspired” by Prophet’s approach.
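As a one-glance reminder of what those Prophet-inspired features look like, here is a minimal sketch of Fourier seasonality terms (illustrative only; the `fourier_terms` helper is hypothetical, not the previous article’s code):

```python
import numpy as np
import pandas as pd

def fourier_terms(dates: pd.Series, period: float, order: int) -> pd.DataFrame:
    """Prophet-style seasonal features: sin/cos pairs of increasing frequency."""
    # Days since an arbitrary epoch, similar to what Prophet does internally.
    t = (dates - pd.Timestamp("1970-01-01")).dt.days.to_numpy()
    cols = {}
    for k in range(1, order + 1):
        cols[f"sin_{period:g}_{k}"] = np.sin(2 * np.pi * k * t / period)
        cols[f"cos_{period:g}_{k}"] = np.cos(2 * np.pi * k * t / period)
    return pd.DataFrame(cols, index=dates.index)

dates = pd.Series(pd.date_range("2024-01-01", periods=5, freq="D"))
weekly = fourier_terms(dates, period=7.0, order=3)  # 3 sin/cos pairs -> 6 columns
print(weekly.shape)  # (5, 6)
```

Higher `order` captures sharper seasonal shapes at the cost of more columns for the model to fit.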
That brings us to where we are now: ready to feed our engineered data into a lightweight model, training it to forecast the future. Later, we will delve into the model’s inner workings.
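The “lightweight model” amounts to a regularized linear regression over the engineered features. A minimal sketch with scikit-learn’s `Ridge` on stand-in data (the feature matrix below is random noise, purely illustrative of the shape of the problem):

```python
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)

# Stand-in for the engineered feature matrix (trend + seasonal columns)
# and the target series; the numbers here are synthetic, not the STATS19 data.
X = rng.normal(size=(200, 8))
y = X @ rng.normal(size=8) + rng.normal(scale=0.1, size=200)

# A lightweight, fully inspectable model: one coefficient per feature.
model = Ridge(alpha=1.0).fit(X, y)
print(model.coef_.shape)  # (8,)
```

One coefficient per feature is precisely what later makes custom investigations and manual adjustments tractable.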
Let’s remember what the data looks like before continuing.
We are using real-world data from the UK; in this case, the STATS19 traffic accident dataset that…