In the previous article, we tried to model the gold price per gram in Turkey. We will continue that work to find the best fit for our data. When we compared the KNN and ARIMA models, we saw that the traditional ARIMA model performed much better than KNN, which is a machine learning algorithm. This time we will try a regression model as the machine learning approach, and also try to improve our ARIMA model with some mathematical operations.
A regression that includes Fourier terms is called dynamic harmonic regression. The harmonic structure is built from successive Fourier terms, pairs of sine and cosine functions that together form a periodic function. These terms can capture seasonal patterns in fine detail.
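$$x_{1,t} = \sin\left(\frac{2\pi t}{m}\right),\quad x_{2,t} = \cos\left(\frac{2\pi t}{m}\right),\quad x_{3,t} = \sin\left(\frac{4\pi t}{m}\right),\quad x_{4,t} = \cos\left(\frac{4\pi t}{m}\right),$$
$$x_{5,t} = \sin\left(\frac{6\pi t}{m}\right),\quad x_{6,t} = \cos\left(\frac{6\pi t}{m}\right),\quad \dots$$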
Here m is the seasonal period. As the number of terms increases, the fitted periodic pattern can take sharper shapes, in the limit approximating a square wave. While the Fourier terms capture the seasonal pattern, the ARIMA model processes the error term to capture the remaining dynamics, which also determine the prediction intervals.
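As a minimal sketch of what the sine and cosine terms described above look like as regressors, the following base R snippet builds them by hand. The seasonal period m = 12 and K = 2 are assumptions chosen for illustration, and the column names are hypothetical; in the article itself the terms come from the fourier function.

```r
# Assumed monthly setting for illustration: seasonal period m = 12, K = 2 pairs.
m <- 12
K <- 2
t <- 1:24  # two full seasonal cycles

# For each k = 1..K, add one sine and one cosine column with frequency k/m.
X <- do.call(cbind, lapply(1:K, function(k) {
  cbind(sin(2 * pi * k * t / m), cos(2 * pi * k * t / m))
}))
colnames(X) <- c("sin1", "cos1", "sin2", "cos2")  # hypothetical names

# Every column repeats after m observations, which is what lets these
# terms act as seasonal regressors.
round(head(X, 3), 3)
```

Each extra pair of columns adds a faster oscillation, so a larger K lets the regression follow a more detailed seasonal shape at the cost of more parameters.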
We will fit regression models with K values from 1 to 6 and plot their forecasts to compare the corrected Akaike information criterion (AICc), which should be as small as possible. We set the seasonal parameter to FALSE: because the Fourier terms already capture the seasonality, we don't want the auto.arima function to waste time searching for seasonal patterns. Before that, we should discuss the concept of transformation to understand the lambda parameter used in the models.
Transformation, like differencing, is a mathematical operation that simplifies the model and thus increases prediction accuracy. It does this by stabilizing the variance, which makes the pattern more consistent. When the lambda parameter is set to "auto", the auto.arima function applies the transformation automatically, using the optimal value of lambda for the Box-Cox family of transformations shown below.
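$$w_t = \begin{cases} \log(y_t) & \text{if } \lambda = 0 \\ \dfrac{y_t^{\lambda} - 1}{\lambda} & \text{otherwise} \end{cases}$$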
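To make the transformation concrete, here is a small base R sketch of the Box-Cox transform and its inverse for strictly positive data. The function names box_cox and inv_box_cox are hypothetical helpers written for this illustration; the models in the article rely on the lambda argument of auto.arima instead.

```r
# Box-Cox transform for strictly positive values (illustrative helper).
box_cox <- function(y, lambda) {
  stopifnot(all(y > 0))
  if (lambda == 0) log(y) else (y^lambda - 1) / lambda
}

# Inverse transform, used to bring forecasts back to the original scale.
inv_box_cox <- function(w, lambda) {
  if (lambda == 0) exp(w) else (lambda * w + 1)^(1 / lambda)
}

y <- c(1, 10, 100, 1000)      # the spread grows with the level
box_cox(y, lambda = 0)        # log scale: equally spaced values
box_cox(y, lambda = 0.5)      # square-root-like compression
inv_box_cox(box_cox(y, 0.5), 0.5)  # recovers y
```

Smaller values of lambda compress large observations more strongly, which is why the transform helps when the variance grows with the level of the series.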
plots <- list()
for (i in seq(6)) {
  fit <- train %>%
    auto.arima(xreg = fourier(train, K = i), seasonal = FALSE, lambda = "auto")
  plots[[i]] <- autoplot(forecast(fit, xreg = fourier(train, K = i, h = 18))) +
    xlab(paste("K=", i, " AICC=", round(fit[["aicc"]], 2))) +
    ylab("") +
    theme_light()
}
gridExtra::grid.arrange(
  plots[[1]], plots[[2]], plots[[3]],
  plots[[4]], plots[[5]], plots[[6]], nrow = 3)
The plots above show that the larger the K value, the more jagged the point forecast line and the prediction intervals become. After K=3, the AICc values increase significantly. The minimum AICc is at K=2, so we set K to 2.
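The comparison can also be done numerically rather than by reading plot labels. The sketch below is an illustration under several assumptions, not the article's procedure: it uses the built-in AirPassengers series instead of the gold price data, a fixed ARIMA(0,1,1) error model fitted with base R's arima instead of auto.arima, and hypothetical helper names (fourier_terms, aicc). K is limited to 1 to 3 to keep the example short.

```r
# Hand-built Fourier regressors: K sine/cosine pairs for seasonal period m.
fourier_terms <- function(n, m, K) {
  t <- seq_len(n)
  X <- do.call(cbind, lapply(seq_len(K), function(k) {
    cbind(sin(2 * pi * k * t / m), cos(2 * pi * k * t / m))
  }))
  colnames(X) <- paste0(rep(c("S", "C"), K), rep(seq_len(K), each = 2))
  X
}

# AICc = AIC + 2k(k + 1) / (n - k - 1), with k estimated parameters
# (coefficients plus the innovation variance) and n used observations.
aicc <- function(fit) {
  k <- length(fit$coef) + 1
  n <- fit$nobs
  fit$aic + 2 * k * (k + 1) / (n - k - 1)
}

y <- log(AirPassengers)  # illustrative data; log is Box-Cox with lambda = 0

# Fit one regression with ARIMA(0,1,1) errors per K and collect the AICc.
aicc_by_K <- sapply(1:3, function(K) {
  fit <- arima(y, order = c(0, 1, 1),
               xreg = fourier_terms(length(y), 12, K), method = "ML")
  aicc(fit)
})
names(aicc_by_K) <- paste0("K=", 1:3)

aicc_by_K
best_K <- which.min(aicc_by_K)  # position of the smallest AICc
```

Because the error model is held fixed here, the AICc values are comparable across K in the same way as the values printed under each plot.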
fit_fourier <- train %>%
  auto.arima(xreg = fourier(train, K = 2), seasonal = FALSE, lambda = "auto")

f_fourier <- fit_fourier %>%
  forecast(xreg = fourier(train, K = 2, h = 18)) %>%
  accuracy(test)
f_fourier[, c("RMSE", "MAPE")]

fit_fourier %>%
  forecast(xreg = fourier(train, K = 2, h = 18)) %>%
  autoplot() +
  autolayer(test) +
  theme_light() +
  ylab("")
Because we are taking the seasonal pattern into account even though it is weak, we should also examine the seasonal ARIMA process. This model is built by adding seasonal terms to the non-seasonal ARIMA model we mentioned before.
$\text{ARIMA}(p,d,q)(P,D,Q)_m$
$(P,D,Q)_m$: seasonal part.
$m$: the number of observations before the next year starts; the seasonal period.
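As a rough illustration of how the seasonal part enters a model specification, the sketch below fits a seasonal ARIMA with base R's arima function. Everything in it is an assumption for demonstration: the built-in AirPassengers data stands in for the gold price series, the orders (0,1,1)(0,1,1) with period 12 are a common textbook choice rather than the model selected in this article, and the name fit_airline is hypothetical.

```r
# Built-in monthly series used only for illustration.
y <- log(AirPassengers)  # log is the Box-Cox transform with lambda = 0

# Non-seasonal part: order = c(p, d, q).
# Seasonal part: seasonal$order = c(P, D, Q), seasonal$period = m.
fit_airline <- arima(y,
                     order = c(0, 1, 1),
                     seasonal = list(order = c(0, 1, 1), period = 12))

coef(fit_airline)  # one non-seasonal MA and one seasonal MA coefficient

# Forecast 18 steps ahead and back-transform to the original scale.
pred <- predict(fit_airline, n.ahead = 18)
exp(pred$pred)
```

The seasonal orders work on observations that are m steps apart, so the seasonal MA term here relates each month to the same month one year earlier.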

f_arima <- fit_arima %>%
  forecast(h = 18) %>%
  accuracy(test)
f_arima[, c("RMSE", "MAPE")]
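For readers unfamiliar with the two measures selected above, this is a minimal sketch of how RMSE and MAPE are defined, written in base R with hypothetical helper names and made-up numbers rather than the article's data.

```r
# Root mean squared error: penalizes large errors, in the units of the data.
rmse <- function(actual, predicted) sqrt(mean((actual - predicted)^2))

# Mean absolute percentage error: scale-free, expressed in percent.
mape <- function(actual, predicted) {
  mean(abs((actual - predicted) / actual)) * 100
}

actual    <- c(100, 200)  # made-up observations
predicted <- c(110, 190)  # made-up forecasts

rmse(actual, predicted)  # 10
mape(actual, predicted)  # 7.5
```

Both errors are 10 units in size, but MAPE weighs the miss on the smaller observation more heavily because it is relative to the actual value.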
Conclusion
Time series data with weak seasonality, like ours, was modeled with dynamic harmonic regression, but the accuracy results were worse than those of the ARIMA models without seasonality.
In addition, the ARIMA model fit the transformed data more accurately than the untransformed data, because the variance of our data changes with the level of the series. Another important point is that, looking at the plots of both the ARIMA model and the Fourier regression, we can clearly see that the prediction error grows as the forecast horizon increases.
Credit: Data Science Central By: Selcuk Disci