ARIMA (AutoRegressive Integrated Moving Average) is a class of statistical models for analyzing and forecasting time series data.
Let’s break it down:
- AR: Autoregression. A model that uses the dependent relationship between an observation and some number of lagged observations.
- I: Integrated. The use of differencing of raw observations in order to make the time series stationary.
- MA: Moving Average. A model that uses the dependency between an observation and a residual error from a moving average model applied to lagged observations.
The parameters of the ARIMA model are defined as follows:
- p: The number of lag observations included in the model, also called the lag order.
- d: The number of times that the raw observations are differenced, also called the degree of differencing.
- q: The size of the moving average window, also called the order of moving average.
To choose these values, the first thing you need to understand is stationarity:
Stationarity means that the statistical properties of a process generating a time series do not change over time.
So, if you plot the series first and it already looks stationary, there is no need for differencing. In that case d=0.
But if the plot is not stationary, you need to difference the series until it becomes stationary. The number of times you difference it is the d value.
So, how do you do this?
1) First, when you get the data, run the Dickey-Fuller test on it.
Here we check the p-value returned by the test: if it is greater than 0.05, we treat the data as non-stationary, and if it is smaller than or equal to 0.05, we treat it as stationary.
2) Next, how to do the differencing:
Differencing subtracts an earlier observation from each observation. Here the value 30 is the number of index values per period of time you are calculating over, that is, how many rows back the subtracted observation sits.
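The original snippet for this step is not shown, so the following is a guess at what it might have looked like, assuming a pandas Series and a lag of 30. The series itself is made up for the example.

```python
import numpy as np
import pandas as pd

# Hypothetical daily series: a linear trend plus a pattern repeating every 30 steps.
index = pd.date_range("2020-01-01", periods=120, freq="D")
steps = np.arange(120)
values = 0.5 * steps + 10 * np.sin(2 * np.pi * steps / 30)
series = pd.Series(values, index=index)

# Lag-30 differencing: each value minus the value 30 positions earlier.
diff_30 = series.diff(periods=30).dropna()

# The same thing written by hand with shift().
diff_30_manual = (series - series.shift(30)).dropna()

# Ordinary first differencing (lag 1), which is what the d parameter counts.
diff_1 = series.diff().dropna()

print(diff_30.head())
print(diff_1.head())
```

Note that the first 30 rows are lost after a lag-30 difference, because there is nothing 30 positions earlier to subtract from them.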
3) These two graphs will help you find the p and q values:
- The Partial AutoCorrelation (PACF) graph is for the p value (the lag order, not the p-value of the test above).
- The AutoCorrelation (ACF) graph is for the q value.
After that, import statsmodels.api and pass the model your data along with the p, d and q values, in that order.
Results = model.fit()
Boom! Your model is trained. Now you are good to go: predict and plot your predictions.
The difference between ARIMA and SARIMA (SARIMAX) is about the seasonality of the dataset. If your data is seasonal, meaning a pattern repeats after a fixed period of time, then use SARIMA.
Here we have to add one more term: seasonal_order=(P, D, Q, period).
The non-seasonal p, d and q values remain the same. The first three entries of seasonal_order play the same roles, but they act on observations one seasonal period apart.
The period value is the number of time steps after which the seasonal pattern repeats.