5.9 ARIMA and SARIMA Models

5.9.1 ARIMA Models

ARIMA models are the same as ARMA models, but the 'I' stands for 'integrated', which refers to differencing. If you were differencing to remove the trend and seasonality, you could incorporate that differencing into the model fitting procedure (as we did with the parametric modeling and xreg=).

Below, we fit the model to the ORIGINAL data, without removing the trend or seasonality beforehand. Within the model, we take a first difference (lag 1) to remove the trend and a first-order difference at lag 12 to remove the seasonality.

The little \(d\) indicates the order of differences at lag 1, the big \(D\) indicates the order of seasonal differences, and \(S\) gives the lag for the seasonal differences. Once we know the differencing that we need to do, we incorporate the differencing into the model fitting, which will give us more accurate predictions for the future.
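As a worked illustration, write \(B\) for the backshift operator, so that \(B^kY_t = Y_{t-k}\). Differencing with orders \(d\) and \(D\) at seasonal lag \(S\) replaces the series \(Y_t\) with

\[(1-B)^d(1-B^S)^D\,Y_t\]

For the choice \(d = 1\), \(D = 1\), \(S = 12\), this expands to

\[(1-B)(1-B^{12})Y_t = Y_t - Y_{t-1} - Y_{t-12} + Y_{t-13}\]

so each differenced value compares this month's change with the change over the same month one year earlier.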

# This model includes lag 1 differencing (d = 1) and lag 12 differencing (D = 1, S = 12)
mod.fit4 <- sarima(birth, p = 0, d = 1, q = 1, D = 1, S = 12)
mod.fit4$fit

Notice, this model with differencing isn’t any better than the ARMA model with the spline trend and indicator variables for the month.
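The following is a minimal sketch of what it means to put the differencing inside the fit. It makes two assumptions that are not part of the text above: it uses base R's arima() rather than sarima(), and it uses the built-in AirPassengers series as a stand-in for the birth data. The object names are hypothetical.

```r
# Stand-in monthly series from base R (assumption: used in place of the birth data)
y <- log(AirPassengers)

# Differencing by hand: lag 12 for the seasonality, then lag 1 for the trend
y_dd <- diff(diff(y, lag = 12), lag = 1)

# Differencing inside the fit: MA(1) with d = 1 and D = 1 at period 12
fit <- arima(y, order = c(0, 1, 1),
             seasonal = list(order = c(0, 1, 0), period = 12))

# Forecasts are returned on the scale of y, not on the differenced scale
fc <- predict(fit, n.ahead = 12)
fc$pred
```

Differencing by hand loses 13 observations and leaves a series whose forecasts would have to be un-differenced manually. When the differencing is part of the model, that step is handled for us.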

5.9.2 Seasonal ARIMA Models

The 's' in sarima() stands for seasonal. A Seasonal ARIMA model allows us to add a seasonal lag (e.g., 12) into an ARMA model. The model is written as

\[\Phi(B^S)\phi(B)Y_t = \Theta(B^S)\theta(B)W_t\]

where the non-seasonal components are:

\[\phi(B) = 1 - \phi_1B - \phi_2B^2 - \cdots - \phi_pB^p\]

and

\[\theta(B) = 1 + \theta_1B + \theta_2B^2 + \cdots + \theta_qB^q\]

and the seasonal components are:

\[\Phi(B^S) = 1 - \Phi_1B^S - \Phi_2B^{2S} - \cdots - \Phi_PB^{PS}\]

and

\[\Theta(B^S) = 1 + \Theta_1B^S + \Theta_2B^{2S} + \cdots + \Theta_QB^{QS}\]
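As a worked example of the general form, take \(p = 1\), \(q = 1\), \(P = 0\), \(Q = 1\), and \(S = 12\) (an ARMA(1,1) with a seasonal MA(1) term). The model equation becomes

\[(1 - \phi_1B)Y_t = (1 + \Theta_1B^{12})(1 + \theta_1B)W_t\]

Multiplying out the right-hand side and moving the autoregressive term across gives

\[Y_t = \phi_1Y_{t-1} + W_t + \theta_1W_{t-1} + \Theta_1W_{t-12} + \theta_1\Theta_1W_{t-13}\]

The seasonal and non-seasonal parts multiply, which is where the cross term at lag 13 comes from.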

Why might you need a Seasonal ARMA model?

If you see strong seasonal autocorrelation in the residuals after you fit a good ARMA model, try a seasonal ARMA.

If strong seasonal autocorrelation (non-zero values at lag \(S\), \(2S\), etc.) drops off after Q seasonal lags, you can fit a seasonal MA(Q) model.

On the other hand, if you see strong seasonal partial autocorrelation that drops off after P seasonal lags, you can fit a seasonal AR(P) model.
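The two patterns described above can be sketched with the theoretical ACF and PACF from base R's ARMAacf(). The coefficient 0.8 and all object names are arbitrary illustrative choices, not estimates from the birth data. A seasonal term at lag 12 is written here as an ordinary MA(12) or AR(12) whose first 11 coefficients are zero.

```r
# Theoretical ACF/PACF for two hypothetical seasonal models with S = 12.
# The coefficient 0.8 is an arbitrary illustrative value, not an estimate from the text.
S <- 12
coef_seasonal <- 0.8

# Seasonal MA(1): Y_t = W_t + 0.8 * W_{t-12}, written as an MA(12) with zeros at lags 1..11
acf_sma <- ARMAacf(ma = c(rep(0, S - 1), coef_seasonal), lag.max = 3 * S)

# Seasonal AR(1): Y_t = 0.8 * Y_{t-12} + W_t, written as an AR(12) with zeros at lags 1..11
ar_coefs <- c(rep(0, S - 1), coef_seasonal)
acf_sar  <- ARMAacf(ar = ar_coefs, lag.max = 3 * S)
pacf_sar <- ARMAacf(ar = ar_coefs, lag.max = 3 * S, pacf = TRUE)

seasonal_lags <- c(S, 2 * S, 3 * S)

# ACF vectors start at lag 0, so lag h sits at position h + 1
round(acf_sma[seasonal_lags + 1], 3)  # seasonal MA(1): nonzero at lag 12 only
round(acf_sar[seasonal_lags + 1], 3)  # seasonal AR(1): decays over lags 12, 24, 36
# The PACF vector starts at lag 1, so lag h sits at position h
round(pacf_sar[seasonal_lags], 3)     # seasonal AR(1): nonzero at lag 12 only
```

In sample ACF and PACF plots the cut-offs will be noisier than these theoretical values, so treat the patterns as a guide rather than a rule.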

# Note the high ACF and PACF at lag 1 year (12 months).
# Look only at lags of 1 year, 2 years, etc.: is there a pattern?
acf2(resid(GoodMod$fit))

# ARMA(1,1) + Seasonal MA(1)
mod.fit5 <- sarima(birthTS$Value, p = 1, d = 0, q = 1, P = 0, D = 0, Q = 1, S = 12, xreg = X)

# ARMA(1,1) + Seasonal AR(1)
mod.fit6 <- sarima(birthTS$Value, p = 1, d = 0, q = 1, P = 1, D = 0, Q = 0, S = 12, xreg = X)

# Compare BIC across the models
GoodMod$BIC
mod.fit6$BIC
mod.fit5$BIC

While the p-values are still not ideal, the model with a spline trend, indicator variables for the month, and ARMA(1,1) + Seasonal AR(1) errors has the lowest BIC, and the SARIMA coefficients are all significantly different from zero.
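A BIC comparison like the one above can be sketched in a self-contained way with base R. This sketch rests on assumptions that are not part of the text: it uses arima() and BIC() from base R rather than sarima(), the built-in AirPassengers series as a stand-in for the birth data, and two hypothetical candidate models that differ only in their seasonal part. Its BIC values are therefore not comparable with the ones reported for the birth models.

```r
# Stand-in monthly series from base R (assumption: not the birth data from the text)
y <- log(AirPassengers)

# Two hypothetical candidates that differ only in the seasonal part
fit_sma <- arima(y, order = c(0, 1, 1),
                 seasonal = list(order = c(0, 1, 1), period = 12))  # seasonal MA(1)
fit_sar <- arima(y, order = c(0, 1, 1),
                 seasonal = list(order = c(1, 1, 0), period = 12))  # seasonal AR(1)

# Collect the BIC values in a named vector; a lower BIC is preferred
bics <- c(seasonal_MA1 = BIC(fit_sma), seasonal_AR1 = BIC(fit_sar))
bics
names(which.min(bics))
```

Keeping the candidates in one named vector makes it easy to add further models to the comparison without changing the selection step.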