STAT 4280 Final Project
Andrew Benecchi
12/7/2020
Data Set Title
## Feather River monthly flow rate in cubic meters/s at Oroville, California, Oct.1902 – Sep. 1961
About the Feather River
The Feather River is the principal tributary of the Sacramento River, which discharges into Suisun Bay, a tidal estuary of the San Francisco Bay Area. According to the Time Series Data Library, the measurements in this series were taken between 10/1902 and 09/1977, but the data appear to be complete only through 09/1961, the same year in which construction on the Oroville Dam began in earnest. The partially completed Oroville Dam mitigated the local impact of the Pacific Northwest-wide Christmas floods of 1964, which produced a peak flow of 7100 \(\frac{m^{3}}{s}\). The Oroville Dam is the tallest dam in the US, at 235 m in height, and uses 480 \(\frac{m^{3}}{s}\) of its flow limit of 4200 \(\frac{m^{3}}{s}\) to generate power.
Time series of data
The data need a variable transformation to stabilize the variance; a Box-Cox analysis is used to choose the \(\lambda\) that stabilizes the variance as much as possible.
Box-Cox Transformation
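As a minimal sketch of this step (assuming the monthly series is stored in a `ts` object named `feather`, a hypothetical name), \(\lambda\) can be estimated with the `forecast` package. The report's fitted value, \(\lambda = -0.36\), was obtained under an autoregressive assumption, so this sketch only approximates that step:

```r
# Minimal sketch: estimate and apply the Box-Cox lambda.
# `feather` is a hypothetical name for the monthly flow ts object.
library(forecast)

lambda <- BoxCox.lambda(feather, method = "loglik")  # profile-likelihood estimate
feather_bc <- BoxCox(feather, lambda)                # (x^lambda - 1) / lambda
```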
Analysis Techniques
The data are split into training (n = 672) and testing (n = 36) sets for most graphical comparisons of prediction errors, though different training-testing splits are needed to evaluate how consistently each model performs. SARIMA models are chosen by analyzing the periodogram together with the ACF and PACF graphs of the year-over-year \((1-B^{12})\) and differenced year-over-year \((1-B)(1-B^{12})\) series. Quality of the SARIMA models is evaluated by comparing AIC, BIC, RMSE, MAPE, and MRPE, the last of which is calculated as follows:
\[ \frac{1}{t_{max}} \sum_{i=1}^{t_{max}}\frac{|Prediction_{i} - Actual_{i}|}{Actual_{i}} \]
Once a final candidate SARIMA model is found, the best models obtainable through LOWESS decomposition and exponential smoothing are identified. The LOWESS models are compared by RMSE across different variable transformations (with RMSE representing the gap between the actual data and the model's detransformed seasonal + trend parts), and the predictions are compared across four forecast types (ETS, ARIMA, naive (i.e., no evolution in predicted points), and random walk with drift), yielding 12 possible models that may provide the best MAPE for a given testing length. The exponential smoothing methods are compared by RMSE and MAPE across variable transformations and by seasonality calculation method (i.e., additive or multiplicative). The best models found through triple exponential smoothing and through decomposition (classical and locally-weighted least-squares) are then compared to the SARIMA model.
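Below is a minimal sketch of this setup, assuming the transformed series `feather_bc` from the earlier sketch; `bftrn` matches the training-set name that appears in the diagnostic output later, while `bftst` and the metric helpers are hypothetical names:

```r
# Training/testing split: 672 training months (10/1902-09/1958) and the
# remaining 36 months (10/1958-09/1961) for testing.
bftrn <- window(feather_bc, end = c(1958, 9))
bftst <- window(feather_bc, start = c(1958, 10))

# Error metrics used to rank models; mrpe follows the formula above,
# and mape is taken here to be the same quantity on a percent scale.
rmse <- function(pred, actual) sqrt(mean((pred - actual)^2))
mape <- function(pred, actual) 100 * mean(abs(pred - actual) / actual)
mrpe <- function(pred, actual) mean(abs(pred - actual) / actual)
```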
Analyzing the transformed data
The data appear to have a periodic nature, so analyzing the ACF, PACF, and periodogram is necessary to develop a candidate model.
ACF and PACF of transformed data
Analysis of ACF, PACF
The ACF has a persistent sinusoidal pattern, with peaks at lags of annual recurrence and troughs lagged six months behind, similar to a cosine wave with weak decay. The PACF has a dampened sinusoidal pattern, similar to \(-e^{-kx}\sin\left(\frac{\pi x}{6}\right)\), except that the first partial autocorrelation is positive because \(\phi_{11}=\rho(X_{t},X_{t-1})\).
Periodogram of Data
The periodogram has a strong spike at annual recurrence, so investigating the annual differences is necessary.
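A minimal sketch of the periodogram computation, assuming the transformed training series `bftrn` from above; for a monthly `ts` with frequency 12, the annual cycle appears at frequency 1 (one cycle per year):

```r
# Raw periodogram of the transformed training data; the spike at
# frequency 1 (one cycle per year) marks the annual recurrence.
spec.pgram(bftrn, log = "no", main = "Periodogram of transformed data")
```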
Year-over-year changes
ACF, PACF of Annually-Differenced Data
Analysis of year-over-year change
The ACF exhibits a dampened oscillation that crosses \(\rho=0\) every 12 lags, around \(l=6+12k\), with a strong trough at \(l=12\). There may be an integrated term in the non-seasonal part, the seasonal part, both, or neither. If there is no seasonal integration, then the seasonal part may take an \(ARMA\) form with a large seasonal autoregressive coefficient. The PACF has significant lags at \(12k\) and \(12k+1\) with opposite signs that dampen as \(k\) increases; within each family, the \(\phi\) at \(l=12k\) and \(l=12k+1\) decay slowly without changing sign. The seasonal part includes a seasonal moving average term, and a non-seasonal autoregressive term, if present, would explain the opposing signs of the \(\phi\) at \(l=12k\) and \(l=12k+1\).
Inspecting the monthly change in year-over-year change
Analysis of monthly change in YoY change
There is a strong negative spike at \(l=12\) in the ACF that cuts off, with smaller significant negative lags at \(l=1,2\). The PACF has negative spikes at \(l=12k\) that taper off without changing sign. There is a strong case for conjecturing that the model has a seasonal moving average term, as that explains the taper in the PACF's lag spikes, while a non-seasonal autoregressive term would explain the significant ACF lags at \(l=1,2\).
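A minimal sketch of the differenced series inspected in these two subsections, using `bftrn` from the earlier sketch:

```r
# Year-over-year change: (1 - B^12) X_t
yoy <- diff(bftrn, lag = 12)
# Monthly change in the year-over-year change: (1 - B)(1 - B^12) X_t
dyoy <- diff(yoy)

acf(dyoy, lag.max = 48)
pacf(dyoy, lag.max = 48)
```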
The parameter combinations explored are:
\[ SARIMA(0 \leq p \leq 1, 0 \leq d \leq 1, 0\leq q \leq 2)\times(0 \leq P \leq 1,0 \leq D \leq 1, Q=1)_{s=12} \]
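A minimal sketch of this search, assuming `bftrn` as before; each combination is fit with `arima` and ranked by AIC:

```r
# Enumerate the stated parameter ranges and record each fit's AIC.
grid <- expand.grid(p = 0:1, d = 0:1, q = 0:2, P = 0:1, D = 0:1, Q = 1)
grid$AIC <- apply(grid, 1, function(g) {
  fit <- try(arima(bftrn, order = g[c("p", "d", "q")],
                   seasonal = list(order = g[c("P", "D", "Q")], period = 12)),
             silent = TRUE)
  if (inherits(fit, "try-error")) NA else AIC(fit)
})
head(grid[order(grid$AIC), ])  # best combinations by AIC
```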
Candidate models:
1. Seasonally-integrated SARIMA (\(AR(1) \times SIMA(1,1)_{12}\))
2. Seasonally-integrated SARIMA (\(ARMA(1,2) \times SIMA(1,1)_{12}\))
3. Integrated and seasonally-integrated SARIMA (\(ARIMA(1,1,1) \times SIMA(1,1)_{12}\))
4. Seasonal ARMA (\(AR(1) \times SARMA(1,1)_{12}\))
5. ARIMA with seasonal part (\(ARIMA(1,1,1)\times SARMA(1,1)_{12}\))
Why each? Each has the best AIC within its subtype:
1. Best seasonally-integrated AR
2. Best seasonally-integrated ARMA
3. Best integrated, seasonally-integrated ARMA
4. Best non-integrated model
5. Best integrated model with seasonal part
Model Coefficients
##
## Call:
## arima(x = ., order = c(1, 0, 0), seasonal = list(order = c(0, 1, 1), period = 12))
##
## Coefficients:
## ar1 sma1
## 0.674 -0.9532
## s.e. 0.029 0.0256
##
## sigma^2 estimated as 0.005573: log likelihood = 761.53, aic = -1519.06
##
## Call:
## arima(x = ., order = c(1, 0, 2), seasonal = list(order = c(0, 1, 1), period = 12))
##
## Coefficients:
## ar1 ma1 ma2 sma1
## 0.8083 -0.1733 -0.1204 -0.9507
## s.e. 0.0481 0.0650 0.0544 0.0236
##
## sigma^2 estimated as 0.005528: log likelihood = 764.55, aic = -1521.11
##
## Call:
## arima(x = ., order = c(1, 1, 1), seasonal = list(order = c(0, 1, 1), period = 12))
##
## Coefficients:
## ar1 ma1 sma1
## 0.6419 -0.9797 -0.9631
## s.e. 0.0334 0.0115 0.0241
##
## sigma^2 estimated as 0.005549: log likelihood = 758.45, aic = -1510.89
##
## Call:
## arima(x = ., order = c(1, 0, 0), seasonal = list(order = c(1, 0, 1), period = 12),
## include.mean = T)
##
## Coefficients:
## ar1 sar1 sma1 intercept
## 0.6795 0.9986 -0.9541 2.2496
## s.e. 0.0286 0.0017 0.0266 0.0582
##
## sigma^2 estimated as 0.005565: log likelihood = 777.39, aic = -1546.78
##
## Call:
## arima(x = ., order = c(1, 1, 1), seasonal = list(order = c(1, 0, 1), period = 12))
##
## Coefficients:
## ar1 ma1 sar1 sma1
## 0.6574 -0.9862 0.9990 -0.9597
## s.e. 0.0318 0.0083 0.0013 0.0244
##
## sigma^2 estimated as 0.00555: log likelihood = 775.7, aic = -1543.4
The 4th candidate model provides the lowest (best) AIC, but other analysis methods are necessary for more confidence in the model selection process.
Comparing year-over-year ACFs
Comparing year-over-year PACFs
Comparing differenced year-over-year change ACFs
Comparing differenced year-over-year change PACFs
The MA-term decay is captured in the PACFs of both the year-over-year \(\Delta\) and the monthly \(\Delta\) in the year-over-year \(\Delta\), but the previously mentioned variability in \(\rho\) and \(\phi\) at later lags cannot be captured by any SARIMA model. To make the model-selection process more robust, different training-testing splits should be tested to understand how each model's quality evolves over time; similar methods should be applied to the model selection for the other forecasting approaches.
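The following is a minimal sketch of such a rolling comparison, under assumptions: `feather_bc` and the `mrpe` helper come from the earlier sketches, the candidate orders are those listed above, and each testing set is taken to be everything after the training window:

```r
# Refit each candidate for a range of training lengths (in years) and
# record AIC, BIC, and MRPE on the held-out remainder.
candidates <- list(
  list(order = c(1, 0, 0), seasonal = c(0, 1, 1)),
  list(order = c(1, 0, 2), seasonal = c(0, 1, 1)),
  list(order = c(1, 1, 1), seasonal = c(0, 1, 1)),
  list(order = c(1, 0, 0), seasonal = c(1, 0, 1)),
  list(order = c(1, 1, 1), seasonal = c(1, 0, 1))
)

scores <- lapply(31:58, function(yrs) {
  trn <- window(feather_bc, end = c(1902 + yrs, 9))
  tst <- window(feather_bc, start = c(1902 + yrs, 10))
  sapply(candidates, function(cm) {
    fit <- arima(trn, order = cm$order,
                 seasonal = list(order = cm$seasonal, period = 12))
    prd <- predict(fit, n.ahead = length(tst))$pred
    c(AIC = AIC(fit), BIC = BIC(fit), MRPE = mrpe(prd, tst))
  })
})
```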
Best Model Selector for SARIMA
## user system elapsed
## 0.05 0.10 7.35
##    training.years which.min.AIC which.min.BIC which.min.RMSE which.min.MAPE which.min.MRPE
## 1              31             4             4              5              1              4
## 2              32             4             4              5              1              4
## 3              33             4             4              5              2              4
## 4              34             4             4              5              2              4
## 5              35             4             4              5              2              4
## 6              36             4             4              5              2              4
## 7              37             4             4              4              3              4
## 8              38             4             4              4              1              4
## 9              39             4             4              4              2              4
## 10             40             4             4              4              3              4
## 11             41             4             4              4              2              4
## 12             42             4             4              4              1              4
## 13             43             4             4              4              1              4
## 14             44             4             4              4              3              4
## 15             45             4             4              4              3              4
## 16             46             4             4              4              3              4
## 17             47             4             4              4              3              4
## 18             48             4             4              4              1              4
## 19             49             4             4              4              2              4
## 20             50             4             4              4              3              4
## 21             51             4             4              4              3              4
## 22             52             4             4              4              2              4
## 23             53             4             4              4              4              4
## 24             54             4             4              4              3              4
## 25             55             4             4              4              2              4
## 26             56             4             4              4              2              4
## 27             57             4             4              4              4              4
## 28             58             4             4              4              2              2
Model 4 had the best AIC and BIC across all splits; models 4 and 5 shared the best RMSE; model 4 was outclassed in MAPE by the first three models; and model 4 had the best MRPE except when the testing period was 12 months, which is a relatively inadequate testing length.
Plotting RMSE, MAPE, MRPE
## cm1AICX cm2AICX cm3AICX cm4AICX cm5AICX cm1BICX cm2BICX cm3BICX
## 4.000000 3.000000 5.000000 1.000000 2.000000 3.000000 4.000000 5.000000
## cm4BICX cm5BICX cm1RMSE cm2RMSE cm3RMSE cm4RMSE cm5RMSE cm1MAPE
## 1.000000 2.000000 4.000000 3.000000 5.000000 1.214286 1.785714 2.107143
## cm2MAPE cm3MAPE cm4MAPE cm5MAPE cm1MRPE cm2MRPE cm3MRPE cm4MRPE
## 2.000000 3.607143 3.464286 3.821429 3.107143 3.035714 5.000000 1.035714
## cm5MRPE
## 2.821429
Predicted vs. Actual, 5 candidate models
Candidate model chosen
\(SARMA(1,0)\times(1,1)_{s=12}\) with mean
\((1-\phi B)(1-\Phi B^{12})(\frac{X_{t}^{\lambda}-1}{\lambda}-\mu)=(1+\Theta B^{12})Z_{t}, \lambda=-0.36, Z_{t}\sim N(0,\sigma^{2})\)
For 10/1902-09/1958, \(\phi=0.6795, \Phi=0.9986, \Theta=-0.9541, \mu=2.2496, \sigma^{2}=5.565\times 10^{-3}\)
Model Diagnostics
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 6.556113e-03 8.892540e-04 7.372599 4.951536e-13
## t -2.945301e-06 2.289459e-06 -1.286462 1.987263e-01
## n w/ smallest p p value
## 1.0000000 0.6049604
##
## Augmented Dickey-Fuller Test
##
## data: diff(bftrn, lag = 12)
## Dickey-Fuller = -8.9017, Lag order = 8, p-value = 0.01
## alternative hypothesis: stationary
There is no secular trend in the squared residuals, which indicates a lack of residual heteroscedasticity (though this neither confirms nor denies whether autoregressive conditional heteroscedasticity exists within the residuals); the insignificance of the Ljung-Box test implies the model is likely adequate. The Augmented Dickey-Fuller test on the annually differenced, Box-Cox transformed data provides evidence that the annually differenced data are stationary, which is consistent with the lack of an integrated term in the candidate model.
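A minimal sketch of these diagnostics, assuming the chosen fit is stored as `fit4` (a hypothetical name) and using the `tseries` package for the ADF test:

```r
library(tseries)

res <- residuals(fit4)

# Secular trend in the squared residuals (heteroscedasticity check).
summary(lm(res^2 ~ t, data = data.frame(res = res, t = seq_along(res))))

# Ljung-Box adequacy check (fitdf = 3 for the ar1, sar1, sma1 coefficients).
Box.test(res, lag = 24, type = "Ljung-Box", fitdf = 3)

# Stationarity of the annually differenced transformed training data.
adf.test(diff(bftrn, lag = 12))
```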
Model diagnostic plots
There is no apparent heteroscedasticity in the residual plot, and there are no significant residual lags, so model 4 likely captures the AR/MA structure. The Ljung-Box test p-values are insignificant and increasing, indicating that the model is likely adequate.
Predicted vs. actual, 10/1958-09/1961 with PIs
##
## Call:
## arima(x = ., order = c(1, 0, 0), seasonal = list(order = c(1, 0, 1), period = 12),
## include.mean = T)
##
## Coefficients:
## ar1 sar1 sma1 intercept
## 0.6807 0.9989 -0.9587 2.2496
## s.e. 0.0277 0.0015 0.0257 0.0578
##
## sigma^2 estimated as 0.005446: log likelihood = 826.8, aic = -1645.6
Final Model (if SARMA is chosen)
\((1-\phi B)(1-\Phi B^{12})(\frac{X_{t}^{\lambda}-1}{\lambda}-\mu)=(1+\Theta B^{12})Z_{t}, Z_{t}\sim N(0,\sigma^{2})\)
\(\lambda=-0.36, \sigma^{2}=5.446\times 10^{-3}, \mu=2.2496\)
\(\phi=0.6807, \Phi=0.9989, \Theta=-0.9587\)
Future Predictions for SARMA
Limitations of SARMA
There are possible ARCH/GARCH effects, so it is helpful to investigate the ACF and PACF of the squared residuals and of the squared residuals differenced at lag 12.
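A minimal sketch of these plots, using the residuals of the final fit (`fit_final` is a hypothetical name):

```r
# Squared residuals and their lag-12 difference, screened for ARCH/GARCH.
r2 <- residuals(fit_final)^2
acf(r2, lag.max = 48)
pacf(r2, lag.max = 48)
acf(diff(r2, lag = 12), lag.max = 48)
```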
ACF, PACF of squared residuals
ACF of differenced squared residuals lag 12
Analysis of ARCH plots
The ACF and PACF of the squared residuals are cyclical but barely significant; annual differencing reveals a strong negative autocorrelation at \(l=12\) that cuts off, and a negative partial autocorrelation at \(l=12k\) that tapers. The squared residuals have an ACF similar to a \(SIMA(0,0)\times(1,1)_{s=12}\) process, which may indicate GARCH effects.
Additive and multiplicative decomposition
Classical decomposition shows that the trend term neither fans out nor drifts in the positive or negative direction, and the random part does not appear biased towards positive or negative residuals, though there may be volatility clustering.
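A minimal sketch of the two classical decompositions, using the untransformed series `feather` (hypothetical name):

```r
# Additive and multiplicative classical decompositions via moving averages.
plot(decompose(feather, type = "additive"))
plot(decompose(feather, type = "multiplicative"))
```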
LOWESS decomposition
For LOWESS decomposition, because the Box-Cox fit for \(\lambda\) was based on an autoregressive assumption, models fitted to the untransformed and log-transformed data should also be compared.
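A minimal sketch of the STL fit and the four forecast types compared here, assuming an untransformed training series `ftrn` (hypothetical name); `forecast::forecast.stl` supports exactly these four methods:

```r
library(forecast)

# STL (LOWESS) decomposition of the log-transformed training series.
fit_stl <- stl(log(ftrn), s.window = "periodic")

# Forecast the remainder with each of the four methods compared in the text.
fc_ets   <- forecast(fit_stl, method = "ets",     h = 36)
fc_arima <- forecast(fit_stl, method = "arima",   h = 36)
fc_naive <- forecast(fit_stl, method = "naive",   h = 36)
fc_rwd   <- forecast(fit_stl, method = "rwdrift", h = 36)
```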
## linearRMSE logRMSE
## 4 24
## ets.linearMAPE ets.bxcxMAPE arima.linearMAPE arima.logMAPE
## 3 1 10 2
## arima.bxcxMAPE naive.linearMAPE naive.logMAPE rwdrift.logMAPE
## 4 5 2 1
Plotting MAPE over time
Assuming that the remainder part follows an ARIMA model minimized the MAPE, and using a logarithmic variable transformation minimized both the RMSE and the MAPE. RMSE was calculated as the standard error of the differences between the actual data points and the detransformed STL fit (seasonal plus trend, excluding the random part).
Plotting forecasting for 36 months ahead
## SARIMA MAPE ETS MAPE ARIMA MAPE Naive MAPE RWD MAPE
## 1 69.45628 70.73833 68.76745 76.91412 77.74366
## integer(0)
For a 36-month horizon, the ARIMA + log STL method of forecasting performed better than the original SARMA model from the first part. Further comparison across different time scales should be conducted to resolve the ambiguity in model strength; that comparison is deferred to the final section.
ETS
Because the data express strong seasonality and lack a secular trend, the ideal method is triple exponential smoothing. For the Box-Cox and logarithmic transformations, the prediction interval grows so wide that cataclysmic droughts and floods fall within the 95% prediction interval.
The additive model on the untransformed (linear) data produces negative predicted values, so it can be ignored, as the data should always be nonnegative.
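A minimal sketch of the smoothing fits compared here, using `stats::HoltWinters` (the series name `ftrn` is hypothetical, as above):

```r
# Multiplicative seasonality on untransformed data vs. additive on logs.
hw_mul <- HoltWinters(ftrn, seasonal = "multiplicative")
hw_add <- HoltWinters(log(ftrn), seasonal = "additive")

# 36-month forecasts with 95% prediction intervals.
predict(hw_mul, n.ahead = 36, prediction.interval = TRUE, level = 0.95)
```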
## lin.mul.RMSE
## 28
## < table of extent 0 >
## lin.mul.MAPE log.add.MAPE
## 1 68.35443 75.67659
Triple exponential smoothing with a multiplicative seasonal part on untransformed data results in the lowest RMSE, though the logarithmic-additive fit produced similar RMSEs for all training splits. The Box-Cox transformation resulted in predictions that exceeded the bounds of the transformation's domain, so the MAPE comparison was limited to the two best RMSE fits; logarithmic-additive triple exponential smoothing is thus the best means of using exponential smoothing to predict the data.
Final Comparison
The MAPE of the ETS-derived models over 36 months of testing is significantly worse than the MAPE of the other methods, so the STL and SARMA models are most likely the best.
## sc1.cm4MAPE dck.arima.logMAPE
## 1 84.69805 77.86466
## 2 90.02307 83.01401
## 3 94.50622 87.24115
## 4 90.93250 83.76490
## 5 88.62440 81.76725
## 6 89.95994 84.61783
## 7 78.02234 74.30514
## 8 83.41159 77.39139
## 9 78.50312 73.84280
## 10 73.46138 70.15435
## 11 68.06453 68.71800
## 12 66.90419 70.13064
## 13 69.62399 69.14953
## 14 71.42231 70.81762
## 15 72.93750 71.36184
## 16 77.37083 73.06482
## 17 78.72947 75.24696
## 18 85.02636 81.17384
## 19 88.27044 85.72201
## 20 85.17240 80.83737
## 21 72.09211 69.90177
## 22 71.36616 70.41919
## 23 77.74756 78.03116
## 24 85.11425 78.75428
## 25 63.96372 64.60988
## 26 69.45628 68.76745
## 27 40.71886 63.48764
## 28 42.69961 50.96124
Across the set of testing lengths, the original Box-Cox transformed SARMA model is generally outperformed in MAPE by the log-transformed LOWESS model, but the STL model is itself significantly outperformed in short-term prediction. For long-term prediction it may be wise to use STL, while for short-term prediction SARMA may be preferable; notably, even the STL approach favors an ARIMA model for forecasting its remainder.
Conclusion
The final model is thus:
\((1-\phi B)(1-\Phi B^{12})(\frac{X_{t}^{\lambda}-1}{\lambda}-\mu)=(1+\Theta B^{12})Z_{t}, Z_{t}\sim N(0,\sigma^{2})\)
\(\lambda=-0.36, \sigma^{2}=5.446\times 10^{-3}, \mu=2.2496\)
\(\phi=0.6807, \Phi=0.9989, \Theta=-0.9587\)
As mentioned earlier, the primary issue with the SARMA model is the possible GARCH effects in its residuals. Fitting a GARCH model is outside the scope of this report and may induce overfitting.