Rather than getting too caught up in the complicated abstract issues of long-memory models, we can ask the larger question: what do we mean when we say that we don't think returns are predictable but that there is something predictable about volatility? One can construct various complicated schemes to exploit 'long memory' in financial markets, but it is good to have a simple, concrete model that shows that some sort of long memory really does describe volatility in equity markets. Here is a simple approach that fits the volatility series very well, judging by the standard errors of the model's coefficients: aggregate the volatility by averaging across stocks each day, fit, say, an AR(25) model, and check whether the coefficient standard errors are small compared to the coefficients. I did this with 1899 stocks, averaging each day only over the stocks with return information for that day. One can regard the resulting series as a 'baseline volatility of the market', in the same spirit in which the Sharpe/CAPM framework treats the market return, and call it the 'volatility of the market'. This is not an optimal model yet, because the Box-Ljung test shows the residuals are still autocorrelated:

```
Call:
arima(x = daggvol, order = c(25, 1, 0))

Coefficients:
          ar1      ar2      ar3      ar4      ar5      ar6      ar7      ar8
      -1.6928  -2.2377  -2.6436  -2.9080  -3.0832  -3.1564  -3.1561  -3.1289
s.e.   0.0086   0.0170   0.0257   0.0342   0.0422   0.0495   0.0560   0.0615
          ar9     ar10     ar11     ar12     ar13     ar14     ar15     ar16
      -3.0279  -2.8894  -2.7299  -2.5196  -2.3180  -2.0912  -1.8674  -1.6586
s.e.   0.0660   0.0696   0.0721   0.0737   0.0742   0.0737   0.0722   0.0696
         ar17     ar18     ar19     ar20     ar21     ar22     ar23     ar24
      -1.4225  -1.1577  -0.9010  -0.6922  -0.5205  -0.3642  -0.1918  -0.0912
s.e.   0.0661   0.0615   0.0561   0.0496   0.0422   0.0342   0.0257   0.0170
         ar25
      -0.0211
s.e.   0.0086

sigma^2 estimated as 0.1727:  log likelihood = -7268.06,  aic = 14588.13

Box-Ljung test

data:  fit$residuals
X-squared = 444.3672, df = 50, p-value < 2.2e-16
```
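The construction above can be sketched in R roughly as follows. This is a minimal sketch, not the post's actual code: the return panel is simulated rather than the 1899-stock data, and the particular volatility proxy (log of the cross-sectional mean squared return) is my assumption.

```r
set.seed(1)
ndays <- 2000; nstk <- 50

# Simulated return panel with a persistent common volatility factor
# (a stand-in for the real stock panel used in the post).
logv <- stats::filter(rnorm(ndays, sd = 0.3), 0.98, method = "recursive")
ret  <- matrix(rnorm(ndays * nstk), ndays, nstk) * 0.01 * exp(logv / 2)

# Aggregate volatility: average across stocks each day (na.rm drops
# stocks with no return information on a given day).
daggvol <- log(rowMeans(ret^2, na.rm = TRUE))

# AR(25) on the once-differenced series, as in the output above.
fit <- arima(daggvol, order = c(25, 1, 0))

# Coefficients versus their standard errors ...
tstats <- fit$coef / sqrt(diag(fit$var.coef))

# ... and a check of residual whiteness.
Box.test(fit$residuals, lag = 50, type = "Ljung-Box")
```

On simulated data the exact numbers will differ from the output above; the point is only the shape of the calculation.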

Increasing the autoregressive order to AR(35) reduces the Box-Ljung statistic somewhat, and AR(40) lowers the AIC further, but in both cases the residuals are still quite far from white noise.

```
arima(x = daggvol, order = c(35, 1, 0))

Coefficients:
          ar1      ar2      ar3      ar4      ar5      ar6      ar7      ar8
      -1.7125  -2.2928  -2.7461  -3.0709  -3.3137  -3.4615  -3.5410  -3.5965
s.e.   0.0086   0.0171   0.0262   0.0353   0.0440   0.0523   0.0601   0.0670
          ar9     ar10     ar11     ar12     ar13     ar14     ar15     ar16
      -3.5845  -3.5381  -3.4719  -3.3528  -3.2406  -3.1020  -2.9666  -2.8394
s.e.   0.0734   0.0790   0.0840   0.0883   0.0918   0.0947   0.0970   0.0986
         ar17     ar18     ar19     ar20     ar21     ar22     ar23     ar24
      -2.6787  -2.4801  -2.2766  -2.1090  -1.9616  -1.8100  -1.6188  -1.4672
s.e.   0.0997   0.1001   0.0997   0.0986   0.0969   0.0947   0.0918   0.0883
         ar25     ar26     ar27     ar28     ar29     ar30     ar31     ar32
      -1.3101  -1.1603  -0.9750  -0.8086  -0.6233  -0.4724  -0.3545  -0.2350
s.e.   0.0840   0.0791   0.0735   0.0671   0.0601   0.0524   0.0441   0.0353
         ar33     ar34     ar35
      -0.1286  -0.0504  -0.0193
s.e.   0.0262   0.0171   0.0086

sigma^2 estimated as 0.1687:  log likelihood = -7108.36,  aic = 14288.71

Box-Ljung test

data:  fit$residuals
X-squared = 336.358, df = 50, p-value < 2.2e-16
```

```
arima(x = daggvol, order = c(40, 1, 0))

Coefficients:
          ar1      ar2      ar3      ar4      ar5      ar6      ar7      ar8
      -1.7252  -2.3245  -2.7997  -3.1489  -3.4188  -3.5969  -3.7111  -3.8040
s.e.   0.0086   0.0171   0.0262   0.0354   0.0443   0.0528   0.0607   0.0681
          ar9     ar10     ar11     ar12     ar13     ar14     ar15     ar16
      -3.8317  -3.8277  -3.8059  -3.7325  -3.6623  -3.5660  -3.4700  -3.3834
s.e.   0.0749   0.0811   0.0867   0.0917   0.0960   0.0997   0.1028   0.1054
         ar17     ar18     ar19     ar20     ar21     ar22     ar23     ar24
      -3.2630  -3.1062  -2.9424  -2.8161  -2.7114  -2.602   -2.4486  -2.3292
s.e.   0.1075   0.1091   0.1101   0.1105   0.1105   0.110    0.1091   0.1074
         ar25     ar26     ar27     ar28     ar29     ar30     ar31     ar32
      -2.2019  -2.0803  -1.9204  -1.7726  -1.5992  -1.4493  -1.3219  -1.1801
s.e.   0.1053   0.1028   0.0997   0.0960   0.0918   0.0868   0.0812   0.0750
         ar33     ar34     ar35     ar36     ar37     ar38     ar39     ar40
      -1.0348  -0.8995  -0.7872  -0.6585  -0.5149  -0.3617  -0.2026  -0.0819
s.e.   0.0682   0.0609   0.0529   0.0443   0.0354   0.0262   0.0171   0.0086

sigma^2 estimated as 0.1659:  log likelihood = -6997.09,  aic = 14076.18

Box-Ljung test

data:  fit$residuals
X-squared = 518.3599, df = 120, p-value < 2.2e-16
```
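The three differenced fits can be compared side by side with a small loop. This is a sketch with a simulated stand-in for `daggvol`; note that passing `fitdf = p` makes `Box.test` subtract the fitted AR parameters from the test's degrees of freedom, which the raw df = 50 / df = 120 figures above do not appear to do.

```r
set.seed(1)
# Simulated stand-in for the aggregated volatility series from the post.
daggvol <- as.numeric(arima.sim(list(ar = 0.97), n = 2000))

results <- list()
for (p in c(25, 35, 40)) {
  fit <- arima(daggvol, order = c(p, 1, 0))
  bl  <- Box.test(fit$residuals, lag = 3 * p, type = "Ljung-Box", fitdf = p)
  results[[as.character(p)]] <- c(aic = fit$aic, xsq = unname(bl$statistic))
  cat(sprintf("AR(%d): aic = %.1f, Ljung-Box X-sq = %.1f on %.0f df\n",
              p, fit$aic, bl$statistic, bl$parameter))
}
```

A lower AIC alone does not establish white residuals, which is why the Ljung-Box statistic is reported alongside it.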

In this case it seems that differencing the volatility series is not necessary: Dickey-Fuller tests show that the undifferenced series is already stationary, and in fact the Ljung-Box statistic for an AR(30) model fit to the undifferenced series is smaller than for the differenced AR(40) model above.

```
Call:
arima(x = aggvol, order = c(30, 0, 0))

Coefficients:
         ar1     ar2     ar3     ar4     ar5     ar6     ar7     ar8     ar9
      0.2266  0.0857  0.0877  0.0955  0.0563  0.0719  0.0508  0.0072  0.0553
s.e.  0.0086  0.0088  0.0089  0.0089  0.0089  0.0090  0.0090  0.0090  0.0090
        ar10    ar11    ar12     ar13    ar14     ar15     ar16    ar17    ar18
      0.0253  0.0127  0.0441  -0.0104  0.0237  -0.0038  -0.0112  0.0293  0.0371
s.e.  0.0090  0.0090  0.0090   0.0090  0.0090   0.0090   0.0090  0.0090  0.0090
        ar19     ar20     ar21    ar22    ar23     ar24    ar25    ar26    ar27
      0.0069  -0.032   -0.0178  0.0063  0.0427  -0.0328  0.0112  0.0020  0.0475
s.e.  0.0090   0.009    0.0090  0.0090  0.0090   0.0090  0.0090  0.0089  0.0089
         ar28    ar29     ar30  intercept
      -0.0022  0.0388  -0.0095    -9.3699
s.e.   0.0089  0.0089   0.0086     0.0619

sigma^2 estimated as 0.1589:  log likelihood = -6707.81,  aic = 13479.62

Box-Ljung test

data:  fit$residuals
X-squared = 422.1478, df = 120, p-value < 2.2e-16
```
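The stationarity check can be reproduced with an augmented Dickey-Fuller test; one common choice is `adf.test` from the `tseries` package (an assumed choice of tooling, since the post does not show the call). A sketch, again with a simulated stand-in for `aggvol`:

```r
library(tseries)  # assumed package; provides adf.test()

set.seed(1)
# Simulated stand-in for the undifferenced aggregate volatility series.
aggvol <- as.numeric(arima.sim(list(ar = 0.97), n = 2000))

# Null hypothesis of adf.test: the series has a unit root. A small
# p-value favours treating the series as stationary, i.e. d = 0 below.
adf.test(aggvol)

# AR(30) on the level of the series, matching the output above.
fit <- arima(aggvol, order = c(30, 0, 0))
Box.test(fit$residuals, lag = 120, type = "Ljung-Box", fitdf = 30)
```

With `d = 0` the model also estimates an intercept, which is why the output above reports 31 parameters rather than 30.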
