## THE CORE OF WHAT IS PREDICTABLE IN THE EQUITY MARKETS AND HOW THE ARFIMA MODEL IS JUST NOT RIGHT FOR IT

What was Benoit Mandelbrot onto when he brought Hurst's long-memory ideas into finance in the late 1960s? Long memory does imply some sort of predictability, but we know that it lies in volatility, not in price. Here is probably the most concrete manifestation of that predictability: we average log-volatility, defined as $\log(r_t^2)$, across around 1900 stocks, take first differences of the resulting series, and fit an autoregressive AR(25) model. What is interesting in the fit is the tightness of the standard errors around the coefficients, followed by a reasonable quality of fit: the Ljung-Box tests do not reject at the 5% level up to lag 23, and only at lag 24 does the p-value drop to 0.026.
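To fix notation, the setup can be written out as below. The symbols $S_t$ (the set of stocks with a valid return on day $t$), $N_t$, $v_t$, $x_t$, $\phi_k$, $\mu$ and $\varepsilon_t$ are our own labels and do not appear in the R output. The $10^{-6}$ offset is the one used in the code for this analysis to keep the logarithm finite, and $\mu$ is what R's `arima` reports as `intercept`.

$$
\begin{aligned}
v_t &= \frac{1}{N_t} \sum_{i \in S_t} \log\left(r_{i,t}^2 + 10^{-6}\right), \qquad N_t = |S_t|, \\
x_t &= v_t - v_{t-1}, \\
x_t - \mu &= \sum_{k=1}^{25} \phi_k \left(x_{t-k} - \mu\right) + \varepsilon_t .
\end{aligned}
$$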

Call:
arima(x = daggvol, order = c(25, 0, 0))

Coefficients:
          ar1      ar2      ar3      ar4      ar5      ar6      ar7      ar8
      -0.7690  -0.6770  -0.5880  -0.4868  -0.4283  -0.3528  -0.3016  -0.2931
s.e.   0.0087   0.0109   0.0124   0.0134   0.0140   0.0145   0.0148   0.0150
          ar9     ar10     ar11     ar12     ar13     ar14     ar15     ar16
      -0.2348  -0.2050  -0.1900  -0.1436  -0.1518  -0.1273  -0.1265  -0.1361
s.e.   0.0152   0.0153   0.0154   0.0154   0.0154   0.0154   0.0154   0.0153
         ar17     ar18     ar19     ar20     ar21     ar22     ar23     ar24
      -0.1019  -0.0599  -0.0499  -0.0758  -0.0878  -0.0738  -0.0204  -0.0439
s.e.   0.0152   0.0150   0.0148   0.0145   0.0140   0.0134   0.0124   0.0109
         ar25  intercept
      -0.0209      0e+00
s.e.   0.0087      5e-04

sigma^2 estimated as 0.162:  log likelihood = -6835.13,  aic = 13724.26

Box-Ljung test

data:  fit\$residuals
X-squared = 0.357, df = 1, p-value = 0.5502

Box-Ljung test

data:  fit\$residuals
X-squared = 0.6231, df = 5, p-value = 0.9869

Box-Ljung test

data:  fit\$residuals
X-squared = 1.4813, df = 10, p-value = 0.999

Box-Ljung test

data:  fit\$residuals
X-squared = 12.8747, df = 20, p-value = 0.8827

Box-Ljung test

data:  fit\$residuals
X-squared = 17.5686, df = 21, p-value = 0.6761

Box-Ljung test

data:  fit\$residuals
X-squared = 23.5535, df = 22, p-value = 0.3711

Box-Ljung test

data:  fit\$residuals
X-squared = 30.7678, df = 23, p-value = 0.1286

Box-Ljung test

data:  fit\$residuals
X-squared = 39.2385, df = 24, p-value = 0.02578
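As a cross-check on how to read these blocks, the statistic can be reproduced by hand. The sketch below uses a simulated white-noise series (`x`, `h` and the seed are placeholders of ours, not the data analysed here) and the standard Ljung-Box formula $Q = n(n+2)\sum_{k=1}^{h}\hat\rho_k^2/(n-k)$, referred to a chi-squared distribution with $h$ degrees of freedom. Note that the reported `df` equals the lag in every block, which means the degrees of freedom were not reduced for the 25 fitted AR coefficients (the `fitdf` argument of `Box.test`), so the p-values are best read as indicative.

```r
# Reproduce Box.test(..., type = "Ljung-Box") by hand on simulated data.
set.seed(1)
x <- rnorm(500)   # placeholder series standing in for model residuals
h <- 10           # number of lags tested
n <- length(x)

# Sample autocorrelations at lags 1..h (drop lag 0, which is always 1).
rho <- as.numeric(acf(x, lag.max = h, plot = FALSE)$acf)[-1]

# Ljung-Box statistic and its chi-squared p-value with h degrees of freedom.
Q <- n * (n + 2) * sum(rho^2 / (n - seq_len(h)))
p <- pchisq(Q, df = h, lower.tail = FALSE)

# Built-in version for comparison.
bt <- Box.test(x, lag = h, type = "Ljung-Box")
print(c(by_hand = Q, box_test = unname(bt$statistic)))
print(c(by_hand = p, box_test = unname(bt$p.value)))
```

The two rows printed should agree to numerical precision; a small p-value means the residuals still carry autocorrelation up to lag `h`.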

Now we fit the long-memory ARFIMA model to the same volatility series. It is more parsimonious in parameters (the fitted model has $d \approx 5.9 \times 10^{-5}$, one AR term and two MA terms), but it fails to remove the correlations in the residuals: the Box-Ljung test already rejects at the 5% level at lag 5 (p = 0.021) and overwhelmingly by lag 10 (p = 0.0002). What is happening here is that the long-memory correlation structure assumed by ARFIMA does not match that of the data, so the ARFIMA residuals remain correlated. This is a fairly significant problem, since in a sense ARFIMA built its reputation on volatility modeling.

[1] "arfima fitted parameters"
[1] 5.934141e-05
       ar1
0.05780775
       ma1        ma2
0.82995621 0.04315417

Box-Ljung test

data:  longmemofit\$residuals
X-squared = 0.5242, df = 1, p-value = 0.469

Box-Ljung test

data:  longmemofit\$residuals
X-squared = 13.2274, df = 5, p-value = 0.02134

Box-Ljung test

data:  longmemofit\$residuals
X-squared = 33.4131, df = 10, p-value = 0.0002321

Box-Ljung test

data:  longmemofit\$residuals
X-squared = 113.7028, df = 20, p-value = 4.108e-15

Box-Ljung test

data:  longmemofit\$residuals
X-squared = 137.1573, df = 21, p-value < 2.2e-16

Box-Ljung test

data:  longmemofit\$residuals
X-squared = 143.7774, df = 22, p-value < 2.2e-16

Box-Ljung test

data:  longmemofit\$residuals
X-squared = 148.3861, df = 23, p-value < 2.2e-16

Box-Ljung test

data:  longmemofit\$residuals
X-squared = 181.226, df = 24, p-value < 2.2e-16
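To see what correlation structure ARFIMA imposes, recall that the fractional difference operator $(1-B)^d$ expands into autoregressive weights $\pi_k = \pi_{k-1}(k-1-d)/k$ with $\pi_0 = 1$, which decay hyperbolically rather than cutting off. The sketch below tabulates the implied AR coefficients $-\pi_k$ for two values of $d$. The helper name `fracdiff_weights` and the values of `d` are illustrative choices of ours, not fitted quantities. One thing that may be worth checking against the fit above: the estimated $d$ is essentially zero, and the series being modelled is already first-differenced, which for a long-memory level series would correspond to a negative $d$.

```r
# AR(infinity) weights of the fractional difference operator (1 - B)^d.
# fracdiff_weights is a hypothetical helper name used only for this sketch.
fracdiff_weights <- function(d, nlags) {
  w <- numeric(nlags + 1)
  w[1] <- 1                              # pi_0 = 1
  for (k in seq_len(nlags)) {
    w[k + 1] <- w[k] * (k - 1 - d) / k   # pi_k = pi_{k-1} * (k - 1 - d) / k
  }
  w
}

# Writing the model as x_t = sum_k phi_k x_{t-k} + e_t gives phi_k = -pi_k.
# The d values below are illustrative assumptions, not estimates from the data.
for (d in c(0.3, -0.7)) {
  phi <- -fracdiff_weights(d, 25)[-1]
  cat(sprintf("d = %4.1f: phi_1 = %7.4f, phi_5 = %7.4f, phi_25 = %7.4f\n",
              d, phi[1], phi[5], phi[25]))
}
```

For a negative $d$ above $-1$ the implied coefficients are all negative and shrink slowly in magnitude, the same sign pattern as the fitted AR(25) coefficients; whether the decay rates agree is what the residual tests are probing.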

The code for this analysis follows. It expects a price matrix `P`, with dates in rows, to be loaded already.

library(e1071)
library(forecast)
library(tseries)

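ntimes<-dim(P)[1]      # number of dates (rows of P)
nassets<-dim(P)[2]-1

numnonna<-rep(0,ntimes)   # number of valid observations per date
aggvol<-rep(0,ntimes)     # running sum, later the mean, of log squared returns

indivvol1<-rep(0,ntimes)  # log squared returns of two individual columns
indivvol2<-rep(0,ntimes)
for (k in 2:nassets){
  r<-diff(log(P[,k]))     # log returns of column k
  for (j in 2:ntimes){
    if (!is.na(r[j-1])){
      v<-log(r[j-1]^2+1e-6)   # small offset keeps the log finite when r = 0
      if (abs(v)<100){
        numnonna[j]<- numnonna[j]+1
        aggvol[j]<-aggvol[j] + v
        if (k==345 ){
          indivvol1[j]<-v
        }
        if (k==1354){
          indivvol2[j]<-v
        }
      }
    }
  }
}

dindivvol1<-diff(indivvol1)
dindivvol2<-diff(indivvol2)

for (j in 1:length(aggvol)){
  aggvol[j]<-aggvol[j]/numnonna[j]   # cross-sectional mean; 0/0 gives NaN
}

aggvol[is.na(aggvol)]<-log(1e-6)     # dates with no valid data get the floor value
indivvol1[is.na(indivvol1)]<-log(1e-6)
indivvol2[is.na(indivvol2)]<-log(1e-6)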

daggvol<-diff(aggvol)
write.csv(daggvol,'daggvol.csv')

fit<-arima( daggvol, order=c(25,0,0))
print(fit)

Box.test( fit\$residuals,lag=1, type="Ljung-Box")
Box.test( fit\$residuals,lag=5, type="Ljung-Box")
Box.test( fit\$residuals,lag=10, type="Ljung-Box")
Box.test( fit\$residuals,lag=20, type="Ljung-Box")
Box.test( fit\$residuals,lag=21, type="Ljung-Box")
Box.test( fit\$residuals,lag=22, type="Ljung-Box")
Box.test( fit\$residuals,lag=23, type="Ljung-Box")
Box.test( fit\$residuals,lag=24, type="Ljung-Box")

longmemofit<-arfima(daggvol,estim="mle",max.order=50)
print('arfima fitted parameters')
print(longmemofit\$d)
print(longmemofit\$ar)
print(longmemofit\$ma)

Box.test( longmemofit\$residuals,lag=1, type="Ljung-Box")
Box.test( longmemofit\$residuals,lag=5, type="Ljung-Box")
Box.test( longmemofit\$residuals,lag=10, type="Ljung-Box")
Box.test( longmemofit\$residuals,lag=20, type="Ljung-Box")
Box.test( longmemofit\$residuals,lag=21, type="Ljung-Box")
Box.test( longmemofit\$residuals,lag=22, type="Ljung-Box")
Box.test( longmemofit\$residuals,lag=23, type="Ljung-Box")
Box.test( longmemofit\$residuals,lag=24, type="Ljung-Box")
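A more compact variant of the same pipeline, offered as a sketch rather than a drop-in replacement. It simulates a small price matrix so that it runs on its own (`P_sim`, its dimensions, the seed, and the AR order 5 are placeholders of ours), averages `log(r^2 + 1e-6)` across assets with `rowMeans`, and loops over the lags instead of repeating each `Box.test` call. Unlike the code above, it treats every column as an asset and has no placeholder first row, so `aggvol` has one fewer element than there are dates.

```r
# Simulated prices: rows are dates, columns are assets (placeholder data only).
set.seed(42)
ntimes <- 300
nassets <- 5
shocks <- matrix(rnorm(ntimes * nassets, sd = 0.01), ntimes, nassets)
P_sim <- 100 * exp(apply(shocks, 2, cumsum))

# Log returns per asset, then log squared returns with the same 1e-6 offset.
R <- apply(log(P_sim), 2, diff)      # (ntimes - 1) x nassets
V <- log(R^2 + 1e-6)

# Cross-sectional mean per date, ignoring missing values, then first difference.
aggvol <- rowMeans(V, na.rm = TRUE)
daggvol <- diff(aggvol)

# A low-order AR fit keeps the example fast; the analysis above uses order 25.
fit <- arima(daggvol, order = c(5, 0, 0), method = "ML")

# Ljung-Box p-values over a set of lags, collected in one vector.
lags <- c(1, 5, 10, 20)
pvals <- sapply(lags, function(h) {
  Box.test(residuals(fit), lag = h, type = "Ljung-Box")$p.value
})
print(data.frame(lag = lags, p.value = pvals))
```

Collecting the p-values in a data frame makes it easier to compare the AR and ARFIMA residuals side by side than scanning eight separate test printouts.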