1 Introduction
Since their introduction in Engle (1982) and Bollerslev (1986), Generalised AutoRegressive Conditional Heteroskedasticity (GARCH) processes have been widely employed in financial econometrics, see e.g. Bollerslev, Russell, and Watson (2010). In their original formulation, the conditional distribution of innovations was typically assumed to be Gaussian.
Empirically, the distribution of stock returns has been studied extensively under the random walk assumption, see e.g. Fama (1965); in this literature, Gaussianity of stock returns has been questioned as too thin-tailed when compared to its empirical counterpart. Gaussian GARCH processes can generate uncorrelated, heteroskedastic returns with a stationary distribution with fatter tails than the Gaussian.
GARCH processes can include several lags of the past squared shocks and several lags of the past volatility; in practice, however, the GARCH(1,1) model, with one lag of each, is often found to offer a good fit for asset returns, and it is usually preferred to GARCH models with more parameters, see Tsay (2010), section 3.5, or Andersen, Bollerslev, Christoffersen, and Diebold (2006), section 3.6. Moreover, many multivariate GARCH models are built on the univariate GARCH(1,1), see e.g. Engle, Ledoit, and Wolf (2017) and references therein. In this sense the GARCH(1,1) is both the prototype and the workhorse of GARCH processes in practice.
GARCH processes map shocks, i.e. news, into the conditional volatility; the function obtained by replacing past conditional volatilities with unconditional ones was called the news impact curve (NIC) by Engle and Ng (1993). For GARCH(1,1) processes, this curve yields the same value of volatility for positive and negative shocks of equal size, i.e. it is symmetric. Glosten, Jagannathan, and Runkle (1993) (henceforth GJR) extended the GARCH setup to allow for an asymmetric news impact curve, with a different response to negative shocks.
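The news impact curve can be illustrated with a minimal sketch. It assumes the common GJR parameterisation in which the next-period variance equals omega + (alpha + gamma * 1{shock < 0}) * shock^2 + beta * (current variance), with symmetric unit-variance innovations; the names omega, alpha, beta and gamma are illustrative and need not coincide with the notation of (2.1).

```python
def unconditional_variance(omega, alpha, beta, gamma=0.0):
    """Unconditional variance under the assumed GJR parameterisation.

    With symmetric innovations a negative shock occurs with probability 1/2,
    so the leverage term contributes gamma / 2 to the persistence.
    """
    persistence = alpha + beta + 0.5 * gamma
    if persistence >= 1.0:
        raise ValueError("the unconditional variance is not finite")
    return omega / (1.0 - persistence)


def news_impact_curve(shock, omega, alpha, beta, gamma=0.0):
    """Next-period variance as a function of the current shock.

    The past conditional variance is replaced by its unconditional value,
    following the definition of the NIC recalled in the text.
    """
    sigma2_bar = unconditional_variance(omega, alpha, beta, gamma)
    leverage = gamma if shock < 0 else 0.0
    return omega + beta * sigma2_bar + (alpha + leverage) * shock ** 2


if __name__ == "__main__":
    for shock in (-2.0, -1.0, 0.0, 1.0, 2.0):
        symmetric = news_impact_curve(shock, 0.1, 0.10, 0.8)
        asymmetric = news_impact_curve(shock, 0.1, 0.05, 0.8, gamma=0.1)
        print(shock, symmetric, asymmetric)
```

With gamma equal to zero the curve takes the same value at a shock and at its negative; a positive gamma raises the curve for negative shocks only.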
Many measures of risk are functions of the prediction density of asset returns. These measures include the Value at Risk, which is a quantile of the prediction distribution of the asset return, see Jorion (2006), as well as the Expected Shortfall, see Patton, Ziegel, and Chen (2017). The latter is the expected value of the prediction distribution of the asset return in the left tail, between minus infinity and the Value at Risk; this measure has been recently re-emphasised by the Third Basel Accords. Both measures are functionals of the prediction distribution of asset returns, see Arvanitis, Hallam, Post, and Topaloglou (2018).
The prediction distribution of a GARCH(1,1) hence plays an important role in the computation of risk measures in financial applications. This distribution is not known in analytic form beyond the one-step-ahead prediction distribution, which is given by the assumption on innovations used to build the process, see e.g. Andersen, Bollerslev, Christoffersen, and Diebold (2006), page 811.
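The two risk measures described above can be illustrated on a finite sample of returns. The sketch below is only an empirical counterpart, not the analytic computation pursued in this paper; the function name and the convention of reporting both measures as returns (so that losses are negative numbers) are assumptions made for illustration.

```python
import math


def value_at_risk_and_expected_shortfall(returns, level):
    """Empirical Value at Risk and Expected Shortfall of a sample of returns.

    The Value at Risk is taken as the empirical quantile at the given level;
    the Expected Shortfall is the average of the returns at or below it.
    """
    if not 0.0 < level < 1.0:
        raise ValueError("level must lie strictly between 0 and 1")
    ordered = sorted(returns)
    # Number of observations in the left tail (at least one).
    tail_size = max(1, math.ceil(level * len(ordered)))
    value_at_risk = ordered[tail_size - 1]
    expected_shortfall = sum(ordered[:tail_size]) / tail_size
    return value_at_risk, expected_shortfall


if __name__ == "__main__":
    sample = [float(r) for r in range(-10, 10)]
    print(value_at_risk_and_expected_shortfall(sample, 0.25))
```

By construction the Expected Shortfall is never larger than the Value at Risk, since it averages over the returns in the tail below the quantile.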
The unknown analytic form of the prediction density of a GARCH has led econometricians to look for alternative approximate solutions. Alexander, Lazar, and Stanescu (2013) have resorted to approximations based on the first 4 moments of the prediction distribution; see also Baillie and Bollerslev (1992). They use the Cornish-Fisher expansion and the Johnson SU distribution with the same 4 moments.
An alternative to this approach is to simulate from the prediction distribution and to estimate it nonparametrically, e.g. by kernel methods. While consistent, this estimator has the slower rate of convergence typical of kernel density estimators. More recently Delaigle, Meister, and Rombouts (2016) have proposed a nonparametric root consistent estimator of the stationary distribution of the (log)volatility process. Despite a better convergence rate, the nonparametric estimation of the density requires computing time, and does not lead to exact results.
The tail behavior of the stationary distribution of the GARCH(1,1) has been studied extensively, see Mikosch and Starica (2000) and Davis and Mikosch (2009). The tails of the stationary distribution of both the volatility and of the GARCH process are of Pareto type, characterised by a tail index. These properties are based on results for random difference equations and renewal theory obtained in Kesten (1973) and Goldie (1991).
The tail index is associated with the number of moments of the stationary distribution, which exist only up to an order determined by the tail index. Larger values of the tail index are associated with thinner tails of the stationary distribution; this is interpreted here to mean that the larger the number of moments (i.e. the larger the tail index), the smaller the distance from the Gaussian distribution, which has an infinite number of moments. The tail index depends on the coefficients of the GARCH(1,1) process, as well as on the type of the one-step-ahead distribution. Examples of values of the tail index are given in Davis and Mikosch (2009).
The present paper derives the analytical form of the multi-step-ahead prediction density of a GARCH(1,1), allowing for the GJR type with asymmetric NIC. Closed-form expressions are given for the prediction density of a GARCH(1,1) process with Gaussian innovations. The results are obtained by marginalizing the joint density of the prediction observations, using integration and special functions, for any prediction horizon beyond one step ahead. (Footnote 1: The one-step-ahead distribution is given by construction of the process.) The formulae are valid for stationary as well as nonstationary GARCH(1,1) processes.
In the 2-steps-ahead case, the prediction distribution is obtained without imposing constraints on the values of the GARCH coefficients. For prediction distributions at longer horizons, a condition on the coefficients is required to guarantee integrability of a certain integral; a sufficient condition for this, namely that the relevant coefficient is larger than 0.62, is often satisfied in practice.
The prediction density is found to be close to a Gaussian density (with appropriate variance) for high values of , and far from it for low values of it. Similarly, large values of are found to be associated with higher values of , i.e. smaller distance from the Gaussian distribution for the stationary distribution with Pareto tails.
The rest of the paper is organised as follows. Section 2 describes the general approach for the derivation of the integral. Section 3 states the main results, while Section 4 discusses the form of the prediction density when compared with the tails of the stationary distribution. Section 5 concludes. The Appendix contains proofs.
2 The prediction density
This section illustrates the construction used to characterise the prediction density as an integral, involving (a product of several copies of) the chosen density of innovations. Consider the asymmetric GARCH(1,1)
(2.1) 
where , and is the indicator function for event , and is the sign of or ; these signs are the same because . The sequence is assumed to be i.i.d., centered around zero and with Gaussian p.d.f. .
Time is taken to be the starting time of the forecasts, and it is assumed that one wishes to predict for some , conditional on information set at time , taken to consist of observations of and . This information set is consistent with observing from minus infinity to time 0 under stationarity. Note also that, because and are observed, also is observed.
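The forecasting setup above can be mimicked by Monte Carlo, which is the simulation-based alternative recalled in the Introduction rather than the analytic route followed here. The sketch below assumes the common GJR parameterisation (next variance equal to omega + (alpha + gamma * 1{x < 0}) * x^2 + beta * current variance), Gaussian innovations, and a known variance for the first forecast period; all names are illustrative and not the paper's notation.

```python
import math
import random


def simulate_prediction_draws(horizon, sigma2_first, omega, alpha, beta,
                              gamma=0.0, n_draws=100_000, seed=0):
    """Draw from the horizon-step-ahead prediction distribution by simulation.

    sigma2_first is the conditional variance of the first forecast period,
    which is known given the information set at the forecast origin.
    """
    rng = random.Random(seed)
    draws = []
    for _ in range(n_draws):
        sigma2 = sigma2_first
        x = 0.0
        for _ in range(horizon):
            # Current observation, then update of next period's variance.
            x = math.sqrt(sigma2) * rng.gauss(0.0, 1.0)
            leverage = gamma if x < 0 else 0.0
            sigma2 = omega + (alpha + leverage) * x * x + beta * sigma2
        draws.append(x)  # the value of the process at the forecast horizon
    return draws


if __name__ == "__main__":
    draws = simulate_prediction_draws(2, 1.0, 0.1, 0.1, 0.8, seed=1)
    second_moment = sum(d * d for d in draws) / len(draws)
    # Under the assumed model the 2-step variance is omega + (alpha + beta) * sigma2_first.
    print(second_moment, 0.1 + 0.9 * 1.0)
```

Such draws can be fed to a kernel density estimator or used to approximate tail probabilities, at the cost of simulation error.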
Throughout the paper the values taken by the random variables , , are denoted , and respectively, and sometimes the subscript is omitted if this does not cause ambiguity. The next Lemma reports consequences of the symmetry of the one-step-ahead density on relevant conditional p.d.f.s. In the Lemma, the following notation is used, , ; here , denote values of and .
Lemma 2.1 (Densities).
For symmetric , is symmetric, i.e. , , and it is given by
(2.2) 
Moreover, and one has
(2.3) 
where depends on the value of and the sign of for via (2.1).
Denote the set of all possible by , . Densities are first computed conditionally on and later they are marginalized with respect to it. Here, conditioning on is relevant only for the GJR case .
The basic building block is given by the expression in (2.3). This density can be marginalised with respect to as follows
(2.4) 
Finally, the conditioning with respect to the signs is averaged across different configurations, using the mutual independence of the signs and the fact that for all , thanks to the symmetry of . One hence finds
(2.5) 
where the sum is over , for . The prediction density is hence found by combining (2.5), (2.4), (2.3), (2.2).
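The averaging over sign configurations described above can be sketched as follows. The sketch assumes, as stated in the text, that the signs are mutually independent and that each sign is positive or negative with probability one half, so that every configuration receives the same weight. The conditional density used here is a purely hypothetical placeholder, not the expression in (2.4).

```python
import itertools
import math


def gaussian_pdf(x, variance):
    """Density of a centred Gaussian with the given variance."""
    return math.exp(-0.5 * x * x / variance) / math.sqrt(2.0 * math.pi * variance)


def marginalise_over_signs(x, horizon, conditional_density):
    """Average a sign-conditional density over all sign configurations.

    There are 2 ** (horizon - 1) configurations of signs, each with equal
    probability under independence and symmetry.
    """
    configurations = list(itertools.product((-1, 1), repeat=horizon - 1))
    weight = 1.0 / len(configurations)
    return sum(weight * conditional_density(x, signs) for signs in configurations)


def toy_conditional_density(x, signs):
    """Hypothetical placeholder: the variance grows with the number of negative signs."""
    variance = 1.0 + 0.5 * sum(1 for s in signs if s < 0)
    return gaussian_pdf(x, variance)


if __name__ == "__main__":
    print(marginalise_over_signs(0.0, 3, toy_conditional_density))
```

The number of configurations doubles with each additional step, which is the computational cost of allowing for an asymmetric news impact curve in this enumeration.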
The next Lemma reports a recursion for the volatility process, that turns out to be useful when solving the integral in (2.4). In the Lemma, for , let and , where denotes a value of .
Lemma 2.2 (Volatility and transformations).
The volatility process can also be written
For , has the following recursive expression in terms of ’s
(2.6) 
with , which is measurable with respect to the information set at time . Moreover, one has
(2.7) 
where .
3 Main results
The main results are summarised in Theorem 3.2 below. Before stating the main theorems, an auxiliary assumption is introduced. Define with and .
Assumption 3.1.

For , let ;

For let .
It can be noted that , as . In Figure 1, the area above the curve represents the set for .
In Theorem 3.2 below, is the confluent hypergeometric function of the second kind, also known as Tricomi function, see Abadir (1999) and Gradshteyn and Ryzhik (2007), section 9.21, whose integral representation is,
(3.1) 
with Moreover, the following notation is used in summations:
Theorem 3.2 (GARCH(1,1) prediction density).
Proof.
See Appendix.
Note that in the case when , equation (3.2) holds for any value of , while for it holds if and only if . For , the validity of (3.2) is guaranteed by the sufficient condition , which is, however, not necessary.
The line of proof of Theorem 3.2 is the following: for the integral is solved by substitution and by using equation (3.1). For , subsequent (negative) binomial expansions of expression (2.6) for are required, whose validity is ensured by the inequality
which is satisfied under Assumption 3.1, see Lemma 5.1 in the Appendix.
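The Tricomi function used in the proof can be evaluated numerically from its standard integral representation, U(a, b, z) = (1 / Gamma(a)) times the integral over t from 0 to infinity of exp(-z t) t^(a-1) (1 + t)^(b-a-1), valid for positive a and z. The sketch below is a numerical illustration only; the notation U(a, b, z), the restriction to a of at least 1 (to avoid the endpoint singularity), the truncation point and the Simpson rule are all assumptions of the sketch.

```python
import math


def tricomi_u(a, b, z, n_intervals=20_000, cutoff=50.0):
    """Tricomi function U(a, b, z) by Simpson's rule on a truncated integral.

    The integral is truncated where exp(-z * t) falls below exp(-cutoff).
    n_intervals must be even.
    """
    if a < 1.0 or z <= 0.0:
        raise ValueError("this sketch requires a >= 1 and z > 0")
    upper = cutoff / z
    step = upper / n_intervals

    def integrand(t):
        return math.exp(-z * t) * t ** (a - 1.0) * (1.0 + t) ** (b - a - 1.0)

    total = integrand(0.0) + integrand(upper)
    for i in range(1, n_intervals):
        total += (4.0 if i % 2 else 2.0) * integrand(i * step)
    return total * step / 3.0 / math.gamma(a)


if __name__ == "__main__":
    # Special case b = a + 1, where U(a, a + 1, z) = z ** (-a).
    print(tricomi_u(2.0, 3.0, 1.5), 1.5 ** -2)
    # For large z, z ** a * U(a, b, z) approaches one.
    print(200.0 ** 1.5 * tricomi_u(1.5, 0.5, 200.0))
```

The second check illustrates the large-argument behaviour, in which the function is approximately z^(-a).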
Immediate consequences of Theorem 3.2 are collected in the following corollary.
Corollary 3.3 (C.d.f. and moments).
Note that in the moments calculations are made of finite sums extending to , involving the Tricomi functions, which do not fall in the logarithmic case as in Theorem 3.2; see Abadir (1999) for the logarithmic case. In fact, implies that
is a finite sum.
Some standardised densities of and the corresponding right tails are plotted in Fig. 2 for . The curve is the standard Gaussian. Figure 3 shows the predictive densities for and values of that range from to 8.5 () to 1/8.5 (). Figure 4 shows the tails for asymmetric news impact curves.
One can see that the prediction densities are more similar to a Gaussian when is large.
4 Stationary distribution
The limit representation of the random variable in the stationary case can be found in Francq and Zakoian (2010) Theorem 2.1 page 24. The tail behaviour of the limit distribution is reviewed in Mikosch and Starica (2000) and Davis and Mikosch (2009). The tails of the stationary distribution of both the volatility and of the GARCH process are of Pareto type, say, where is a tail index. These properties are based on results for random difference equations and renewal theory obtained in Kesten (1973) and Goldie (1991).
The tail index of the stationary distribution depends on the coefficients and of the GARCH(1,1) process as well as on the one-step-ahead distribution. Examples of the tail index are given in Davis and Mikosch (2009); for Gaussian innovations, for , while for .
The index is the unique solution of . When is an integer, the expression simplifies to
(4.1) 
see Davis and Mikosch (2009) eq. (10). Substituting the moments from the distribution, and assigning values to over a grid of prespecified values, one can solve (4.1) for , and hence for . This allows one to compute (values of) the surface . Figure 5 reports the level curves of as a function of and obtained in this way. The figure also reports the lines where is constant. It is seen that, for large values of , and increase roughly together. This association is not present for small values of .
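The computation of the tail index can also be sketched numerically without restricting it to integer cases. The sketch assumes the standard defining equation for the symmetric GARCH(1,1) with standard Gaussian innovation Z, namely E[(alpha Z^2 + beta)^(kappa / 2)] = 1, where alpha multiplies the lagged squared shock and beta the lagged variance; these names, the Simpson quadrature and the bisection are illustrative assumptions, and the GJR case is not covered.

```python
import math


def kesten_moment(s, alpha, beta, n_intervals=4000, z_max=12.0):
    """E[(alpha * Z**2 + beta) ** s] for standard Gaussian Z, by Simpson's rule."""
    step = 2.0 * z_max / n_intervals

    def integrand(z):
        density = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
        return (alpha * z * z + beta) ** s * density

    total = integrand(-z_max) + integrand(z_max)
    for i in range(1, n_intervals):
        total += (4.0 if i % 2 else 2.0) * integrand(-z_max + i * step)
    return total * step / 3.0


def tail_index(alpha, beta, tol=1e-10, s_max=64.0):
    """Positive solution kappa of E[(alpha * Z**2 + beta) ** (kappa / 2)] = 1."""
    lo = 1e-3
    if kesten_moment(lo, alpha, beta) >= 1.0:
        raise ValueError("no positive solution: the process is not stationary")
    hi = 1.0
    while kesten_moment(hi, alpha, beta) <= 1.0:
        hi *= 2.0
        if hi > s_max:
            raise ValueError("tail index too large for this sketch")
    # Bisection on s = kappa / 2; the moment is below one at lo and above one at hi.
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if kesten_moment(mid, alpha, beta) < 1.0:
            lo = mid
        else:
            hi = mid
    return lo + hi  # kappa = 2 * s, with s estimated by (lo + hi) / 2


if __name__ == "__main__":
    print(tail_index(0.1, 0.85))
```

Two cases can be verified by hand under this assumption: when alpha + beta = 1 the solution is 2, and when 3 alpha^2 + 2 alpha beta + beta^2 = 1 the solution is 4, since these are the first and second moments of alpha Z^2 + beta.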
The relation between and fat-tailedness of the prediction density for finite horizon can be illustrated using the case . From Theorem 3.2,
where and (Footnote 2: The quantity can be interpreted as the minimum value that can take, in the ideal case when (thus ) and is given, i.e. .)
Hence when one has with , see Abramowitz and Stegun (1964), eq. 13.1.8, so that all the Tricomi functions , for varying , tend to one. (Footnote 3: This is unlike in the case for fixed where the sequence of is decreasing from 1 to 0 for increasing .) As a result, when the prediction distribution converges to a .
Hence in both the case of the prediction density for and the stationary distribution, the fat-tailedness of the distributions is small for large values of .
5 Conclusions
This paper presents the analytical form of the prediction density of a GARCH(1,1) process. This can be used to evaluate the probability of tail events or of quantities that may be of interest for value at risk calculations. This improves on approximation methods based on moments, or on Monte Carlo simulation and estimation.
The techniques in this paper can also be applied to symmetric innovation densities other than the N(0,1). Different densities imply distinct subsequent (negative) binomial expansions of expression (2.6) for , and different auxiliary convergence conditions on the GARCH coefficients, similarly to Assumption 3.1.
References
 Abadir (1999) Abadir, K. M. (1999) An introduction to hypergeometric functions for economists. Econometric Reviews, 18(3), 287–330.
 Abramowitz and Stegun (1964) Abramowitz, M., and I. Stegun (1964) Handbook of mathematical functions. National Bureau of Standards, Applied Mathematics.
 Alexander, Lazar, and Stanescu (2013) Alexander, C., E. Lazar, and S. Stanescu (2013) Forecasting VaR using analytic higher moments for GARCH processes. International Review of Financial Analysis, 30, 36–45.
 Andersen, Bollerslev, Christoffersen, and Diebold (2006) Andersen, T., T. Bollerslev, P. F. Christoffersen, and F. X. Diebold (2006) Volatility and correlation forecasting. in Handbook of Economic Forecasting, Volume 1, ed. by G. Elliott, C. W. Granger, and A. Timmermann. Elsevier.
 Arvanitis, Hallam, Post, and Topaloglou (2018) Arvanitis, S., M. Hallam, T. Post, and N. Topaloglou (2018) Stochastic Spanning. Journal of Business & Economic Statistics, 0(0), 1–13.
 Baillie and Bollerslev (1992) Baillie, T. R., and T. Bollerslev (1992) Prediction in dynamic models with time-dependent conditional variances. Journal of Econometrics, 52, 91–113.
 Bollerslev (1986) Bollerslev, T. (1986) Generalized Autoregressive Conditional Heteroskedasticity. Journal of Econometrics, 31, 307–327.
 Bollerslev, Russell, and Watson (2010) Bollerslev, T., J. R. Russell, and M. W. Watson (eds.) (2010) Volatility and time series econometrics: essays in honor of Robert F. Engle. Oxford University Press.
 Davis and Mikosch (2009) Davis, R., and T. Mikosch (2009) Extreme Value Theory for GARCH Processes. in Handbook of Financial Time Series, ed. by T. Andersen, R. Davis, J.-P. Kreiss, and T. Mikosch, pp. 187–200. Springer.
 Delaigle, Meister, and Rombouts (2016) Delaigle, A., A. Meister, and J. Rombouts (2016) Root consistent density estimation in GARCH models. Journal of Econometrics, 192(1), 55–63.
 Engle (1982) Engle, R. (1982) Autoregressive conditional heteroskedasticity with estimates of the variance of the United Kingdom inflation. Econometrica, 50, 987–1007.
 Engle and Ng (1993) Engle, R., and V. Ng (1993) Measuring and Testing the Impact of News on Volatility. Journal of Finance, 48, 1749–1778.
 Engle, Ledoit, and Wolf (2017) Engle, R. F., O. Ledoit, and M. Wolf (2017) Large Dynamic Covariance Matrices. Journal of Business & Economic Statistics, 0(0), 1–13.
 Fama (1965) Fama, E. F. (1965) The Behavior of Stock-Market Prices. The Journal of Business, 38(1), 34–105.
 Francq and Zakoian (2010) Francq, C., and J.-M. Zakoian (2010) GARCH models. Wiley.
 Glosten, Jagannathan, and Runkle (1993) Glosten, L., R. Jagannathan, and D. Runkle (1993) On the relation between expected value and the volatility of the nominal excess return on stocks. Journal of Finance, 48, 1779–1802.
 Goldie (1991) Goldie, C. M. (1991) Implicit Renewal Theory and Tails of Solutions of Random Equations. Annals of Applied Probability, 1, 126–166.
 Gradshteyn and Ryzhik (2007) Gradshteyn, I., and I. Ryzhik (2007) Table of Integrals, Series, and Products. 7th ed., Academic Press.
 Jorion (2006) Jorion, P. (2006) Value at Risk: The New Benchmark for Managing Financial Risk. McGraw-Hill, New York.
 Kesten (1973) Kesten, H. (1973) Random difference equations and renewal theory for products of random matrices. Acta Mathematica, 131, 207–248.
 Mikosch and Starica (2000) Mikosch, T., and C. Starica (2000) Limit Theory for the Sample Autocorrelations and Extremes of a GARCH (1,1) process. Annals of Statistics, 28(5), 1427–1451.
 Mood, Graybill, and Boes (1974) Mood, A. M., F. A. Graybill, and D. C. Boes (1974) Introduction to the Theory of Statistics, 3rd Edition. McGraw-Hill.
 Patton, Ziegel, and Chen (2017) Patton, A. J., J. F. Ziegel, and R. Chen (2017) Dynamic Semiparametric Models for Expected Shortfall (and Value-at-Risk). ArXiv, https://arxiv.org/abs/1707.05108v1.
 Tsay (2010) Tsay, R. S. (2010) Analysis of financial time series. Wiley, 3rd edn.
Appendix
Proof of Lemma 2.1.
Consider the transformation theorem for ; from standard results, see e.g. Mood, Graybill, and Boes (1974), page 201, Example 19, one has
where is the indicator function of the event . Because, by symmetry, one has , the expression in the previous display simplifies to or, letting indicate , and solving for , one finds , which is (2.2). Note that the expression with the absolute value is also valid for . This proves (2.2).
Eq. (2.3) follows from definitions.
Proof of Lemma 2.2.
Proof of Theorem 3.2
Lemma 5.1 (Conditions on ).
Let Assumption 3.1.b hold. Then, for any
(5.1) 
Proof of Lemma 5.1.
For the inequality (5.1) reads . Solving the quadratic on the l.h.s. for one finds two roots, and , so that the quadratic is nonnegative for or for . Because , this holds only when . This proves that (5.1) is valid for for and a fortiori also for .
An induction approach is used for . Assume that (5.1) is valid for some and ; it can then be shown that (5.1) is valid also replacing with . To see this, take (5.1) for and multiply by . One finds
Because , one has , so that,
Rearranging as , one finds that (5.1) holds also for . The induction step hence proves that (5.1) holds for any if .
Lemma 5.2 (Coefficients ).
Assume that holds for ; then
(5.2) 
equals as defined in .
Proof of Lemma 5.2.
Rewrite (5.2) setting as
(5.3) 
Using equation (2.6), for this expression equals where
where , , , . Hence
which follows from equation (3.1). This shows that for , for .
Next consider the case , where
and one wishes to expand . Consider the inequality , and the associated quadratic equation in with solutions and as in Assumption 3.1. One has that for one finds , which ensures that . Hence for one can expand as