M-estimation in GARCH models without higher order moments

We consider a class of M-estimators of the parameters of the GARCH models which are asymptotically normal under mild assumptions on the moments of the underlying error distribution. Since heavy-tailed error distributions without higher order moments are common in the GARCH modeling of many real financial data, it becomes worthwhile to use such estimators for the time series inference instead of the quasi maximum likelihood estimator. We discuss the weighted bootstrap approximations of the distributions of M-estimators. Through extensive simulations and data analysis, we demonstrate the robustness of the M-estimators under heavy-tailed error distributions and the accuracy of the bootstrap approximation. In addition to the GARCH (1, 1) model, we obtain extensive computation and simulation results which are useful in the context of higher order models such as GARCH (2, 1) and GARCH (1, 2) but have not yet received sufficient attention in the literature. Finally, we use M-estimators for the analysis of three real financial time series fitted with GARCH (1, 1) or GARCH (2, 1) models.

Authors

• 42 publications
• 2 publications
12/13/2019

R-estimators in GARCH models; asymptotics, applications and bootstrapping

The quasi-maximum likelihood estimation is a commonly-used method for es...
06/21/2021

Spliced Binned-Pareto Distribution for Robust Modeling of Heavy-tailed Time Series

This work proposes a novel method to robustly and accurately model time ...
02/02/2020

Detecting Anomalous Time Series by GAMLSS-Akaike-Weights-Scoring

An extensible statistical framework for detecting anomalous time series ...
09/09/2020

Identification and estimation of Structural VARMA models using higher order dynamics

We use information from higher order moments to achieve identification o...
07/28/2019

A Higher-Order Swiss Army Infinitesimal Jackknife

Cross validation (CV) and the bootstrap are ubiquitous model-agnostic to...
09/14/2020

Robust discrete choice models with t-distributed kernel errors

Models that are robust to aberrant choice behaviour have received limite...
02/19/2021

Truncated, Censored, and Actuarial Payment-type Moments for Robust Fitting of a Single-parameter Pareto Distribution

With some regularity conditions maximum likelihood estimators (MLEs) alw...
This week in AI

Get the week's most popular data science and artificial intelligence research sent straight to your inbox every Saturday.

1 Introduction

The Generalized autoregressive conditional heteroscedastic (GARCH) models have been used extensively to analyze the volatility or the instantaneous variability of a financial time series

. A series is said to follow a GARCH  model if

 Xt=σtϵt, (1.1)

where are unobservable i.i.d. errors with symmetric distribution around zero and

 σt=(ω0+p∑i=1α0iX2t−i+q∑j=1β0jσ2t−j)1/2,t∈Z, (1.2)

with , , . Mukherjee (2008) proposed a class of M-estimators for estimating the GARCH parameter

 \boldmathθ0=(ω0,α01,…,α0p,β01,…,β0q)′ (1.3)

based on observations . The M-estimators are asymptotically normal under some moment assumptions on the error distribution and are more robust than the commonly-used quasi maximum likelihood estimator (QMLE). Mukherjee (2020) considered a class of weighted bootstrap methods to approximate the distributions of these estimators and established the asymptotic validity of such bootstrap. In this paper, we apply an iteratively re-weighted algorithm to compute the M-estimates and the corresponding bootstrap estimates with specific attention to Huber’s, - and Cauchy-estimates which were not considered in the literature in details. The iteratively re-weighted algorithm turns out to be particularly useful in computing bootstrap replicates since it avoids the re-computation of some core quantities for new bootstrap samples.

The class of M-estimators includes the QMLE. The asymptotic normality and the asymptotic validity of bootstrapping the QMLE were derived under the finite fourth moment assumption on the error distribution. However, there are other M-estimators such as the -estimator and Cauchy-estimator which are asymptotic normal under mild assumption on the finiteness of lower order moments. Since heavy-tailed error distributions without higher order moments are common in the GARCH modeling of many real financial time series, it becomes worthwhile to use these estimators for such series but unfortunately they have not been investigated in the literature. One of the contributions of this paper is to reveal precisely the importance of such alternative M-estimators to analyze financial data instead of using the QMLE.

In an earlier work, Muler and Yohai (2008) analyzed the Electric Fuel Corporation (EFCX) time series and fitted a GARCH (1, 1) model. Using exploratory analysis, they detected presence of outliers and considered estimation of parameters based on robust methods. It turned out that estimates based on different methods vary widely and so it is difficult to assess which method should be relied on in similar situations. In this paper, we use M-estimates with mild assumptions on error moments to analyze the EFCX series.

Francq and Zakoïan (2009) underscored the importance of using higher order GARCH models such as GARCH (2, 1) for some real financial time series but the computation and simulation results for such models are not available widely in the literature. We investigate the role of M-estimators for the GARCH (2, 1) model through extensive simulations and real data analysis. We also provide simulation results and analysis for the GARCH (1, 2) model.

The paper is organized as follows. Sections 2 and 3 set the background. In particular, we discuss the class of M-estimators and give examples in Section 2. Section 3 contains bootstrap formulation and the statement on the asymptotic validity of the bootstrap. Section 4 discusses computational aspects of M-estimators and its bootstrapped replicates. Section 5 reports simulation results for various M-estimators. Section 6 compares bootstrap approximation with the asymptotic normal approximation to distributions of M-estimators through simulation. Section 7 analyzes three real financial time series.

2 M-estimators of the GARCH parameters

Throughout this paper, for a function , we use to denote its derivative whenever it exists. Also, . For , . Moreover, will denote a generic r.v. having same distribution as errors of (1.1).

Let

be an odd function which is differentiable at all but finite number of points. Let

denote the set of points where is differentiable and let denote its complement. Let , so that is symmetric. The function is called the score function of the M-estimation in the scale model. Examples are as follows.

Example 1. QMLE score: Let . Then .

Example 2. LAD score: Let . Then and .

Example 3. Huber’s score: Let , where is a known constant. Then and .

Example 4. Score function for the maximum likelihood estimation (MLE): Let , where is the true density of , assumed to be known. Then .

Example 5. score: Let , where is a known constant. Then and is bounded.

Example 6. Cauchy score: Let . Then is bounded.

Example 7. Score function for the exponential pseudo-maximum likelihood estimation: Let , where and are known constants. Here and .

Assume that for some and ,

 E[|ϵ|κ1]<∞andlimt→0P[ϵ2

Then of (1.2) has the following unique almost sure representation:

 σ2t=c0+∞∑j=1cjX2t−j,t∈Z, (2.2)

where are defined in (2.9)-(2.16) of Berkes et al. (2003).

Let be a compact subset of . A typical element in is denoted by

. Define the variance function on

by

 vt(\boldmathθ)=c0(\boldmathθ)+∞∑j=1cj(\boldmathθ)X2t−j,\boldmathθ∈\boldmathΘ,t∈Z, (2.3)

where the coefficients are given in Berkes et al. (2003) (Section 3, and display (3.1)) with the property

 cj(\boldmathθ0)=cj,∀j≥0. (2.4)

Hence the variance functions satisfy , . Using (2.4), (1.1) can be rewritten as

 Xt={vt(\boldmathθ0)}1/2ϵt,1≤t≤n. (2.5)

Consider observable approximation of the process of (2.3) defined by

 ^vt(\boldmathθ)=c0(\boldmathθ)+I(2≤t)t−1∑j=1cj(\boldmathθ)X2t−j,% \boldmathθ∈\boldmathΘ,1≤t≤n. (2.6)

Then an M-estimator is defined as the solution of , where

 ˆ\boldmathMn,H(\boldmathθ):=n∑t=1{1−H{Xt/^v1/2t(\boldmathθ)}}{˙^\boldmathvt(\boldmathθ)/^vt(\boldmathθ)}. (2.7)

Next we describe the iterative relation of that is used to write computer program for their numerical evaluation. The computation is discussed in Section 4.

Example 1. GARCH model: With ,

 c0(ω,α,β)=ω/(1−β),cj(ω,α,β)=αβj−1,j≥1.

Example 2. GARCH model: With ,

 c0(\boldmathθ)=ω/(1−β),c1(\boldmathθ% )=α1,c2(\boldmathθ)=α2+βc1(% \boldmathθ)=α2+βα1

and

 cj(\boldmathθ)=βcj−1(\boldmathθ),j≥3.

Example 3. GARCH model: With ,

 c0(\boldmathθ)=ω/(1−β1−β2),c1(% \boldmathθ)=α,c2(\boldmathθ)=β1c1(\boldmathθ)=β1α,

and

 cj(\boldmathθ)=β1cj−1(\boldmathθ)+β2cj−2(\boldmathθ),j≥3.

Example 4. GARCH model: With ,

 c0(\boldmathθ)=ω/(1−β1−β2),c1(% \boldmathθ)=α1,c2(\boldmathθ)=α2+β1α1

and

 cj(\boldmathθ)=β1cj−1(\boldmathθ)+β2cj−2(\boldmathθ),j≥3.

2.1 Asymptotic distribution of ^\boldmathθn

The asymptotic distribution of is derived under the following assumptions.

Model assumptions: The parameter space is a compact set and its interior contains both and of (1.3) and (2.10), respectively. Moreover, (2.1), (2.3) and (2.5) hold and is stationary and ergodic.

Conditions on the score function:

Identifiability condition: Corresponding to the score function , there exists a unique number satisfying

 E[H(ϵ/c1/2H)]=1. (2.8)

Moment conditions:

 E[H(ϵ/c1/2H)]2<∞and0

Also various Smoothness conditions on as in Mukherjee (2008) are assumed which are satisfied in all examples of considered above. Define the score function factor

 σ2(H):=4Var{H(ϵ/c1/2H)}/[E{(ϵ/c1/2H)˙H(ϵ/c1/2H)}]2,

the matrix

 \boldmathG:=E{˙\boldmathv1(% \boldmathθ0H)˙\boldmathv′1(\boldmath% θ0H)/v21(\boldmathθ0H)}

and the transformed parameter

 \boldmathθ0H=(cHω0,cHα01,…,cHα0p,β01,…,β0q)′. (2.10)
Theorem 2.1.

Suppose that the model assumptions, identifiability condition, moment conditions and smoothness conditions hold. Then

 n1/2(^\boldmathθn−\boldmathθ0H)→N(0,σ2(H)\boldmathG−1). (2.11)

Note that used in above formulas are given by (i) for the QMLE, (ii) for the LAD while for the Huber, -estimator, Cauchy and other scores, does not have closed-form expression. For such score functions, is calculated using (2.8) as follows. We fix a large positive integer and generate from the error distribution considered for the simulation. Then, using the bisection method on , we solve the equation

 (1/I)I∑i=1{H(ϵi/c1/2)}−1=0.

Values of computed in this way were provided in Mukherjee (2008, page 1541) for some error distributions and score functions. In Table 1 we provide for few more error distributions and score functions such as Huber’s -score and -estimator with and which are used in simulations and data analysis of later sections. In the sequel, Double exponential is abbreviated as DE.

3 Bootstrapping M-estimators

Let be a triangular array of r.v.’s such that for each , are exchangeable and independent of the data and errors . Also, , and .

Based on these weights, bootstrap estimate is defined as the solution of , where

 ˆ\boldmathM∗n,H(\boldmathθ):=n∑t=1wnt{1−H{Xt/^v1/2t(\boldmathθ)}}{˙^\boldmathvt(\boldmathθ)/^vt(\boldmathθ)}. (3.1)

Examples. From many different choices of bootstrap weights, we consider the following three schemes for comparison.

(i) Scheme M. The sequence of weights has a multinomial distribution, which is essentially the classical paired bootstrap.

(ii) Scheme E. When , where are i.i.d. exponential r.v. with mean . Under scheme E, is a weighted M-estimator with weights proportional to , .

(iii) Scheme U. When , where are i.i.d. uniform r.v. on . Under scheme U, is a weighted M-estimator with weights proportional to , .

A host of other bootstrap methods in the literature are special cases of the above bootstrap formulation. Such general formulation of weighted bootstrap offers a unified way of studying several bootstrap schemes simultaneously. See, for example, Chatterjee and Bose (2005) for details in other contexts.

We assume that the weights satisfy the following basic conditions (Conditions BW of Chatterjee and Bose (2005)) where and is a constant.

 E(wn1)=1,0

Under (3.2) and some additional smoothness and moment conditions in Mukherjee (2020), weighted bootstrap is asymptotic valid.

Theorem 3.1.

For almost all data, as ,

 σ−1nn1/2(^\boldmathθ∗n−^% \boldmathθn)→N(0,σ2(H)\boldmathG% −1). (3.3)

We remark that since

, the rate of convergence of the bootstrap estimator is the same as that of the original estimator. The standard deviation of the weights

at the denominator of the scaling reflects the contribution of the corresponding weights.

The distributional result of (3.3

) is useful for constructing the confidence interval of the GARCH parameters as follows. Let

denote the number of bootstrap replicates, denote a generic parameter (one of , or ) and let and denote its M-estimator and -th bootstrap estimator (), respectively. Let be one of , or , as appropriate, which has a known value for a simulation experiment. Using the approximation of by , the bootstrap confidence interval of is of the form

 [^γn−n−1/2{σ−1nn1/2(^γ∗n,α/2−^γn)},^γn+n−1/2{σ−1nn1/2(^γ∗n,1−α/2−^γn)}] (3.4)

where is the

-th quantile of the numbers

. Consequently, the bootstrap coverage probability is computed by the proportion of the above set of

confidence intervals containing .

Similarly, using (2.11) of Theorem 3.1, we can obtain the confidence interval of based on the asymptotic normality of , and this will be called the normal confidence interval. Specifically, in view of Proposition 3.1 of Mukherjee (2008) on the estimation of the variance-covariance matrix , we can obtain the asymptotic confidence interval of as

 [^γn−n−1/2^dz1−α/2,^γn+n−1/2^dz1−α/2], (3.5)

where is the estimated variance of obtained from the appropriate diagonal entry of the estimator of and is the

-th quantile of the standard normal distribution.

In the following Section 6, we will compare the accuracy of the confidence intervals constructed by the bootstrap and asymptotic approximations.

4 Algorithm

We discuss the implementation of an iteratively re-weighted algorithm proposed in Mukherjee (2020) for computing M-estimates. In particular, we highlight -estimate and Cauchy-estimate of the GARCH parameters in this paper as their asymptotic distributions are derived under mild moment assumptions. We also consider the bootstrap estimators based on the corresponding score functions.

4.1 Computation of M-estimates

For the convenience of writing, let for . Using a Taylor expansion of , we obtain the following recursive equation for computing the updated estimate of from the current estimate of :

 ~\boldmathθ=\boldmathθ+{˙α(1)/2}−1[n∑t=1˙^\boldmathvt(% \boldmathθ)˙^\boldmathvt(\boldmathθ% )′/^v2t(\boldmathθ)]−1n∑t=1{H{Xt/^v1/2t(\boldmathθ)}−1}{˙^\boldmathvt(\boldmathθ)/^vt(\boldmathθ)}, (4.1)

where under smoothness conditions on . Since the GARCH residuals estimate only , in general, we cannot estimate from the data. Therefore, we use ad hoc techniques such as simulating from or standardized DE distribution and then use to carry out the iteration. Note that if the iteration in (4.1) converges then . Therefore in this case from (4.1), and hence is the desired . Based on our extensive simulation study and real data analysis, the algorithm is robust enough to converge to the same value of irrespective of different values of the unknown factor used in computation.

In the following examples, we discuss (4.1) when specialized to the M-estimators computed in this paper.

QMLE: Here and . Hence and

 ~\boldmathθ =

With

 Wt=1/^v2t(\boldmathθ),xt=˙^% \boldmathvt(\boldmathθ),yt=X2t−^vt(\boldmathθ),

can be computed iteratively as

 ~\boldmathθ(r+1)=~\boldmathθ(r)+{E(ϵ2)}−1{∑tWtxtx′t}−1{∑tWtxtyt}.

Note that when , this is same as the formula obtained through the BHHH algorithm proposed by Berndt et al. (1974).

LAD: Here and . Hence and

 ~\boldmathθ = \boldmathθ+{2/E|ϵ|}[n∑t=1{˙^\boldmathvt(\boldmathθ)˙^\boldmathvt(\boldmathθ)′/^v2t(\boldmathθ)}]−1n∑t=1[|Xt|/^v1/2t(\boldmathθ)−1]{˙^\boldmathvt(\boldmathθ)/^vt(% \boldmathθ)} = \boldmathθ+{2/E|ϵ|}[n∑t=1{˙^\boldmathvt(\boldmathθ)˙^\boldmathvt(\boldmathθ)′/^v2t(\boldmathθ)}]−1n∑t=1{|Xt|−^v1/2t(\boldmathθ)}{˙^\boldmathvt(\boldmathθ)/^v3/2t(% \boldmathθ)} = \boldmathθ+{2/E|ϵ|}[n∑t=1{˙^\boldmathvt(\boldmathθ)˙^\boldmathvt(\boldmathθ)′/^v2t(\boldmathθ)}]−1n∑t=1{^v1/2t(\boldmathθ)(|Xt|−^v1/2t(\boldmathθ))}{˙^\boldmathvt(% \boldmathθ)/^v2t(\boldmathθ)}.

With

 Wt=1/^v2t(\boldmathθ),xt=˙^% \boldmathvt(\boldmathθ),yt=^v1/2t(% \boldmathθ)(|Xt|−^v1/2t(\boldmathθ)),

can be computed iteratively as

 ~\boldmathθ(r+1)=~\boldmathθ(r)+{2/E|ϵ|}{∑tWtxtx′t}−1{∑tWtxtyt}.

Huber: Here and

 α(c)=E[(cϵ)2I(|cϵ|≤k)+k|cϵ|I(|cϵ|>k)].

Hence

 ˙α(1)=E[2ϵ2I(|ϵ|≤k)+k|ϵ|I(|ϵ|>k)]

and

 ~\boldmathθ = \boldmathθ−{˙α(1)/2}−1[n∑t=1{˙^\boldmathvt(% \boldmathθ)˙^\boldmathvt(\boldmathθ% )′^v2t(\boldmathθ)}]−1 ×n∑t=1⎡⎣1−X2t^vt(% \boldmathθ)I⎛⎝|Xt|^v1/2t(\boldmathθ)≤k⎞⎠−k|Xt|^v1/2t(\boldmathθ)I⎛⎝|Xt|^v1/2t(\boldmathθ)>k⎞⎠⎤⎦⎧⎪⎨⎪⎩˙^\boldmathvt(\boldmath% θ)^vt(\boldmathθ)⎫⎪⎬⎪⎭.

With

 Wt=1/^v2t(\boldmathθ),xt=˙^% \boldmathvt(\boldmathθ)

and

 yt=X2tI(|Xt|/^v1/2t(\boldmathθ)≤k)+k|Xt|^v1/2t(\boldmathθ)I(|Xt|/^v1/2t(\boldmathθ)>k)−^vt(\boldmathθ),

can be computed iteratively as

 ~\boldmathθ(r+1)=~\boldmathθ(r)+{˙α(1)/2}−1{∑tWtxtx′t}−1{∑tWtxtyt}.

-estimator: Here and . Hence

 ˙α(1)=μE[|ϵ|/(1+|ϵ|)2]

and

 ~\boldmathθ=\boldmathθ+{μ2E[|ϵ|(1+|ϵ|)2]}−1[n∑t=1{˙^\boldmathvt(% \boldmathθ)˙^\boldmathvt(\boldmathθ% )′^v2t(\boldmathθ)}]−1n∑t=1⎡⎣μ|Xt|^v1/2t(\boldmathθ)+|Xt|−1⎤⎦⎧⎪⎨⎪⎩˙^\boldmathvt(\boldmathθ)^vt(\boldmathθ)⎫⎪⎬⎪⎭.

With

 Wt=1/^v2t(\boldmathθ),xt=˙^% \boldmathvt(\boldmathθ),yt=μ|Xt|^vt(\boldmathθ)^v1/2t(\boldmathθ)+|Xt|−^vt(\boldmathθ),

can be computed iteratively as

Cauchy-estimator: Here and . Hence

 ˙α(1)=E[4ϵ2/(1+ϵ2)2]

and

 ~\boldmathθ=\boldmathθ−{2E[ϵ2(1+ϵ2)2]}−1[n∑t=1{˙^\boldmathvt(\boldmathθ)˙^\boldmathvt(\boldmathθ)′^v2t(\boldmathθ)}]−1n∑t=1[1−2X2t^vt(\boldmathθ)+X2t]⎧⎪⎨⎪⎩˙^\boldmathvt(\boldmathθ)^vt(\boldmathθ)⎫⎪⎬⎪⎭.

With

 Wt=1/^v2t(\boldmathθ),xt=˙^% \boldmathvt(\boldmathθ),yt=2X2t^vt(\boldmathθ)^vt(\boldmathθ)+X2t−^vt(\boldmathθ),

can be computed iteratively as

 ~\boldmathθ(r+1)=~\boldmathθ(r)+{2E[ϵ2(1+ϵ2)2]}−1{∑tWtxtx′t}−1{∑tWtxtyt}.

4.2 Computation of bootstrap M-estimates

Here the relevant function is defined in (3.1) and the bootstrap estimate can be computed using the updating equation

 ~\boldmathθ∗ = \boldmathθ−{2/˙α(1)}[n∑t=1 wnt{˙^\boldmathvt(\boldmathθ)˙^\boldmathvt(\boldmathθ)′/^v2t(\boldmathθ)