Non-standard inference for augmented double autoregressive models with null volatility coefficients

05/06/2019 · Feiyu Jiang, et al. · Tsinghua University and The University of Hong Kong

This paper considers an augmented double autoregressive (DAR) model, which allows null volatility coefficients so as to circumvent the over-parameterization problem in the DAR model. Since the volatility coefficients may lie on the boundary of the parameter space, statistical inference based on the Gaussian quasi-maximum likelihood estimation (GQMLE) becomes non-standard, and its asymptotics require the data to have a finite sixth moment, which narrows the applicable scope for studying heavy-tailed data. To overcome this deficiency, this paper develops a systematic statistical inference procedure based on the self-weighted GQMLE for the augmented DAR model. Except for the Lagrange multiplier test statistic, the Wald, quasi-likelihood ratio and portmanteau test statistics are all shown to have non-standard asymptotics. The entire procedure is valid as long as the data are stationary, and its usefulness is illustrated by simulation studies and one real example.






1 Introduction

Modelling conditional mean and volatility dynamics together is of central importance in econometrics and finance. A myriad of specifications have been proposed for this purpose; among them, the double autoregressive (DAR) model has recently attracted much attention in the literature. It is defined as

$$y_t = \sum_{i=1}^{p} \phi_i y_{t-i} + \eta_t \sqrt{\omega + \sum_{i=1}^{p} \alpha_i y_{t-i}^2}, \qquad (1.1)$$

where $\omega > 0$, $\alpha_i > 0$, $\{\eta_t\}$ is a sequence of independent and identically distributed (i.i.d.) random variables with zero mean and unit variance, and $\eta_t$ is independent of $\{y_s : s < t\}$. Model (1.1) was first termed by Ling (2004), and it is a subclass of ARMA-ARCH models in Weiss (1984) and of nonlinear AR models in Cline and Pu (2004), but it is different from Engle's ARCH model if some $\phi_i \neq 0$.

As was shown in Ling (2007a), model (1.1) has an important feature that its Gaussian quasi-maximum likelihood estimator (GQMLE) is asymptotically normal as long as $y_t$ has a finite fractional moment, while the GQMLE of the ARMA-GARCH model (see, e.g., Ling (2007b) and Zhang and Ling (2015)) does not enjoy this property. This feature makes model (1.1) feasible and convenient for fitting the often observed heavy-tailed data in applications, but it relies on a crucial assumption that each volatility coefficient $\alpha_i$ has a positive lower bound, which might result in the over-parameterization problem. Moreover, both the conditional mean and volatility specifications in model (1.1) have the same order $p$. This could be another shortcoming of model (1.1) and narrow down its applications. Motivated by these facts, this paper considers an augmented DAR (ADAR) model of order $(p, q)$:


$$y_t = \sum_{i=1}^{p} \phi_i y_{t-i} + \eta_t \sqrt{\omega + \sum_{j=1}^{q} \alpha_j y_{t-j}^2}, \qquad (1.2)$$

where all notations are inherited from model (1.1) except that $\alpha_j \ge 0$, and the conditional mean and volatility specifications can have different orders $p$ and $q$. With these exceptions, we are able to cope with the over-parameterization problem by checking whether some coefficients are significantly different from zero in model (1.2). However, this makes the statistical inference of model (1.2) non-standard, since the volatility coefficient $\alpha_j$ is allowed to lie on the boundary of the parameter space (see, e.g., Gouriéroux et al. (1982), Andrews (1999, 2001), Francq and Zakoïan (2007, 2009), Iglesias and Linton (2007), Cavaliere et al. (2017) and Pedersen (2017)). Also, when $\alpha_j$ is allowed to be zero, Francq and Zakoïan (2007) have demonstrated that the GQMLE of the ARCH model (i.e., model (1.2) with all $\phi_i = 0$) requires a finite sixth moment of $y_t$ for its asymptotics, and this makes the GQMLE of model (1.2) deficient in handling heavy-tailed data with an infinite sixth moment in many circumstances.
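To fix ideas, the ADAR data-generating process can be simulated directly. The sketch below assumes standard normal innovations $\eta_t$ (any zero-mean, unit-variance i.i.d. choice would do) and illustrative parameter values; it is not tied to any particular empirical calibration in the paper.

```python
import numpy as np

def simulate_adar(n, phi, alpha, omega, burn=500, seed=0):
    """Simulate an ADAR(p, q) path:
    y_t = sum_i phi_i*y_{t-i} + eta_t*sqrt(omega + sum_j alpha_j*y_{t-j}^2),
    with i.i.d. standard normal eta_t (an illustrative choice)."""
    rng = np.random.default_rng(seed)
    p, q = len(phi), len(alpha)
    m = max(p, q)
    y = np.zeros(n + burn + m)
    eta = rng.standard_normal(n + burn + m)
    for t in range(m, n + burn + m):
        mean = sum(phi[i] * y[t - 1 - i] for i in range(p))
        vol2 = omega + sum(alpha[j] * y[t - 1 - j] ** 2 for j in range(q))
        y[t] = mean + eta[t] * np.sqrt(vol2)
    return y[-n:]  # drop the burn-in segment

# ADAR(1, 2) example: mean order p=1, volatility order q=2,
# with alpha_2 = 0 sitting on the boundary of the parameter space
path = simulate_adar(1000, phi=[0.3], alpha=[0.2, 0.0], omega=1.0)
```

Setting a volatility coefficient exactly to zero, as with `alpha[1]` here, is precisely the boundary situation that makes the inference in this paper non-standard.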

This paper contributes to the literature in three aspects. First, a self-weighted GQMLE (S-GQMLE) is proposed for model (1.2), and its limiting distribution is shown to be the projection of a normal vector onto a convex cone via a quadratic approximation. Based on this S-GQMLE, the Wald, Lagrange multiplier and quasi-likelihood ratio tests are constructed to examine the nullity of some coefficients; their limiting distributions are established under both null and local alternative hypotheses, and their power performance is investigated under local alternative hypotheses. As a special case of interest, testing the null hypothesis that one coefficient equals zero is also studied. When null volatility coefficients are allowed, the estimation and testing based on the GQMLE for conditional variance models are well known for their non-standard asymptotics (see, e.g., Andrews (1999, 2001), Francq and Zakoïan (2007, 2009) and Pedersen (2017)), but fewer attempts have been made to study these asymptotics in the presence of a conditional mean structure. Our study on the S-GQMLE and its related tests for model (1.2) fills this gap. Interestingly, we find that even when null volatility coefficients exist, the S-GQMLE of the conditional mean parameter in model (1.2) is always asymptotically normal, a property that generally does not hold for the ARMA-GARCH model. Hence, if we only examine the nullity of the conditional mean coefficients in model (1.2), the Wald, Lagrange multiplier and quasi-likelihood ratio tests can be implemented with standard asymptotics. In contrast, when the volatility coefficients are included in the nullity examination, the asymptotics of these three tests become non-standard. In view of this important feature of model (1.2), we can use these three tests to first detect the nullity of the conditional mean coefficients by standard asymptotics, and then detect the nullity of the volatility coefficients by non-standard asymptotics. We emphasize that the preceding two-step procedure is not applicable to the ARMA-GARCH model in general, since the distribution of the GQMLE of its conditional mean parameter is non-standard when null volatility coefficients are present.

Second, motivated by Wong and Ling (2005), we propose a new mixed portmanteau test to check the adequacy of model (1.2). Diagnostic checking for model adequacy is important in time series analysis. The seminal work of Ljung and Box (1978) constructed a portmanteau test for the conditional mean model, and later a similar portmanteau test was developed for the volatility model in Li and Mak (1994). Both portmanteau tests and their many variants have the standard chi-squared limiting null distribution; see, e.g., Zhu (2016) and the references therein. When null volatility coefficients are allowed in model (1.2), we find that our mixed portmanteau test has a non-standard limiting null distribution, which is no longer the standard chi-squared distribution. This result is new to the literature, and it reveals that null volatility coefficients have a non-ignorable effect on model diagnostic checking. To implement our mixed portmanteau test in practice, we shall apply the Wald, Lagrange multiplier and quasi-likelihood ratio tests to obtain a reduced ADAR model with all positive volatility coefficients, and then use the standard chi-squared limiting null distribution for our mixed test.

Third, our entire statistical inference procedure is valid as long as $\{y_t\}$ is stationary, and hence it has a wide applicable scope for dealing with heavy-tailed data. Heavy-tailedness is often observed in empirical data (see, e.g., Rachev (2003), Hill (2015) and Zhu and Ling (2015)). When null volatility coefficients exist in ARCH-type models, the statistical inference methods in Francq and Zakoïan (2007, 2009) and Pedersen (2017) require $y_t$ to have a finite sixth moment. In contrast, our methodology imposes no moment restriction on $y_t$, owing to the use of the S-GQMLE, which is motivated by the self-weighting technique in Ling (2005). The self-weighting technique is necessary only when $y_t$ has an infinite sixth moment, and its idea is to apply self-weight functions to reduce the effect of leverage points so that no moment condition on $y_t$ is needed. We emphasize that the ARMA-GARCH model with the S-GQMLE in Ling (2007b) is also applicable to heavy-tailed data. However, the asymptotics of the S-GQMLE in Ling (2007b) do not allow null volatility coefficients, and hence no statistical inference method has been proposed in their presence. Finally, the importance of our entire methodology is illustrated by simulation studies and one real example.

The remainder of the paper is organized as follows. Section 2 presents the S-GQMLE and establishes its asymptotics. Section 3 constructs three tests for the null coefficients and obtains their asymptotics. Section 4 analyzes the power of these three tests. Section 5 proposes a portmanteau test for model diagnostic checking. Simulation results are reported in Section 6, and one real example is given in Section 7. Technical proofs of all theorems are relegated to the Appendices.

Throughout the paper, $A'$ is the transpose of a matrix $A$, $\|A\|$ is the Frobenius norm of a matrix $A$, $\langle x, y \rangle_{\Sigma} = x' \Sigma y$ is the inner product induced by a positive definite matrix $\Sigma$, $\|x\|_{\Sigma} = \sqrt{\langle x, x \rangle_{\Sigma}}$ is the corresponding norm of $x$, $\Phi(\cdot)$ is the c.d.f. of the standard normal random variable, $I(\cdot)$ is the indicator function, and $\to_d$ denotes convergence in distribution.

2 Self-weighted Gaussian quasi-maximum likelihood estimation

Let $\theta = (\gamma', \delta')'$ be the unknown parameter of model (1.2), where $\gamma = (\phi_1, \ldots, \phi_p)'$ and $\delta = (\omega, \alpha_1, \ldots, \alpha_q)'$. Assume that the observations $\{y_1, \ldots, y_n\}$ are generated from model (1.2) with the true value $\theta_0 = (\gamma_0', \delta_0')'$. Given the observations, the self-weighted Gaussian quasi-maximum likelihood estimator (S-GQMLE) of $\theta_0$ is $\hat\theta_{sn}$, which is defined as

$$\hat\theta_{sn} = \arg\max_{\theta \in \Theta} L_{sn}(\theta), \qquad L_{sn}(\theta) = \sum_{t=1}^{n} w_t \ell_t(\theta),$$

where $\Theta$ is the parameter space, $w_t = w(y_{t-1}, y_{t-2}, \ldots)$ is the self-weight with $w(\cdot)$ being a measurable, real, positive and bounded function, and

$$\ell_t(\theta) = -\frac{1}{2} \log h_t(\theta) - \frac{\varepsilon_t^2(\theta)}{2 h_t(\theta)}$$

with $\varepsilon_t(\theta) = y_t - \sum_{i=1}^{p} \phi_i y_{t-i}$ and $h_t(\theta) = \omega + \sum_{j=1}^{q} \alpha_j y_{t-j}^2$. Particularly, when $w_t \equiv 1$, the S-GQMLE reduces to the classical GQMLE in Ling (2004).
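As an illustration, the self-weighted objective above can be maximized numerically. The sketch below is a minimal implementation under simplifying assumptions: standard normal data as a stand-in sample, a hypothetical bounded self-weight depending on $y_{t-1}$ only, and box constraints that keep $\omega$ positive while allowing each $\alpha_j$ to sit on the boundary at zero.

```python
import numpy as np
from scipy.optimize import minimize

def sgqmle(y, p, q, w):
    """Maximize the self-weighted Gaussian quasi-log-likelihood
    sum_t w_t * ( -0.5*log h_t(theta) - eps_t(theta)^2 / (2*h_t(theta)) )
    for an ADAR(p, q) sketch; theta = (phi_1..phi_p, omega, alpha_1..alpha_q)."""
    m = max(p, q)

    def neg_loglik(theta):
        phi, omega, alpha = theta[:p], theta[p], theta[p + 1:]
        t = np.arange(m, len(y))
        eps = y[t] - sum(phi[i] * y[t - 1 - i] for i in range(p))
        h = omega + sum(alpha[j] * y[t - 1 - j] ** 2 for j in range(q))
        return -np.sum(w[t] * (-0.5 * np.log(h) - eps ** 2 / (2.0 * h)))

    theta0 = np.r_[np.zeros(p), 1.0, np.full(q, 0.1)]
    # alpha_j >= 0: the boundary value 0 is admissible, as in Assumption 2.2
    bounds = [(-0.99, 0.99)] * p + [(1e-4, None)] + [(0.0, None)] * q
    return minimize(neg_loglik, theta0, bounds=bounds, method="L-BFGS-B")

rng = np.random.default_rng(1)
y = rng.standard_normal(400)              # toy data: phi = 0, omega = 1, alpha = 0
w = np.r_[1.0, 1.0 / (1.0 + y[:-1] ** 2) ** 2]  # bounded self-weight from y_{t-1}
res = sgqmle(y, p=1, q=1, w=w)
```

Here the true parameter has $\alpha_1 = 0$ on the boundary, so the fitted `res.x[2]` is pinned at or near zero, which is exactly the situation driving the non-standard asymptotics below.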

To obtain the asymptotic properties of the S-GQMLE, the following four assumptions are needed.

Assumption 2.1.

$\{y_t\}$ is strictly stationary and ergodic.

Assumption 2.2.

The parameter space $\Theta$ is compact with $|\phi_i| \le \overline{\phi}$ for $1 \le i \le p$, $\underline{\omega} \le \omega \le \overline{\omega}$, and $0 \le \alpha_j \le \overline{\alpha}$ for $1 \le j \le q$, where $\overline{\phi}$, $\underline{\omega}$, $\overline{\omega}$ and $\overline{\alpha}$ are all finite positive constants.

Assumption 2.3.

, where .

Assumption 2.4.

The matrix is positive definite, where and .

We offer some remarks on the aforementioned assumptions. Assumption 2.1 is a mild setting for time series models. When $p = q = 1$, a sufficient and necessary condition for Assumption 2.1 was obtained in Borkovec and Klüppelberg (2001) and Chen et al. (2014). When $\max(p, q) > 1$, a sufficient yet complicated condition for Assumption 2.1 is available in Ling (2007a).

Assumption 2.2 allows the volatility coefficients $\alpha_j$ to be zero. In Ling (2004, 2007a), each $\alpha_i$ is required to have a positive lower bound so that the GQMLE only needs a finite fractional moment of $y_t$ for its asymptotic normality. However, the requirement that each $\alpha_i$ be bounded away from zero is stringent and could cause the trouble of over-parameterization. Under Assumption 2.2, the over-parameterization problem can be solved, but as a trade-off, the asymptotic distribution of the GQMLE becomes non-standard and requires $E y_t^6 < \infty$ (see, e.g., Francq and Zakoïan (2007) and Pedersen (2017)). In applications, the finiteness of $E y_t^6$ could be restrictive for two reasons. First, this moment condition does not allow us to deal with many heavy-tailed data. Second, this moment condition gives us a small admissible parameter space. As a simple illustration, we consider a DAR(1) model:


$$y_t = \phi y_{t-1} + \eta_t \sqrt{\omega + \alpha y_{t-1}^2}. \qquad (2.3)$$

When $\eta_t \sim N(0, 1)$, Table 1 gives the constraints on the parameter $(\phi, \alpha)$ for strict stationarity and for the 2nd, 4th and 6th moments of $y_t$, and Fig 1 displays these constraints graphically. From this figure, we can see that the region for the 6th moment is much smaller than that for strict stationarity. Hence, it is practically important to relax the moment condition on $y_t$ so that the admissible parameter space is enlarged as much as possible.

Condition | Constraint | Region in Fig 1
Strict stationarity | $E \log |\phi + \eta_t \sqrt{\alpha}| < 0$ | I+II+III+IV
$E y_t^2 < \infty$ | $\phi^2 + \alpha < 1$ | II+III+IV
$E y_t^4 < \infty$ | $\phi^4 + 6\phi^2\alpha + 3\alpha^2 < 1$ | III+IV
$E y_t^6 < \infty$ | $\phi^6 + 15\phi^4\alpha + 45\phi^2\alpha^2 + 15\alpha^3 < 1$ | IV
Table 1: Parameter constraints for model (2.3) with $\eta_t \sim N(0, 1)$
Figure 1: Regions of strict stationarity (I+II+III+IV), 2nd moment (II+III+IV), 4th moment (III+IV) and 6th moment (IV) of $y_t$ in model (2.3).
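The moment constraints for the Gaussian DAR(1) case can be checked numerically. The closed forms below follow from the standard criterion $E(\phi + \eta_t\sqrt{\alpha})^{2k} < 1$ for $E|y_t|^{2k} < \infty$, expanded using $E\eta_t^2 = 1$, $E\eta_t^4 = 3$ and $E\eta_t^6 = 15$; they are stated here as assumptions consistent with the nesting of the regions in Fig 1, and the Monte Carlo check below validates the sixth-moment expansion.

```python
import numpy as np

def moment_condition(phi, alpha, k):
    """Value of E(phi + eta*sqrt(alpha))^(2k) for eta ~ N(0, 1);
    E|y_t|^(2k) < infinity iff this value is below 1."""
    if k == 1:
        return phi ** 2 + alpha
    if k == 2:
        return phi ** 4 + 6 * phi ** 2 * alpha + 3 * alpha ** 2
    if k == 3:
        return (phi ** 6 + 15 * phi ** 4 * alpha
                + 45 * phi ** 2 * alpha ** 2 + 15 * alpha ** 3)
    raise ValueError("k must be 1, 2 or 3")

# Monte Carlo check of the k = 3 closed form at (phi, alpha) = (0.5, 0.3):
# this point violates the 6th-moment condition (value > 1) while phi^2 + alpha
# = 0.55 < 1 keeps the 2nd moment finite, illustrating how the regions shrink.
rng = np.random.default_rng(0)
eta = rng.standard_normal(2_000_000)
mc = np.mean((0.5 + eta * np.sqrt(0.3)) ** 6)
closed = moment_condition(0.5, 0.3, 3)
```
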

Assumption 2.3 plays a key role in relaxing the moment condition on $y_t$. When $y_t$ has a finite sixth moment, it is valid without the weight (i.e., with $w_t \equiv 1$). When $y_t$ has an infinite sixth moment, the weight $w_t$ is introduced to reduce the effect of leverage points by shrinking their weights in the objective function, so that no moment condition on $y_t$ is needed, at the sacrifice of some efficiency. This idea was initiated by Ling (2005), and it has been adopted in many studies; see, e.g., Ling (2005, 2007b), Pan et al. (2007), Francq and Zakoïan (2010), Zhu and Ling (2011, 2015) and Yang and Ling (2017). In practice, the selection of $w_t$ is similar to that of the influence function in Huber (1996). For example, we can follow Horváth and Liese (2004) to choose


or we can follow Ling (2005) to choose


where $C > 0$ is a constant, which is chosen empirically as the 90% or 95% sample percentile of $\{|y_1|, \ldots, |y_n|\}$. However, when the second moment of $y_t$ does not exist, the 95% empirical percentile of $\{|y_t|\}$ might be very large, leading to a malfunction of (2.5). Therefore, we prefer to use the weight in (2.4) subsequently, but leave the selection of the optimal weight as an open problem.
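To illustrate the self-weighting idea, here is one concrete truncation-type weight in the spirit of (2.5): lags below the empirical $C$-quantile get full weight, larger lags are smoothly down-weighted. The exact functional form (the quantile level and the power) is an illustrative assumption, not the paper's (2.4) or (2.5).

```python
import numpy as np

def truncation_weight(y, c_quantile=0.90, power=4):
    """Illustrative self-weight: w_t = 1 if |y_{t-1}| <= C, and
    (C/|y_{t-1}|)^power otherwise, with C an empirical quantile of |y|.
    Bounded in (0, 1], so heavy-tailed leverage points contribute little."""
    C = np.quantile(np.abs(y), c_quantile)
    lag = np.r_[0.0, np.abs(y[:-1])]   # for a sketch, w_t depends on y_{t-1} only
    return np.where(lag <= C, 1.0, (C / np.maximum(lag, 1e-12)) ** power)

rng = np.random.default_rng(2)
y = rng.standard_t(df=2, size=5000)    # heavy-tailed sample: infinite variance
w = truncation_weight(y)
```

By construction the largest observations receive weights near zero, which is how the objective function is shielded from leverage points without any moment condition on $y_t$.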

Assumption 2.4 is a general condition for deriving the asymptotic distribution of the S-GQMLE. As shown in Wilkins (1944), this assumption is equivalent to a non-degeneracy condition that is satisfied when $\eta_t$ is continuous.

Next, we define the limiting quantities used in the asymptotic results below.

We are ready to give our first main result on the consistency and asymptotic distribution of the S-GQMLE.

Theorem 2.1.

Suppose that Assumptions 2.1-2.3 hold. Then,

the S-GQMLE converges to $\theta_0$ almost surely (a.s.) as $n \to \infty$;

if Assumption 2.4 further holds, then as $n \to \infty$,

where and with for , and

Theorem 2.1 implies that when $\theta_0$ is not an interior point of $\Theta$ (i.e., some of its volatility coefficients are on the boundary), the limiting distribution of the normalized S-GQMLE is no longer Gaussian but the projection of a Gaussian random variable onto a convex cone with a metric induced by the inner product $\langle \cdot, \cdot \rangle_{\Sigma}$. The uniqueness of such a projection is guaranteed by the convexity of the cone. Particularly, when $\theta_0$ is an interior point of $\Theta$, the cone is the whole Euclidean space, and the limiting distribution is Gaussian.

Write $\hat\theta_{sn} = (\hat\gamma_{sn}', \hat\delta_{sn}')'$ and $\theta_0 = (\gamma_0', \delta_0')'$. In view of the fact that $\Sigma$ is a block diagonal matrix, by Theorem 4 in Andrews (1999), it is not hard to see that


where the limiting covariance matrix is formed from the corresponding blocks of $\Sigma$ and $\Omega$. The result (2.6) implies that $\hat\gamma_{sn}$ is always asymptotically normal, no matter whether null volatility coefficients exist. This important feature guarantees that we can first examine the significance of the conditional mean coefficients by standard statistical inference methods, and then implement the statistical inference for the significance of the volatility coefficients. The validity of this two-step procedure rests mainly on the fact that the matrix $\Sigma$ is block diagonal; the matrix $\Omega$ need not be block diagonal, allowing $\hat\gamma_{sn}$ and $\hat\delta_{sn}$ to be asymptotically correlated. Note that the matrix $\Sigma$ is the expectation of the Hessian matrix of the objective function. For the ARMA-GARCH model, the corresponding matrix is not block diagonal in general, and hence the GQMLE of the conditional mean parameter may not be asymptotically normal if null volatility coefficients exist.
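The projection in Theorem 2.1 can be made concrete numerically: given a Gaussian draw $z$, the limit solves $\min_x (x - z)' \Sigma (x - z)$ subject to nonnegativity of the boundary coordinates. The matrices and indices below are illustrative stand-ins, not quantities estimated from any model.

```python
import numpy as np
from scipy.optimize import minimize

def project_onto_cone(z, Sigma, nonneg_idx):
    """Project z onto the convex cone {x : x_j >= 0 for j in nonneg_idx}
    in the metric <x, y> = x' Sigma y, i.e. minimize (x - z)' Sigma (x - z).
    This mimics the limiting law in Theorem 2.1: the boundary volatility
    coordinates are constrained to be nonnegative."""
    bounds = [(0.0, None) if j in nonneg_idx else (None, None)
              for j in range(len(z))]
    obj = lambda x: (x - z) @ Sigma @ (x - z)
    res = minimize(obj, np.maximum(z, 0.0), bounds=bounds, method="L-BFGS-B")
    return res.x

Sigma = np.array([[2.0, 0.3], [0.3, 1.0]])  # an illustrative positive definite metric
z = np.array([1.0, -0.8])                   # one draw of the Gaussian limit
x = project_onto_cone(z, Sigma, nonneg_idx={1})
```

Because the objective is strictly convex, the projected point is unique: here the negative boundary coordinate is clipped to zero, and the free coordinate shifts slightly (to 0.88) to compensate through the off-diagonal metric term.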

3 Testing for null coefficients

In this section, we consider the Wald, Lagrange multiplier (LM) and quasi-likelihood ratio (QLR) tests to detect whether some coefficients are equal to zero in model (1.2). Since the significance of the conditional mean coefficients can be examined beforehand, we focus only on tests for the null volatility coefficients.

We split the true parameter $\theta_0$ into three parts such that $\theta_0 = (\theta_{01}', \theta_{02}', \theta_{03}')'$. Without loss of generality, $\theta_{01}$ contains the conditional mean coefficients as well as the volatility coefficients that are strictly larger than zero, $\theta_{02}$ contains the volatility coefficients to be tested, and $\theta_{03}$ contains the remaining volatility coefficients on the boundary. Note that from now on, the order of components in $\gamma_0$ and $\delta_0$ is changed according to this splitting of $\theta_0$. Our null hypothesis is set as $H_0: \theta_{02} = 0$.

Under $H_0$, the nuisance coefficient vector is allowed to lie on the boundary. This setting is similar to that in Pedersen (2017), and it is more general than that in Francq and Zakoïan (2009), which only considers the case where no nuisance parameters lie on the boundary.

To construct our test statistics, the following notations are needed:

where and . With these notations, we denote

where the restricted S-GQMLE is the maximizer of the self-weighted quasi-log-likelihood under $H_0$. Our Wald, LM and QLR test statistics are defined as

respectively, and their limiting null distributions are given in the following theorem.

Theorem 3.1.

Suppose that Assumptions 2.1-2.4 hold. Then, under $H_0$, as $n \to \infty$,




where and are defined as in Theorem 2.1,


with . Particularly, when , .

Theorem 3.1 shows that except for the LM test, the limiting null distributions of the Wald and QLR tests are not intuitive. The LM test has a standard chi-squared limiting null distribution, since its limiting distribution depends on that of the score, which is asymptotically normal under $H_0$ (see (A.6) in Appendix A). On the contrary, the Wald and QLR tests have non-standard limiting null distributions, since their limiting distributions rely on that of the normalized S-GQMLE, which is not asymptotically normal under $H_0$, as shown in Theorem 2.1(ii).

By letting


be the estimators of the limiting matrices in (3.1), we propose an algorithm, which is similar to Algorithm 1 in Pedersen (2017), to calculate the critical values of the Wald and QLR tests in practice.

Algorithm 3.1.

Simulated critical values of the Wald and QLR tests

  1. Draw from and then compute .

  2. Find that minimizes for and that minimizes for .

  3. Calculate and .

  4. Repeat steps 1-3 $B$ times to obtain $B$ realizations of the Wald and QLR statistics. At the level $\alpha$, the critical values of the Wald and QLR tests are the corresponding empirical upper-$\alpha$ sample percentiles of these realizations.
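The mechanics of Algorithm 3.1 can be sketched as follows. The covariance matrix, the set of boundary indices and the simple clipping projection (exact only when the metric is diagonal) are illustrative assumptions standing in for the paper's estimated quantities.

```python
import numpy as np

def simulated_critical_value(Omega_hat, nonneg_idx, level=0.05, B=50_000, seed=0):
    """Sketch of Algorithm 3.1 for a Wald-type statistic: draw Z ~ N(0, Omega_hat),
    project onto the cone {x : x_j >= 0 for j in nonneg_idx} by coordinatewise
    clipping (the exact metric projection when the metric is diagonal), and
    return the empirical (1 - level) percentile of the squared boundary part."""
    rng = np.random.default_rng(seed)
    L = np.linalg.cholesky(Omega_hat)
    Z = rng.standard_normal((B, Omega_hat.shape[0])) @ L.T
    lam = np.maximum(Z[:, sorted(nonneg_idx)], 0.0)
    stats = np.sum(lam ** 2, axis=1)
    return float(np.quantile(stats, 1.0 - level))

# One boundary coefficient with unit variance: the statistic behaves like
# max(Z, 0)^2, a 50:50 mixture of a point mass at 0 and a chi-squared(1),
# whose 95% quantile is z_{0.95}^2, well below the chi-squared(1) value 3.84.
cv = simulated_critical_value(np.eye(1), {0})
```
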

To implement Algorithm 3.1, we need to know the nuisance parameters on the boundary. Hence, Algorithm 3.1 is only applicable when either no nuisance parameters on the boundary exist under $H_0$, or the true nuisance parameters on the boundary are known under $H_0$. If unknown nuisance parameters lie on the boundary, it is so far unclear how to obtain the critical values of the Wald and QLR tests. In practice, we recommend using both tests by presuming that no unknown nuisance parameters lie on the boundary. If $H_0$ is rejected in this case, we then obtain a reduced ADAR model, from which we could find supportive evidence for the rejection if our tests imply that all of its coefficients are non-zero.

Besides the simulation method in Algorithm 3.1, one may use the “$m$ out of $n$ bootstrap” method, as in Politis and Romano (1994) and Andrews (2000), to compute the critical values of the Wald and QLR tests. Although this bootstrap method is valid in theory, its performance could be sensitive to the choice of the subsampling size $m$ (as demonstrated by our unreported simulation results), while how to choose the optimal $m$ remains unsolved in our time series setting. For this reason, we recommend using Algorithm 3.1 for convenience.

In many situations, we are most interested in testing the nullity of one volatility coefficient:

Besides the Wald, LM and QLR tests, a $t$-type test statistic is often used in practice to detect this hypothesis, where

and the standard error in the denominator is the square root of the corresponding diagonal element of the estimated covariance matrix. Particularly, when a single coefficient is tested, the Wald statistic is the square of this $t$-type statistic.

When the nuisance parameters on the boundary are known, we can apply Algorithm 3.1 to find the critical values of the $t$-type, Wald and QLR tests. When no nuisance parameters lie on the boundary, the $t$-type, Wald, LM and QLR tests have simpler critical regions, which are closely related to the standard normal and chi-squared distributions.

Theorem 3.2.

Suppose that Assumptions 2.1-2.4 hold and no nuisance parameters lie on the boundary. Then, the $t$-type, Wald, LM and QLR tests of asymptotic significance level $\alpha$ for the nullity of one volatility coefficient are defined by the critical regions

respectively, where , and and are defined in (3.2).

The preceding theorem demonstrates that when the coefficient lies on the boundary of the parameter space, the standard critical regions of the $t$-type, Wald and QLR tests are not correct, except for the LM test. In other words, if we were to use the conventional critical regions for the $t$-type, Wald and QLR tests, we would encounter a distorted size problem for these three tests.
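The size distortion can be seen in a one-line Monte Carlo. Under the boundary null, the Wald statistic for one volatility coefficient behaves like $\max(Z, 0)^2$, a $\tfrac12\chi^2_0 + \tfrac12\chi^2_1$ mixture (an assumption consistent with the non-standard limits above), rather than $\chi^2_1$; using the conventional $\chi^2_1$ critical value then halves the actual rejection rate.

```python
import numpy as np
from scipy.stats import norm, chi2

# Under the boundary null the Wald statistic behaves like max(Z, 0)^2.
rng = np.random.default_rng(3)
Z = rng.standard_normal(1_000_000)
wald = np.maximum(Z, 0.0) ** 2

# Nominal 5% test with the conventional chi-squared(1) critical value:
# actual size is P(Z > 1.96) = 2.5%, not 5% -- the distorted size problem.
size_standard = np.mean(wald > chi2.ppf(0.95, df=1))

# Mixture-based critical value z_{0.95}^2 restores the correct 5% size.
size_correct = np.mean(wald > norm.ppf(0.95) ** 2)
```
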

4 Power analysis

This section studies the efficiency of the Wald, LM and QLR tests via the Pitman analysis.

First, we need the asymptotic distribution of the S-GQMLE under sequences of local alternatives to the true parameter $\theta_0$. Consider local parameters of the form $\theta_0 + n^{-1/2} h$ for a fixed direction $h$ such that the local parameter remains admissible for sufficiently large $n$. When $n$ is sufficiently large, we can define a strictly stationary solution to

where $\eta_t$ is defined as in model (1.2). Based on the locally generated observations, the S-GQMLE is



with the objective function defined as in Section 2. Below, we impose a stronger sufficient assumption on the self-weight function to derive the asymptotics under the local alternatives.

Assumption 2.3$'$.

for some , where .

Denote by $P_n$ the law of the sample under the local alternatives. Then, we have the following result.

Theorem 4.1.

Suppose that Assumptions 2.1-2.2 and 2.4 hold. Then, under the local alternatives,

if Assumption 2.3$'$ holds,

in probability as $n \to \infty$;


if Assumption 2.3$'$ holds, then as $n \to \infty$, where

with being defined as in Theorem 2.1.

We emphasize that since the log-likelihood ratio is not asymptotically normal in our setting, it seems difficult to apply the classical Le Cam’s third lemma to prove Theorem 4.1(ii); see Francq and Zakoïan (2009) for more discussions. In this paper, we show Theorem 4.1(ii) in a direct way.

Let $\chi^2_d(\mu)$ be the noncentral chi-squared distribution with noncentrality parameter $\mu$ and $d$ degrees of freedom. The asymptotic distributions of all three test statistics under the local alternatives are given as follows.

Theorem 4.2.

Suppose that the conditions in Theorem 4.1(ii) hold and no nuisance parameters lie on the boundary. Then, under the local alternatives, as $n \to \infty$,



where and are defined in (3.1), is defined as in Theorem 4.1, and

with being defined as in Theorem 2.1.

For all four tests in Theorem 3.2, the following theorem shows that the local asymptotic power of the $t$-type, Wald and QLR tests is the same, and it is higher than that of the LM test.

Theorem 4.3.

Suppose that the conditions in Theorem 4.1(ii) hold and no nuisance parameters lie on the boundary. Then, the local asymptotic power of the $t$-type, Wald and QLR tests is

and the local asymptotic power of the LM test is

where , and .
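The power comparison in Theorem 4.3 can be illustrated numerically. The formulas below are the Francq–Zakoïan-type expressions for a single boundary coefficient under a local drift $\mu$, stated here as assumptions about the form of the (elided) displays above: the one-sided tests reject on one normal tail at level $\alpha$, the LM test on two tails at level $\alpha/2$ each.

```python
import numpy as np
from scipy.stats import norm

def power_wald_type(mu, alpha=0.05):
    """Assumed local power of the one-sided t-type/Wald/QLR tests for one
    boundary coefficient under drift mu: P(Z + mu > z_{1-alpha})."""
    return 1.0 - norm.cdf(norm.ppf(1.0 - alpha) - mu)

def power_lm(mu, alpha=0.05):
    """Assumed local power of the two-sided LM test under the same drift:
    P(|Z + mu| > z_{1-alpha/2})."""
    z = norm.ppf(1.0 - alpha / 2)
    return (1.0 - norm.cdf(z - mu)) + norm.cdf(-z - mu)

# Both powers equal alpha at mu = 0; for mu > 0 the one-sided tests dominate.
mus = np.linspace(0.0, 4.0, 41)
gap = power_wald_type(mus) - power_lm(mus)
```

The non-negative gap reflects the intuition of Theorem 4.3: since the alternative is one-sided (a volatility coefficient can only deviate upward from zero), tests that exploit this direction are locally more powerful than the two-sided LM test.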

In addition to the Pitman analysis, the Bahadur slopes as in Bahadur (1960) under fixed alternatives are also established for the three test statistics in the supplementary material (Jiang et al. (2019)). However, as in Francq and Zakoïan (2009), a formal comparison of the Bahadur slopes of all considered tests is not easy, since the slopes are unknown in closed form, particularly when the self-weight function is included.

5 Model checking

This section proposes a new portmanteau test to check the adequacy of model (1.2). Define the self-weighted innovation and the self-weighted squared innovation