1 Introduction
Modelling conditional mean and volatility dynamics together is of extreme importance in econometrics and finance. A myriad of specifications have been proposed for the purpose, and among them, the double autoregressive (DAR) model has recently been attracting much attention in the literature, and it is defined as
$$y_t = \sum_{i=1}^{p} \phi_i y_{t-i} + \eta_t \sqrt{\omega + \sum_{i=1}^{p} \alpha_i y_{t-i}^2}, \qquad (1.1)$$
where $\omega > 0$, $\alpha_i > 0$, $\{\eta_t\}$ is a sequence of independent and identically distributed (i.i.d.) random variables with zero mean and unit variance, and $\eta_t$ is independent of $\{y_j; j < t\}$. Model (1.1) was first termed the DAR model by Ling (2004), and it is a subclass of the ARMA-ARCH models in Weiss (1984) and of the nonlinear AR models in Cline and Pu (2004), but it differs from Engle’s ARCH model if some $\phi_i \neq 0$.
As shown in Ling (2007a), model (1.1) has the important feature that its Gaussian quasi-maximum likelihood estimator (GQMLE) is asymptotically normal as long as $y_t$ has a finite fractional moment, whereas the GQMLE of the ARMA-GARCH model (see, e.g., Ling (2007b) and Zhang and Ling (2015)) does not share this property. This feature makes model (1.1) convenient for fitting the heavy-tailed data often observed in applications, but it relies on the crucial assumption that each volatility coefficient has a positive lower bound, which might result in overparameterization. Moreover, the conditional mean and volatility specifications in model (1.1) have the same order $p$. This could be another shortcoming of model (1.1) and narrow its range of applications. Motivated by these facts, this paper considers an augmented DAR (ADAR) model of order $(p, q)$:
$$y_t = \sum_{i=1}^{p} \phi_i y_{t-i} + \eta_t \sqrt{\omega + \sum_{i=1}^{q} \alpha_i y_{t-i}^2}, \qquad (1.2)$$
where all notations are inherited from model (1.1) except that $\alpha_i \geq 0$, and the conditional mean and volatility specifications can have different orders $p$ and $q$. With these relaxations, we are able to cope with the overparameterization problem by checking whether some coefficients are significantly different from zero in model (1.2). However, this makes the statistical inference for model (1.2) nonstandard, since the volatility coefficient $\alpha_i$ is allowed to lie on the boundary of the parameter space (see, e.g., Gouriéroux et al. (1982), Andrews (1999, 2001), Francq and Zakoïan (2007, 2009), Iglesias and Linton (2007), Cavaliere et al. (2017) and Pedersen (2017)). Also, when $\alpha_i$ is allowed to be zero, Francq and Zakoïan (2007) have demonstrated that the GQMLE of the ARCH model (i.e., model (1.2) with and ) requires a finite sixth moment of $y_t$ for its asymptotics, and this makes the GQMLE of model (1.2) inadequate in many circumstances for heavy-tailed data with an infinite sixth moment.
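To make the data-generating mechanism concrete, the following is a minimal simulation sketch. It assumes that the ADAR($p$, $q$) recursion takes the usual double autoregressive form $y_t = \sum_{i=1}^{p} \phi_i y_{t-i} + \eta_t \sqrt{\omega + \sum_{i=1}^{q} \alpha_i y_{t-i}^2}$ with standard normal innovations. The function name `simulate_adar`, the burn-in length and the parameter values are illustrative choices, not taken from the paper.

```python
import math
import random


def simulate_adar(n, phi, omega, alpha, burn_in=500, seed=0):
    """Simulate n observations from an ADAR(p, q) recursion (illustrative sketch).

    phi   : conditional mean coefficients (phi_1, ..., phi_p)
    omega : positive intercept of the volatility equation
    alpha : volatility coefficients (alpha_1, ..., alpha_q); zeros are allowed
    The innovations are assumed N(0, 1); this is an assumption of the sketch.
    """
    rng = random.Random(seed)
    p, q = len(phi), len(alpha)
    y = [0.0] * max(p, q, 1)          # zero start-up values
    for _ in range(n + burn_in):
        mean = sum(phi[i] * y[-1 - i] for i in range(p))
        var = omega + sum(alpha[i] * y[-1 - i] ** 2 for i in range(q))
        y.append(mean + rng.gauss(0.0, 1.0) * math.sqrt(var))
    return y[-n:]                      # drop start-up values and burn-in


# Example: an ADAR(1, 2) path whose second volatility coefficient is null.
path = simulate_adar(200, phi=(0.2,), omega=0.5, alpha=(0.3, 0.0), seed=1)
```

In this example the second volatility coefficient sits on the boundary of the parameter space, which is the situation the tests in this paper are designed to detect.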
This paper contributes to the literature in three aspects. First, a self-weighted GQMLE (SGQMLE) is proposed for model (1.2), and its limiting distribution is shown, via a quadratic approximation, to be the projection of a normal vector onto a convex cone. Based on this SGQMLE, the Wald, Lagrange multiplier and quasi-likelihood ratio tests are constructed to examine the nullity of some coefficients; their limiting distributions are established under both null and local alternative hypotheses, and their power performance is investigated under local alternative hypotheses. As a case of special interest, testing the null hypothesis that one coefficient equals zero is also studied. When null volatility coefficients are allowed, estimation and testing based on the GQMLE for conditional variance models are well known to have nonstandard asymptotics (see, e.g., Andrews (1999, 2001), Francq and Zakoïan (2007, 2009) and Pedersen (2017)), but fewer attempts have been made to study these asymptotics in the presence of a conditional mean structure. Our study of the SGQMLE and its related tests for model (1.2) fills this gap. Interestingly, we find that even when null volatility coefficients exist, the SGQMLE of the conditional mean parameter in model (1.2) is always asymptotically normal, and this property generally does not hold for the ARMA-GARCH model. Hence, if we only examine the nullity of the conditional mean coefficients in model (1.2), the Wald, Lagrange multiplier and quasi-likelihood ratio tests can be implemented with standard asymptotics. In contrast, when the volatility coefficients are included in the nullity examination, the asymptotics of these three tests become nonstandard. In view of this important feature of model (1.2), we can use these three tests to first detect the nullity of the conditional mean coefficients by standard asymptotics, and then detect the nullity of the volatility coefficients by nonstandard asymptotics. We emphasize that this two-step procedure is not applicable to the ARMA-GARCH model in general, since the null volatility coefficients make the limiting distribution of the GQMLE of its conditional mean parameter nonstandard.
Second, motivated by Wong and Ling (2005), we propose a new mixed portmanteau test to check the adequacy of model (1.2). Diagnostic checking for model adequacy is important in time series analysis. The seminal work in Ljung and Box (1978) constructed a portmanteau test for the conditional mean model, and later a similar portmanteau test was developed for the volatility model in Li and Mak (1994). Both portmanteau tests and their many variants have the standard chi-squared limiting null distribution; see, e.g., Zhu (2016) and references therein. When null volatility coefficients are allowed in model (1.2), we find that our mixed portmanteau test has a nonstandard limiting null distribution, which is no longer the standard chi-squared distribution. This result is new to the literature, and it reveals that the null volatility coefficients have a non-ignorable effect on model diagnostic checking. To implement our mixed portmanteau test in practice, we first apply the Wald, Lagrange multiplier and quasi-likelihood ratio tests to obtain a reduced ADAR model with all positive volatility coefficients, and then use the standard chi-squared limiting null distribution for our mixed test.
Third, our entire statistical inference procedure is valid as long as $y_t$ is stationary, and hence it has a wide scope of application to heavy-tailed data. Heavy-tailedness is often observed in empirical data (see, e.g., Rachev (2003), Hill (2015) and Zhu and Ling (2015)). When null volatility coefficients exist in ARCH-type models, the statistical inference methods in Francq and Zakoïan (2007, 2009) and Pedersen (2017) require $y_t$ to have a finite sixth moment. In contrast, our methodology imposes no moment restriction on $y_t$, owing to the use of the SGQMLE, which is motivated by the self-weighting technique in Ling (2005). The self-weighting technique is necessary only when $y_t$ has an infinite sixth moment, and its idea is to apply self-weight functions to reduce the effect of leverage data so that no moment condition on $y_t$ is needed. We emphasize that the ARMA-GARCH model with the SGQMLE in Ling (2007b) is also applicable to heavy-tailed data. However, the asymptotics of the SGQMLE in Ling (2007b) do not allow null volatility coefficients, and hence no statistical inference method is available there in the presence of null volatility coefficients. Finally, the importance of our methodology is illustrated by simulation studies and one real example.
The remainder of the paper is organized as follows. Section 2 presents the SGQMLE and establishes its asymptotics. Section 3 constructs three tests to test for the null coefficients and obtains their asymptotics. Section 4 analyzes the power of these three tests. Section 5 proposes a portmanteau test for the model diagnostic checking. Simulation results are reported in Section 6, and one real example is given in Section 7. Technical proofs of all theorems are relegated to Appendices.
Throughout the paper, is the transpose of a matrix , is the Frobenius norm of a matrix , for any is the inner product induced by a positive definite matrix , is the norm of , is the c.d.f. of standard normal random variable, is the indicator function, and denotes the convergence in distribution.
2 Self-weighted Gaussian quasi-maximum likelihood estimation
Let be the unknown parameter of model (1.2), where , and . Let . Assume that the observations are generated from model (1.2) with the true value , where and . Given the observations , the self-weighted Gaussian quasi-maximum likelihood estimator (SGQMLE) of is , which is defined as
(2.1) 
where is the parameter space, is the self-weighted function with being a measurable, real, positive and bounded function on , and
(2.2) 
with and . Particularly, when , the SGQMLE reduces to the classical GQMLE in Ling (2004).
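As a rough illustration of the estimator's objective, the sketch below evaluates a self-weighted Gaussian quasi-likelihood criterion for an ADAR($p$, $q$) model. It assumes the per-observation loss has the usual Gaussian form $\frac{1}{2}\log h_t + \varepsilon_t^2 / (2 h_t)$ with $\varepsilon_t = y_t - \sum_{i=1}^{p}\phi_i y_{t-i}$ and $h_t = \omega + \sum_{i=1}^{q}\alpha_i y_{t-i}^2$, possibly up to constants relative to (2.1) and (2.2). The function names and the weight `bounded_weight` are hypothetical and are not the weights in (2.4) or (2.5).

```python
import math


def unit_weight(lags):
    """Weight identically one: the criterion is then the classical GQMLE one."""
    return 1.0


def bounded_weight(lags):
    """A hypothetical positive, bounded weight that shrinks after large lags."""
    return 1.0 / (1.0 + sum(v * v for v in lags))


def sgqmle_objective(y, phi, omega, alpha, weight=unit_weight):
    """Average self-weighted Gaussian quasi-likelihood loss (to be minimised)."""
    p, q = len(phi), len(alpha)
    start = max(p, q)                                   # first usable time index
    total = 0.0
    for t in range(start, len(y)):
        lags = [y[t - 1 - i] for i in range(start)]     # y_{t-1}, y_{t-2}, ...
        eps = y[t] - sum(phi[i] * lags[i] for i in range(p))
        h = omega + sum(alpha[i] * lags[i] ** 2 for i in range(q))
        total += weight(lags) * (0.5 * math.log(h) + eps * eps / (2.0 * h))
    return total / (len(y) - start)


# Tiny hand-checkable example with p = q = 1 and a null volatility coefficient.
y = [1.0, 2.0, 0.0]
plain = sgqmle_objective(y, phi=(0.5,), omega=1.0, alpha=(0.0,))
weighted = sgqmle_objective(y, phi=(0.5,), omega=1.0, alpha=(0.0,),
                            weight=bounded_weight)
```

In practice this criterion would be minimised over a compact parameter space in which the volatility coefficients are constrained to be nonnegative, for example with a bound-constrained numerical optimiser.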
To obtain the asymptotic properties of , the following four assumptions are needed.
Assumption 2.1.
is strictly stationary and ergodic.
Assumption 2.2.
The parameter space is compact with , , , , and , , where , , , , and are all finite positive constants.
Assumption 2.3.
, where .
Assumption 2.4.
The matrix is positive definite, where and .
We offer some remarks on the aforementioned assumptions. Assumption 2.1 is a mild setting for time series models. When , a sufficient and necessary condition for Assumption 2.1 was obtained in Borkovec and Klüppelberg (2001) and Chen et al. (2014). When , a sufficient yet complicated condition for Assumption 2.1 is available in Ling (2007a).
Assumption 2.2 allows the volatility coefficient to be zero. In Ling (2004, 2007a), each $\alpha_i$ is required to have a positive lower bound so that the GQMLE only needs a finite fractional moment of $y_t$ for its asymptotic normality. However, requiring a positive lower bound for each $\alpha_i$ is stringent and could lead to overparameterization. Under Assumption 2.2, the overparameterization problem can be solved, but as a tradeoff, the asymptotic distribution of the GQMLE becomes nonstandard and requires a finite sixth moment of $y_t$ (see, e.g., Francq and Zakoïan (2007) and Pedersen (2017)). In applications, the finiteness of this sixth moment could be restrictive for two reasons. First, this moment condition rules out many heavy-tailed data sets. Second, this moment condition leaves only a small admissible parameter space. As a simple illustration, we consider a DAR(1) model:
(2.3) 
When , Table 1 gives the constraints on the parameter for strict stationarity and for finite 2nd, 4th, and 6th moments of $y_t$, and Fig 1 displays these constraints graphically. From this figure, we can see that the region for a finite 6th moment is much smaller than that for strict stationarity. Hence, it is practically important to relax the moment condition on $y_t$ so that the admissible parameter space is enlarged as much as possible.
Table 1
Condition | Constraint | Region in Fig 1
Strict stationarity | | I+II+III+IV
Finite 2nd moment | | II+III+IV
Finite 4th moment | | III+IV
Finite 6th moment | | IV
Assumption 2.3 plays a key role in relaxing the moment condition on $y_t$. When , it is valid without the weight (i.e., ). When , the weight is introduced to reduce the effect of leverage points by shrinking their weights in the objective function, so that no moment condition on $y_t$ is needed, at the cost of some efficiency. This idea was initiated by Ling (2005), and it has been adopted in many studies; see, e.g., Ling (2005, 2007b), Pan et al. (2007), Francq and Zakoïan (2010), Zhu and Ling (2011, 2015) and Yang and Ling (2017). In practice, the selection of is similar to that of the influence function in Huber (1996). For example, we can follow Horváth and Liese (2004) to choose
(2.4) 
or we can follow Ling (2005) to choose
(2.5) 
where for some constant , and is chosen as the 90% or 95% percentile of , empirically. However, when the second moment of does not exist, the 95% empirical percentile of might be very large, leading to malfunction of (2.5). Therefore, we prefer to use in (2.4) subsequently, but leave the selection of the optimal as an open problem.
Assumption 2.4 is general to derive the asymptotic distribution of . As shown in Wilkins (1944), this assumption is equivalent to that for any , which is satisfied for continuous .
Next, let and with
We are ready to give our first main result on the consistency and asymptotic distribution of .
Theorem 2.1.
almost surely (a.s.) as ;
if Assumption 2.4 further holds, as ,
where and with for , and
Theorem 2.1 implies that when is not an interior point of (i.e., some of its volatility coefficients are on the boundary), the limiting distribution of is no longer Gaussian but a projection of a Gaussian random variable onto the convex cone with a metric induced by the inner product . The uniqueness of such a projection is guaranteed by the convexity of . Particularly, when is an interior point of , we have and then .
Write and . Since is a block diagonal matrix, by Theorem 4 in Andrews (1999), it is not hard to see that
(2.6) 
where , , , and . The result (2.6) implies that is always asymptotically normal, regardless of whether null volatility coefficients exist. This important feature guarantees that we can first examine the significance of the conditional mean coefficients using standard statistical inference methods, and then carry out statistical inference for the significance of the volatility coefficients. The validity of this two-step procedure rests mainly on the matrix being block diagonal; the matrix does not need to be block diagonal, which allows and to be asymptotically correlated. Note that the matrix is the expectation of the Hessian matrix of the objective function. For the ARMA-GARCH model, the corresponding matrix is not block diagonal in general, and hence the GQMLE of the conditional mean parameter may not be asymptotically normal if null volatility coefficients exist.
3 Testing for null coefficients
In this section, we consider the Wald, Lagrange multiplier (LM) and quasi-likelihood ratio (QLR) tests to detect whether some coefficients are equal to zero in model (1.2). Since the significance of the conditional mean coefficients can be examined beforehand, we focus only on the tests for null volatility coefficients.
We split the true parameter into three parts such that , where for , and . Without loss of generality, we assume , and . That is, contains the conditional mean coefficients as well as the volatility coefficients that are strictly larger than zero, and contains all volatility coefficients on the boundary. Note that from now on, the order of the components in and is rearranged according to this partition of . Let and . Our null hypothesis is set as
Under , the nuisance coefficients vector allows on the boundary. This setting is similar to that in Pedersen (2017), and more general than that in Francq and Zakoïan (2009), which only considers the case of .
To construct our test statistics, the following notations are needed:
where and . With these notations, we denote
where is the restricted SGQMLE under . Our Wald, LM and QLR test statistics are defined as
respectively, and their limiting null distributions are given in the following theorem.
Theorem 3.1.
;
;
,
Theorem 3.1 shows that except for , the limiting null distributions of and are not intuitive. The test has a standard chi-squared limiting null distribution, since its limiting distribution depends on that of the score , which is asymptotically normal under (see (A.6) in Appendix A). In contrast, the tests and have nonstandard limiting null distributions, since their limiting distributions rely on that of , which is not asymptotically normal under , as shown in Theorem 2.1(ii).
By letting
(3.2) 
be the estimators of and in (3.1), we propose an algorithm, which is similar to Algorithm 1 in Pedersen (2017), to calculate the critical values of and in practice.
Algorithm 3.1.
Simulated critical values of and
1. Draw from and then compute .
2. Find that minimizes for and that minimizes for .
3. Calculate and .
4. Repeat steps 1-3 a given number of times to get and , where and are the realizations of and in each repetition, respectively. At the level , the critical values of and are the empirical sample percentiles of and based on and , respectively.
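The simulation idea behind such an algorithm can be sketched as follows. This is a generic illustration and not a transcription of Algorithm 3.1: it assumes two coefficients on the boundary, a Gaussian draw $Z \sim N(0, \Sigma)$ with a user-supplied $2 \times 2$ covariance matrix, the metric $\Omega = \Sigma^{-1}$, and a statistic equal to the squared $\Omega$-length of the projection of $Z$ onto the cone $[0, \infty)^2$. All function names and the choice of $\Sigma$ are hypothetical.

```python
import math
import random


def quad_form(v, m):
    """Return v' m v for a 2-vector v and a symmetric 2x2 matrix m."""
    return m[0][0] * v[0] ** 2 + 2.0 * m[0][1] * v[0] * v[1] + m[1][1] * v[1] ** 2


def project_onto_quadrant(z, omega):
    """Minimise (lam - z)' omega (lam - z) over lam in [0, inf)^2."""
    if z[0] >= 0.0 and z[1] >= 0.0:
        return z                                   # z already lies in the cone
    # Otherwise the minimiser is on an edge: solve each 1-D problem and clip at 0.
    edge1 = (max(0.0, z[0] + omega[0][1] * z[1] / omega[0][0]), 0.0)
    edge2 = (0.0, max(0.0, z[1] + omega[0][1] * z[0] / omega[1][1]))
    return min((edge1, edge2),
               key=lambda lam: quad_form((lam[0] - z[0], lam[1] - z[1]), omega))


def simulated_critical_value(sigma, level=0.05, reps=20000, seed=0):
    """Empirical (1 - level) percentile of the projected quadratic form."""
    (s11, s12), (_, s22) = sigma
    det = s11 * s22 - s12 * s12
    omega = ((s22 / det, -s12 / det), (-s12 / det, s11 / det))   # sigma^{-1}
    l11 = math.sqrt(s11)                                         # Cholesky factor
    l21 = s12 / l11
    l22 = math.sqrt(s22 - l21 * l21)
    rng = random.Random(seed)
    stats = []
    for _ in range(reps):
        u1, u2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        z = (l11 * u1, l21 * u1 + l22 * u2)                      # Z ~ N(0, sigma)
        stats.append(quad_form(project_onto_quadrant(z, omega), omega))
    stats.sort()
    return stats[math.ceil((1.0 - level) * reps) - 1]


cv = simulated_critical_value(((1.0, 0.0), (0.0, 1.0)))
```

With an identity covariance matrix the simulated statistic is $\max(Z_1, 0)^2 + \max(Z_2, 0)^2$, so the simulated 5% critical value should be close to that of the corresponding chi-bar-squared mixture and clearly below the conventional $\chi^2_2$ value.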
To implement Algorithm 3.1, we need to know and . Hence, Algorithm 3.1 is only applicable when either (i.e., no nuisance parameters on the boundary exist under ) or with a known (i.e., the true nuisance parameters on the boundary are known under ). If with an unknown , so far it is unclear how to obtain the critical values of and . In practice, we recommend using both and by presuming . If is rejected in this case, we then get a reduced ADAR model with coefficients , from which we could get supporting evidence of if our tests imply that all coefficients are nonzero.
Besides the simulation method in Algorithm 3.1, one may use the “$m$ out of $n$ bootstrap” method as in Politis and Romano (1994) and Andrews (2000) to compute the critical values of and . Although this bootstrap method is valid in theory, its performance could be sensitive to the choice of the subsampling size $m$ (as demonstrated by our unreported simulation results), and how to choose the optimal $m$ remains unsolved in our time series setting. For this reason, we recommend using Algorithm 3.1 for convenience.
In many situations, we are mostly interested in testing the nullity of one volatility coefficient:
Besides the Wald, LM and QLR tests, a t-type test statistic is often used in practice to detect , where
and is the square root of the th diagonal element of . Particularly, when , we have .
When with a known , we can apply Algorithm 3.1 to find the critical values of , and . When , , and have simpler critical regions, which are closely related to standard normal and chisquared distributions.
Theorem 3.2.
The preceding theorem demonstrates that when the coefficient lies on the boundary of the parameter space, the standard critical regions of , and are not correct, except . In other words, if we use the conventional critical regions for , and , we would encounter a size distortion problem for the latter three tests.
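The size distortion can be illustrated numerically. The sketch below assumes the standard boundary result for a single coefficient tested on the boundary with no other boundary nuisance parameters, namely that a Wald-type statistic has the limiting null law $\frac{1}{2}\delta_0 + \frac{1}{2}\chi^2_1$; this is an assumption of the sketch, since the display of Theorem 3.2 is not reproduced above. The function names are hypothetical.

```python
import math


def chi2_1_tail(c):
    """P(chi^2_1 > c) for c >= 0, computed from the standard normal tail."""
    return math.erfc(math.sqrt(c / 2.0))


def boundary_tail(c):
    """P(X > c) for c > 0 when X ~ 0.5 * (point mass at 0) + 0.5 * chi^2_1."""
    return 0.5 * chi2_1_tail(c)


def boundary_critical_value(level, lo=0.0, hi=50.0, tol=1e-10):
    """Solve boundary_tail(c) = level by bisection (requires level < 0.5)."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if boundary_tail(mid) > level:
            lo = mid          # tail still too heavy: move the cut-off up
        else:
            hi = mid
    return 0.5 * (lo + hi)


conventional_cv = 3.8415                           # 95% quantile of chi^2_1
actual_size = boundary_tail(conventional_cv)       # size when using the chi^2_1 cut-off
correct_cv = boundary_critical_value(0.05)         # cut-off giving a 5% test
```

Under this assumed mixture, using the conventional $\chi^2_1$ cut-off of about 3.84 yields an asymptotic size of about 2.5% rather than the nominal 5%, while the cut-off that delivers 5% is about 2.71, the 90% quantile of $\chi^2_1$.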
4 Power analysis
This section studies the efficiency of the Wald, LM and QLR tests via the Pitman analysis.
First, we need the asymptotic distribution of the SGQMLE under sequences of local alternatives to the true parameter . Let , where with and such that for sufficiently large . When is sufficiently large, we can define a strictly stationary solution to
where is defined as in model (1.2). Based on , the SGQMLE is
(4.1) 
where
with and . Below, we impose a stronger sufficient assumption on the self-weighted function to derive the asymptotics of .
Assumption 2.3.
for some , where .
Denote by the law of . Then, we have the following result.
Theorem 4.1.
We emphasize that since the log-likelihood ratio is not asymptotically normal in our setting, it seems difficult to apply the classical Le Cam’s third lemma to prove Theorem 4.1(ii); see Francq and Zakoïan (2009) for more discussions. In this paper, we show Theorem 4.1(ii) in a direct way.
Let be the noncentral chi-squared distribution with noncentrality parameter . The asymptotic distributions of all three test statistics under the local alternatives are given as follows.
For all four tests in Theorem 3.2, the following theorem shows that the local asymptotic power of the t-type, Wald and QLR tests is the same, and that it is higher than that of the LM test.
Theorem 4.3.
Suppose that the conditions in Theorem 4.1(ii) hold and . Then, the local asymptotic power of the t-type, Wald and QLR tests is
and the local asymptotic power of the LM test is
where , and .
In addition to the Pitman analysis, the Bahadur slopes as in Bahadur (1960) under fixed alternatives are also established for the three test statistics in the supplementary material (Jiang et al. (2019)). However, as in Francq and Zakoïan (2009), a formal comparison of Bahadur slopes for all considered tests is not easy, since , , and are unknown in closed form, particularly when the self-weighted function is included.
5 Model checking
This section proposes a new portmanteau test to check the adequacy of model (1.2). Define the self-weighted innovation and the self-weighted squared innovation . Accordingly, define the self-weighted residual