Adaptive inference for a semiparametric GARCH model

07/09/2019 ∙ by Feiyu Jiang, et al. ∙ Tsinghua University ∙ The University of Hong Kong

This paper considers a semiparametric generalized autoregressive conditional heteroscedastic (S-GARCH) model, which has a smooth long run component with unknown form to depict time-varying parameters, and a GARCH-type short run component to capture the temporal dependence. For this S-GARCH model, we first estimate the time-varying long run component by the kernel estimator, and then estimate the non-time-varying parameters in short run component by the quasi maximum likelihood estimator (QMLE). We show that the QMLE is asymptotically normal with the usual parametric convergence rate. Next, we provide a consistent Bayesian information criterion for order selection. Furthermore, we construct a Lagrange multiplier (LM) test for linear parameter constraint and a portmanteau test for model diagnostic checking, and prove that both tests have the standard chi-squared limiting null distributions. Our entire statistical inference procedure not only works for the non-stationary data, but also has three novel features: first, our QMLE and two tests are adaptive to the unknown form of the long run component; second, our QMLE and two tests are easy-to-implement due to their related simple asymptotic variance expressions; third, our QMLE and two tests share the same efficiency and testing power as those in variance target method when the S-GARCH model is stationary.




1 Introduction

Since the seminal work of Engle (1982) and Bollerslev (1986), a huge number of conditional heteroscedastic models have been proposed to capture and forecast the volatility of economic and financial return data. Among them, the generalized autoregressive conditional heteroscedastic (GARCH) model is perhaps the most influential one in empirical studies. However, the GARCH model is often used under the stationarity assumption. Due to business cycles, technological progress, preference changes and policy switches, the underlying structure of data may change over time (see Hansen (2001)). Hence, a non-stationary GARCH model with time-varying parameters seems more appropriate to fit the return data in many applications; see, e.g., Mikosch and Stărică (2004), Stărică and Granger (2005), Engle and Rangel (2008), Fryzlewicz, Sapatinas and Subba Rao (2008), Patilea and Raïssi (2014), Truquet (2017) and the references therein.

Motivated by this, this paper considers a semiparametric GARCH (S-GARCH) model of order given by


for , where is a positive smooth deterministic function with unknown form on the interval , is a covariance stationary GARCH process with , and , and is a sequence of independent and identically distributed (i.i.d.) random variables with . The S-GARCH model was first introduced by Feng (2004). This model is stationary when (a positive constant parameter); otherwise, it is non-stationary. The specification that is a function of ratio rather than time was initiated by Robinson (1989), and since then, it has become a common scaling scheme in the time series literature; see, e.g., Dahlhaus and Subba Rao (2006), Cai (2007), Cavaliere and Taylor (2007), Xu and Phillips (2008), Zhou and Wu (2009), Zhang and Wu (2012), Zhou and Shao (2013), Vogt (2015), and Zhu (2019), to name just a few. Let be the information set up to time . Under (1.1)–(1.2), we have that and . Hence, the time-varying long run component allows the (conditionally) heteroscedastic structure of to change over time, and a GARCH-type short run component further captures the temporal dependence of .
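A small simulation may help fix ideas. The sketch below assumes the model takes the multiplicative form y_t = sqrt(tau(t/T)) * u_t, where u_t = sqrt(g_t) * eta_t is a GARCH(1,1) process with g_t = omega + alpha * u_{t-1}^2 + beta * g_{t-1} and omega = 1 - alpha - beta, so that u_t has unit variance. The symbol names, the order (1,1), the Gaussian innovations, the burn-in length and the particular function `tau_example` are illustrative assumptions, not choices taken from the paper.

```python
import math
import random


def simulate_sgarch11(T, tau, alpha, beta, seed=0, burn_in=500):
    """Simulate y_t = sqrt(tau(t/T)) * u_t with u_t a unit-variance GARCH(1,1).

    Assumed form (illustrative): u_t = sqrt(g_t) * eta_t,
    g_t = omega + alpha * u_{t-1}^2 + beta * g_{t-1}, omega = 1 - alpha - beta.
    """
    if not (alpha >= 0.0 and beta >= 0.0 and alpha + beta < 1.0):
        raise ValueError("need alpha, beta >= 0 and alpha + beta < 1")
    rng = random.Random(seed)
    omega = 1.0 - alpha - beta  # normalisation that gives E[u_t^2] = 1
    g, u = 1.0, 0.0  # start the recursion at the unconditional variance
    for _ in range(burn_in):  # discard the start-up phase of the short run component
        g = omega + alpha * u * u + beta * g
        u = math.sqrt(g) * rng.gauss(0.0, 1.0)
    y, us = [], []
    for t in range(1, T + 1):
        g = omega + alpha * u * u + beta * g
        u = math.sqrt(g) * rng.gauss(0.0, 1.0)
        us.append(u)
        y.append(math.sqrt(tau(t / T)) * u)  # scale by the long run component
    return y, us


def tau_example(x):
    # hypothetical smooth, strictly positive long run component on [0, 1]
    return 1.0 + 0.5 * math.sin(2.0 * math.pi * x)


y, u = simulate_sgarch11(T=2000, tau=tau_example, alpha=0.1, beta=0.8)
```

Replacing `tau_example` by a constant function gives the stationary special case mentioned above.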

The S-GARCH model strikes a balance between generality and parsimony. First, this model nests many existing ones according to different choices of and the order . For instance, when , it becomes the usual covariance stationary GARCH model in Bollerslev (1986); when (i.e., and ), it forms the time-varying variance model in Stărică and Granger (2005); when is an exponential quadratic spline function and , it turns into the spline GARCH model in Engle and Rangel (2008); when , it gives rise to the univariate time-varying GARCH model in Hafner and Linton (2010); and when is a general logistic transition function, it reduces to the smooth-transition GARCH model in Amado and Teräsvirta (2013). By allowing for an unspecified form of and a higher order , the S-GARCH model is general enough to reduce the risk of model misspecification.

Second, the S-GARCH model can be viewed as a parsimonious version of the time-varying GARCH model in Subba Rao (2006) and Chen and Hong (2016). The time-varying GARCH model generalizes the locally stationary ARCH model in Dahlhaus and Subba Rao (2006), and it has the following form:


for , where , and are smooth deterministic functions with unknown form on the interval . Clearly, the S-GARCH model is a special case of model (1.3) with


Although model (1.3) is very general in form, its statistical inference is complex because it requires nonparametric estimation of functions, and its prediction performance may not be good due to the so-called “data-snooping bias” (see White (2000)). Therefore, reducing complexity in model (1.3) seems necessary to improve model fit or forecast accuracy, and the S-GARCH model is designed for this purpose. Truquet (2017) made similar efforts by proposing a semiparametric time-varying ARCH() model (i.e., some of and are constant functions and in model (1.3)). When , the S-GARCH model is a parsimonious version of Truquet’s model. When , the S-GARCH model can capture the temporal persistence of by including the GARCH terms , while Truquet’s model cannot. This difference makes the two models incompatible in terms of their frameworks and inference methodologies. Interestingly, if is Lipschitz continuous, the ratios and in (1.4) are of order , and hence the S-GARCH model can be locally approximated by the following model:


for , where . When , model (1.5) reduces to the one studied by Patilea and Raïssi (2014) and Truquet (2017), where its usefulness was demonstrated via many applications.

This paper aims to provide a complete statistical inference procedure for the S-GARCH() model in (1.1)–(1.2). First, motivated by Feng (2004), we propose a two-step estimation procedure to estimate the nonparametric function and the unknown parameter vector in the parametric process . Specifically, we consider a kernel estimator for at step one, and based on the estimates of from step one, we next estimate the unknown parameter vector in by the quasi maximum likelihood estimator (QMLE) at step two. Under the identification condition , we establish the asymptotic normality of both estimators. Moreover, we show the consistency of the Bayesian information criterion for order selection in the S-GARCH model. Our two-step estimation method shares a similar idea with the variance target (VT) estimation method in Francq, Horváth and Zakoïan (2011), which is applicable for the stationary S-GARCH model (i.e., ). The only difference is that our first step estimator of is non-parametric, while the first step estimator of in the VT method is parametric. It turns out that our method requires more involved proof techniques but gives a much broader application scope to handle the non-stationary data.

Via the QMLE, we further construct a new Lagrange multiplier (LM) test to detect whether the parameters and satisfy a linear constraint in the S-GARCH model, and show that this LM test has the chi-squared limiting null distribution. As a special case, our LM test can be used to examine whether some of and are zeros, and this is particularly interesting in many applications. Bollerslev (1986) derived an LM test for this purpose in the stationary GARCH model; however, his LM test is invalid for the non-stationary S-GARCH model. For the S-GARCH model with , the score test in Patilea and Raïssi (2014) can be used to check the nullity of all , and the Wald test in Truquet (2017) can be applied to detect the nullity of some of . However, neither of them is feasible for the S-GARCH model with . On the contrary, our new LM test is easy to implement in both cases.

Finally, we develop a new portmanteau test to check the adequacy of the S-GARCH model. Goodness-of-fit testing is important for GARCH model applications, but it has not been attempted under a time-varying or semiparametric framework. Our proposed portmanteau test fills this gap by checking whether the squared model residuals from the above two-step estimation method are an uncorrelated sequence. This idea resembles the one in Li and Mak (1994), where a portmanteau test was constructed for the stationary GARCH model. See also Escanciano and Lobato (2009), Zhu (2016), Zheng, Li and Li (2017) and references therein for more variants of the portmanteau test for the stationary GARCH model. We show that our new portmanteau test has the same standard chi-squared limiting distribution as the existing ones, and it is valid for the S-GARCH model with time-varying , a setting in which the lack of such a test has so far precluded many practical applications under non-stationarity.

Besides the ability to cope with the non-stationary data, our proposed methodologies have three important novel features that make them more appealing in practice. First, the asymptotic variance of our QMLE is adaptive to the unknown form of . Consequently, the efficiency of our QMLE and the power of our LM and portmanteau tests are invariant regardless of the form of . Second, the asymptotic variance of our QMLE is shown to have a much simpler expression than the one in Hafner and Linton (2010), where they studied the same QMLE as ours for the S-GARCH() model. Although the two asymptotic variances are equivalent, the simple asymptotic variance found by us makes our entire inference procedure easy to implement in practice. Third, when the S-GARCH model is stationary, our QMLE is asymptotically as efficient as the QMLE in the second step estimation of the VT method, although the first step estimator of our method has a slower convergence rate than that of the VT method. On the contrary, when the S-GARCH model is non-stationary, our QMLE is still valid with the same efficiency as in the stationary case due to its adaption feature, while the QMLE in the VT method is no longer applicable. Hence, compared with the QMLE in the VT method, our QMLE not only has the additional benefit of handling the non-stationary S-GARCH model, but also avoids the efficiency loss in studying the stationary S-GARCH model. Our LM and portmanteau tests inherit a similar feature in terms of testing power. To the best of our knowledge, the aforementioned three novel features are unveiled for the first time in the literature, and they are further highlighted by our simulation studies.

The remainder of the paper is organized as follows. Section 2 presents the two-step estimation procedure, establishes its related asymptotics, and studies the order selection. Section 3 gives an LM test for the linear parameter constraint. Section 4 introduces a portmanteau test and obtains its limiting null distribution. Section 5 makes a comparison with other estimation methods. Simulation results are reported in Section 6, and applications are given in Section 7. Concluding remarks are offered in Section 8. Proofs of all theorems are relegated to the Appendix, which can be found in the supplementary document.

The following notation is used throughout the paper. For a square matrix , is its transpose, is its trace, and is its spectral radius. For a random matrix , is its norm. Let be the real line, be the positive real line, be the integer part of , be the identity matrix of order , be the indicator function, be the convergence in probability, and be the convergence in distribution.

2 Two-step estimation

Let be the parameter vector in model (1.2), and be its true value, where is the parameter space. This section gives a two-step estimation procedure for the S-GARCH model in (1.1)–(1.2). Our procedure first estimates the nonparametric function in (1.1), and then estimates the parameter vector in (1.2).

2.1 Estimation of

This subsection provides a (Nadaraya-Watson) kernel estimator of . To this end, we first need the following assumption for the identification of :

Assumption 2.1.

; .

Assumption 2.1(i) is equivalent to the covariance stationarity of model (1.2), and Assumption 2.1(ii) is to ensure . Under Assumption 2.1, we have that , from which it is reasonable to estimate by

where with being a kernel function and being a bandwidth. Furthermore, since under mild conditions, it is more convenient to estimate by


To obtain the asymptotic distribution of , the following three assumptions are imposed:

Assumption 2.2.

is twice continuously differentiable; , where and are two positive constants.

Assumption 2.3.

is symmetric about zero, bounded and Lipschitz continuous with and ; and as .

Assumption 2.4.


Assumption 2.2(i) imposes a smoothness condition on , and similar conditions have been used in Feng (2004), Dahlhaus and Subba Rao (2006), Hafner and Linton (2010), and Chen and Hong (2016). If Assumption 2.2(i) is further relaxed to a weaker condition that is Lipschitz continuous as in Fryzlewicz, Sapatinas and Subba Rao (2008) and Truquet (2017), still has the asymptotic normality but with a slower order of its asymptotic bias, which may cause a nonignorable effect on the asymptotics of the estimator of ; see Remark 1 below for more discussions. Assumption 2.2(ii) is in line with the condition that has the positive lower and upper bounds in model (1.3). Assumption 2.3(i) holds for many often used kernels, and the bounded support condition on is just to simplify analysis. Assumption 2.3(ii) requires that converges to zero at a slower rate than , and later a more restrictive is needed for the asymptotics of the estimator of . Assumption 2.4 is stronger than Assumption 2.1(i), and it is used to ensure the asymptotic variance of is well defined.
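The kernel estimation step for the long run component can be sketched as follows. The code assumes the estimator smooths the squared observations y_t^2 over rescaled time t/T with weights K_h(v) = K(v/h)/h, either in ratio (Nadaraya-Watson) form or in the simpler form that divides by T instead of by the sum of the weights. The function names and the Epanechnikov kernel are illustrative assumptions; any kernel satisfying Assumption 2.3 could be used instead.

```python
def epanechnikov(v):
    """Epanechnikov kernel: symmetric about zero, bounded, supported on [-1, 1]."""
    return 0.75 * (1.0 - v * v) if abs(v) <= 1.0 else 0.0


def kernel_tau(y, x, h, kernel=epanechnikov, normalize=True):
    """Kernel estimate of the long run component at a point x in [0, 1].

    normalize=True  -> ratio (Nadaraya-Watson) form
    normalize=False -> (1/T) * sum_t K_h(x - t/T) * y_t^2, with K_h(v) = K(v/h)/h
    """
    T = len(y)
    weights = [kernel((x - t / T) / h) / h for t in range(1, T + 1)]
    num = sum(w * yt * yt for w, yt in zip(weights, y))
    if normalize:
        den = sum(weights)
        if den == 0.0:
            raise ValueError("no observations within the bandwidth window")
        return num / den
    return num / T
```

For a constant series with y_t^2 = 4, both forms return a value close to 4 at an interior point such as x = 0.5, whereas the unnormalized form returns roughly half of that at x = 0, because only one side of the kernel window contains data there.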

Let . The asymptotic normality of is given below:

Theorem 2.1.

Suppose Assumptions 2.1–2.4 hold. Then, for any ,


Remark 1.

If Assumption 2.2(i) is replaced by a weaker condition that is Lipschitz continuous, we can only claim that for any ,

where and are defined implicitly. In this case, the bias term has a slower order . Consequently, it seems challenging to show that the bias effect from the estimation of is negligible in the estimation of , whereas this negligibility is key to proving the -convergence of the estimator of . Hence, we resort to Assumption 2.2(i) for technical reasons.

Based on in (2.1), we estimate by . In practice, may have the boundary problem. To overcome this, we follow Chen and Hong (2016) to adopt the reflection method proposed by Hall and Wehrly (1991). That is, we generate pseudo data for and for , and then modify as


Intuitively, the reflection method makes the boundary points behave similarly to the interior ones. Similar to Chen and Hong (2016), it can be seen that the reflection method gives a bias term of order , and hence it does not affect the asymptotics of the estimator of . Although in (2.2) is used for numerical calculations, our proofs will be based on in the sequel to ease the presentation.
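The reflection idea can be illustrated with a short sketch. It assumes that the pseudo data mirror the sample about each end point, that is, y_t = y_{-t} for t = -m, ..., -1 and y_t = y_{2T-t} for t = T+1, ..., T+m with m the integer part of T*h, and that the kernel sum is then taken over the extended index range. This indexing, the function names and the Epanechnikov kernel are assumptions made for illustration and may differ in detail from the construction in (2.2).

```python
import math


def epanechnikov(v):
    """Epanechnikov kernel, supported on [-1, 1]."""
    return 0.75 * (1.0 - v * v) if abs(v) <= 1.0 else 0.0


def kernel_tau_reflected(y, x, h, kernel=epanechnikov):
    """Unnormalized kernel estimate at x using mirrored pseudo data at both ends."""
    T = len(y)
    m = int(math.floor(T * h))  # number of pseudo observations on each side
    if m >= T:
        raise ValueError("bandwidth too large for reflection")
    total = 0.0
    for t in range(-m, T + m + 1):
        if t == 0:
            continue  # no observation is indexed by t = 0
        if t < 0:
            yt = y[-t - 1]         # pseudo datum y_t = y_{-t}
        elif t <= T:
            yt = y[t - 1]          # observed datum
        else:
            yt = y[2 * T - t - 1]  # pseudo datum y_t = y_{2T - t}
        total += kernel((x - t / T) / h) / h * yt * yt
    return total / T
```

With a constant series satisfying y_t^2 = 4, T = 200 and h = 0.1, this estimate stays close to 4 at x = 0 and x = 1, which is the behaviour the reflection is meant to restore at the boundaries.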

2.2 Estimation of

This subsection considers the quasi maximum likelihood estimator (QMLE) of . Based on Assumption 2.1(ii), we can write the parametric in (1.2) as


By assuming that , the log-likelihood function (multiplied by negative two and ignoring constants) of is


However, is infeasible for computation, since are unobservable. Therefore, we have to replace by , and consider the following feasible log-likelihood function:


where , and is computed recursively by


with given constant initial values .
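The recursion and the feasible objective can be sketched for the GARCH(1,1) case. The code assumes that the identification condition fixes the intercept at omega = 1 - alpha - beta, that the rescaled series is u_hat_t = y_t / sqrt(tau_hat(t/T)), and that the objective is the average of u_hat_t^2 / g_t + log g_t. The function names, the initial values of one and the coarse grid search are illustrative assumptions; in practice a numerical optimizer would replace the grid.

```python
import math


def qml_objective(u_hat, alpha, beta, u0_sq=1.0, g0=1.0):
    """Average of u_t^2 / g_t + log g_t for a GARCH(1,1) with omega = 1 - alpha - beta."""
    omega = 1.0 - alpha - beta
    if omega <= 0.0 or alpha < 0.0 or beta < 0.0:
        return float("inf")  # outside the admissible parameter region
    g_prev, u_prev_sq = g0, u0_sq  # constant initial values
    total = 0.0
    for u in u_hat:
        g = omega + alpha * u_prev_sq + beta * g_prev  # recursive conditional variance
        total += u * u / g + math.log(g)
        g_prev, u_prev_sq = g, u * u
    return total / len(u_hat)


def grid_qmle(u_hat, step=0.05):
    """Crude grid search over (alpha, beta) with alpha + beta < 1."""
    best = (float("inf"), None, None)
    n = int(round(1.0 / step))
    for i in range(n):
        for j in range(n - i):
            a, b = i * step, j * step
            val = qml_objective(u_hat, a, b)
            if val < best[0]:
                best = (val, a, b)
    return best
```

Calling `grid_qmle(u_hat)` returns the smallest objective value found together with the corresponding (alpha, beta) pair on the grid.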

Based on in (2.5), our QMLE of is defined as

To establish the asymptotics of , the following additional assumptions are imposed:

Assumption 2.5.

is compact; is an interior point of ; if , the polynomials and have no common root.

Assumption 2.6.

has a continuous and almost surely positive density on with ; for some .

Assumption 2.7.

, where is defined as in Assumption 2.6(ii).

Assumption 2.8.

for some and .

Assumption 2.5 is regular, and it has been used by Horváth and Kokoszka (2003) and Francq and Zakoïan (2004) to study the QMLE for the stationary GARCH model. Assumption 2.6(i) gives the identification condition for based on the QMLE, and ensures that the GARCH process is -mixing (see, e.g., Carrasco and Chen (2002)). Assumption 2.6(ii) is stronger than the condition that , which is necessary to derive the asymptotic normality of the QMLE for the stationary GARCH model; see Hall and Yao (2003). Assumption 2.7 is stronger than Assumption 2.4, and as shown in Francq and Zakoïan (2004), it is not needed for the asymptotic normality of the QMLE in the stationary GARCH model. We resort to the stronger conditions of and in Assumptions 2.6(ii) and 2.7 due to the existence of in the S-GARCH model. Assumption 2.8 requires a more restrictive condition on the bandwidth than Assumption 2.3(ii), and similar conditions have been adopted by Feng (2004), Hafner and Linton (2010), Patilea and Raïssi (2014), and Truquet (2017). The reason is that undersmoothing is needed to make the estimation bias from negligible so that the -convergence of holds.

Denote , , with , and


Now, we are ready to give the asymptotics of in the following theorem:

Theorem 2.2.

Suppose Assumptions 2.1–2.3 and 2.5–2.7 hold. Then,

as ;

furthermore, if Assumption 2.3 is replaced by Assumption 2.8,

where , and and are defined in (2.7).

Remark 2.

We can simply estimate by its sample version , where




Here, with , with , and . Under the conditions of Theorem 2.2, it is not hard to see that as .

Remark 3.

It is worth noting that our proof techniques are different from those in Feng (2004), which seem not to be rigorous and lead to an incorrect asymptotic variance of .

Interestingly, the preceding theorem shows that the asymptotic variance of is independent of . Following the viewpoint of Robinson (1987), it means that is adaptive to the unknown form of . This adaption feature ensures that the efficiency of and the power of its related tests are unchanged regardless of the form of .

In Truquet (2017), a projection-based weighted least squares estimator (WLSE) was proposed for model (1.5) with . However, this projection-based WLSE is not adaptive, and its extension to the case of seems difficult due to the existence of unobservable GARCH terms .

2.3 Order selection

To use the S-GARCH model in practice, we need to determine suitable orders and . This subsection studies the Bayesian information criterion (BIC) for this purpose. Based on , we compute (i.e., the QMLE for a given ), and then define the BIC as follows:

where is defined in (2.5). Denote the true values of and as and , respectively. Based on the BIC, our selected order is defined as


The consistency of is given in the following theorem.

Theorem 2.3.

Suppose the conditions in Theorem 2.2 hold. Then,

3 The LM test

Since Engle (1982) and Bollerslev (1986), testing for the nullity of the parameters in the GARCH model has been important in applications. This problem can be further generalized to consider the following linear constraint hypothesis:


where is a given matrix of rank , and is a given constant vector. In this section, we construct a Lagrange multiplier (LM) test statistic for , where

Here, is the constrained QMLE of under , and and are defined in the same way as and , respectively, with replaced by . The following theorem gives the limiting null distribution of :

Theorem 3.1.

Suppose the conditions in Theorem 2.2 hold. Then, under ,


is the chi-squared distribution with the degrees of freedom


Based on Theorem 3.1, we can set the rejection region of at level as where is -upper percentile of .

As , our has the adaption feature, and it has a much broader application scope than the existing LM tests. Specifically, the LM test in Bollerslev (1986) is only applicable for the stationary GARCH model, but our can also handle the non-stationary S-GARCH model. For the case of , the score test in Patilea and Raïssi (2014) can detect the null hypothesis that all are zeros, and the Wald test in Truquet (2017) can check the null hypothesis that some of are zeros. However, these two tests are not applicable for the general cases, and their extensions to include GARCH parameters seem non-trivial. Besides , the Wald and likelihood ratio tests could also be constructed for . When some of or are allowed to be zeros as in our setting, the Wald and likelihood ratio tests have non-standard limiting null distributions (see Francq and Zakoïan (2010) for general discussions), which have to be simulated by the bootstrap method. For practical convenience, we thus only focus on the LM test in this paper, and the consideration of Wald and likelihood ratio tests is left for future study.

4 Portmanteau test

Since Ljung and Box (1978), the portmanteau test and its variants have been a common tool for checking the model adequacy in time series analysis. For the stationary GARCH model, Li and Mak (1994) proposed a portmanteau test for model checking. However, their test is invalid for the non-stationary S-GARCH model. In this section, we follow the idea of Li and Mak (1994) to construct a new portmanteau test to check the adequacy of the S-GARCH model, and our test appears to be the first formal attempt in the context of semi-parametric time series analysis.

Let be the model residual defined as in (2.9). The idea of our portmanteau test is based on the fact that is a sequence of uncorrelated random variables under (1.1)–(1.2). Hence, if the S-GARCH model is correctly specified, it is expected that the sample autocorrelation function of at lag , denoted by , is close to zero, where

with being the sample mean of . Let for some integer , and


be a symmetric matrix, where with , with , and with . To facilitate our portmanteau test, we need the limiting distribution of in the following theorem:

Theorem 4.1.

Suppose the conditions in Theorem 2.2 hold. Then, if the S-GARCH model in (1.1)–(1.2) is correctly specified,

where , and and are defined in (4.1)–(4.2).

As in Remark 2, can be consistently estimated by its sample version . Based on , our portmanteau test is defined as

If the S-GARCH model is correctly specified, we have as by Theorem 4.1. Therefore, if the value of is larger than , the fitted S-GARCH model is inadequate at level . Otherwise, it is adequate at level . We shall highlight that also has the adaption feature as , and it essentially checks the adequacy of the short run GARCH component rather than the long run component , since the form of is unspecified in the S-GARCH model. In practice, the choice of lag depends on the frequency of the series, and one can often choose to be , which delivers 6, 9 or 12 for a moderate (see Tsay (2008)).
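The basic ingredient of the portmanteau test, the sample autocorrelations of the squared residuals, can be computed as in the sketch below. It assumes the usual definition in which the squared residuals are centred at their sample mean and the lag-k cross products are divided by the total sum of squared deviations. The function name is hypothetical, and the weighting matrix needed to form the full test statistic is not reproduced.

```python
def squared_residual_acf(resid, max_lag):
    """Sample autocorrelations of the squared residuals at lags 1, ..., max_lag."""
    sq = [e * e for e in resid]
    n = len(sq)
    if not 0 < max_lag < n:
        raise ValueError("max_lag must lie between 1 and len(resid) - 1")
    mean = sum(sq) / n  # sample mean of the squared residuals
    dev = [s - mean for s in sq]
    denom = sum(d * d for d in dev)
    if denom == 0.0:
        raise ValueError("squared residuals are constant")
    return [
        sum(dev[t] * dev[t - k] for t in range(k, n)) / denom
        for k in range(1, max_lag + 1)
    ]
```

Under a correctly specified model these autocorrelations should all be close to zero, so large values at several lags point to remaining dependence in the short run component.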

5 Comparisons with other estimation methods

This section compares our two-step estimation method with the three-step estimation method in Hafner and Linton (2010) and the variance target (VT) estimation method in Francq, Horváth and Zakoïan (2011).

5.1 Comparison with three-step estimation method

Our two-step estimation method is the same as the first two estimation steps in Hafner and Linton (2010), where they gave the following asymptotic normality result for the S-GARCH() model:

where with , , and

In view of the expression of , we can also find that is adaptive, but this has not been pointed out by Hafner and Linton (2010). Indeed, we can show that and are equivalent. Since involves three infinite summations , and , it is not easy to estimate. On the contrary, our has a much simpler expression, and it can be directly estimated as shown in Remark 2.

In Hafner and Linton (2010), they further proposed an updated estimator at step three, and claimed this updated estimator can achieve the semiparametric efficiency bound when . Following the idea of Hafner and Linton (2010), we can also update our estimator to at step three, where



Here, and are defined as in (2.9). Below, we give the limiting distribution of .

Theorem 5.1.

Suppose the conditions in Theorem 2.2 hold. Then,

where with and

The preceding theorem shows that cannot achieve the semiparametric efficiency bound, since . Hence, it seems unnecessary to consider the third estimation step in Hafner and Linton (2010). Note that the updating procedure in (5.1) was also given by Bickel, Klaassen, Ritov and Wellner (1993), in which they showed the updated estimator can achieve the semiparametric efficiency bound when the data are independent. However, when the data are dependent, their conclusion may not be true as demonstrated by Theorem 5.1. The failure of in our case possibly results from the violation of the following condition:


where is defined in the same way as in (5.1) with replaced by . In Bickel, Klaassen, Ritov and Wellner (1993), a similar condition as (5.2) was proved for the independent data. However, their technical treatment does not work in our time series data setting, since our kernel estimator using the data is correlated with , while this is not the case if