We consider a general class of affine causal time series models in a semiparametric setting. Let
be a sequence of centered independent and identically distributed (iid) random variables satisfying and a compact subset of (). For and any , define
Class : A process belongs to if it satisfies:
where are two measurable functions. The existence of a stationary and ergodic solution as well as the inference for the class have been addressed by Bardet and Wintenberger (2009). Numerous classical time series such as AR(), ARCH(), TARCH() or ARMA-GARCH models belong to this class (see Bardet and Wintenberger (2009)). This class of models has now been well studied; see for instance Bardet et al. (2012), Kengne (2012), Bardet and Kengne (2014) for change-point detection in this class; Bardet et al. (2017) for inference based on the Laplacian quasi-likelihood; Bardet et al. (2020), Kengne (2020) for model selection in this class.
We focus here on the epidemic change-point detection in the class . Assume that a trajectory of the process is observed and consider the following test hypotheses:
is a trajectory of the process with .
: there exists (with and ) such that belongs to .
The epidemic alternative H refers to the so-called epidemic period, which runs from to .
Several works in the literature are devoted to epidemic change-point detection in time series. We refer, among others, to Levin and Kline (1985), Yao (1993), Csörgö and Horváth (1997), Ramanayake and Gupta (2003), Račkauskas and Suquet (2004), Račkauskas and Suquet (2006), Guan (2007), Jarušková and Piterbarg (2011), Aston and Kirch (2012a, 2012b), Bucchia (2014), Graiche et al. (2016). As pointed out by Diop and Kengne (2021), most of these procedures are developed for epidemic change-point detection in the mean of random variables. The latter authors addressed this issue for a general class of integer-valued time series.
In this new contribution, we propose a test based on the Gaussian quasi-likelihood for the epidemic change-point detection in the class of affine causal models . Under the null hypothesis of no change, the proposed statistic converges to a distribution obtained from a difference between two Brownian bridges; this statistic diverges to infinity under the epidemic alternative. These findings lead to a test which has correct size asymptotically and is consistent in power.
The rest of the paper is organized as follows. Section 2 provides some assumptions and the definition of the Gaussian quasi-likelihood. Section 3 focuses on the construction of the test statistic and the asymptotic studies under the null and the epidemic alternative. Numerical results from a simulation study and a real data example are presented in Section 4. Section 5 is devoted to the proofs of the main results.
2 Assumptions and QMLE
Throughout the sequel, we use the following notations:
, for any ;
, for any matrix ; where denotes the set of matrices of dimension with coefficients in ;
for any function ;
for any such as .
In the sequel, 0 denotes the null vector of any vector space. For and any compact set , define
Assumption A (): Assume that and there exists a sequence of non-negative real numbers such that satisfying
where , , , are respectively replaced by , , , if .
For any , define
These Lipschitz-type conditions are notably useful when studying the existence of solutions of the class . If , then there exists a -weakly dependent stationary and ergodic solution satisfying (see Doukhan and Wintenberger (2008) and Bardet and Wintenberger (2009)).
Consider a trajectory of a process . If , then for any segment , the conditional Gaussian quasi-(log)likelihood computed on is given by,
where , and . In the sequel, we deal with an approximated quasi-(log)likelihood contrast given for any segment by,
with , and ; and consider the estimator,
The following assumptions are needed to study the asymptotic behavior of the estimator defined in (2.2). Assumption D: such that for all
Assumption Id(): For a process and for all ,
Assumption Var(): For a process , one of the families or is linearly independent.
Under H and the above assumptions, Bardet and Wintenberger (2009) established the consistency and the asymptotic normality of the estimator for the class .
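To make the approximated quasi-likelihood contrast concrete, the following Python sketch evaluates it on a segment, assuming the usual Gaussian form of the contrast, namely minus one half of the sum of (X_t - f_t)^2 / h_t + log h_t, where f_t and h_t are computed from the observed past only (unobserved values being replaced by zero). The function name `approx_quasi_loglik` and the callables `f` and `h` are hypothetical names introduced for illustration; the AR(1) example with constant conditional variance is likewise only an illustrative choice.

```python
import math


def approx_quasi_loglik(x, theta, f, h, start=0, end=None):
    """Approximated Gaussian quasi-log-likelihood on the segment x[start:end].

    f(past, theta) and h(past, theta) return the conditional mean and the
    conditional variance given the observed past (most recent value first).
    The past is truncated to the observed sample, which is the
    'approximated' version of the contrast (assumption of this sketch).
    """
    end = len(x) if end is None else end
    total = 0.0
    for t in range(start, end):
        past = x[:t][::-1]  # observed past, most recent first; empty at t = 0
        f_t = f(past, theta)
        h_t = h(past, theta)
        total += (x[t] - f_t) ** 2 / h_t + math.log(h_t)
    return -0.5 * total


# Illustrative AR(1) with constant variance: theta = (phi0, phi1, sigma2).
def ar1_mean(past, theta):
    last = past[0] if past else 0.0  # missing past replaced by zero
    return theta[0] + theta[1] * last


def ar1_var(past, theta):
    return theta[2]


value = approx_quasi_loglik([1.0, 2.0], (0.0, 0.5, 1.0), ar1_mean, ar1_var)
```

Maximizing this quantity over the parameter set, for instance with a numerical optimizer, would give the quasi-maximum likelihood estimator on the chosen segment.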
3 Test statistic and asymptotic results
Under H, recall that (see Bardet and Wintenberger (2009)), for the class , it holds that
where denotes the transpose. For any segment , consider the following matrices,
Under H, and are consistent estimators of and , respectively.
In the sequel, we follow the idea of Diop and Kengne (2021). Let , be two integer-valued sequences such that: and . For all , define the matrix
where , , are replaced by 0 if these matrices are not invertible. Also, define the set
For all , set
and consider the test statistic
As pointed out by Diop and Kengne (2021), this test statistic coincides with those proposed by Račkauskas and Suquet (2004) (statistic ), Jarušková and Piterbarg (2011) (statistic ), Bucchia (2014) (statistic ) or Aston and Kirch (2012) (statistic ) for the particular case of epidemic change-point detection in the mean. In this sense, the test considered here can be seen as a generalization of these procedures.
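For intuition about the special case of a change in the mean, here is a minimal Python sketch of a classical CUSUM-type epidemic statistic: the maximum over k1 < k2 of |S(k2) - S(k1) - (k2 - k1) * mean|, normalized by the sample standard deviation and the square root of n. This is an illustration only, not the quasi-likelihood statistic of this paper; the function name and the choice of normalization are assumptions.

```python
import math
from itertools import accumulate


def epidemic_mean_statistic(x):
    """CUSUM-type epidemic statistic for a change in the mean.

    Returns (statistic, k1_hat, k2_hat), where observations
    k1_hat + 1, ..., k2_hat form the estimated epidemic period.
    """
    n = len(x)
    mean = sum(x) / n
    # Plain sample standard deviation; it is inflated under the alternative,
    # so a more careful variance estimator may be preferable in practice.
    sd = math.sqrt(sum((v - mean) ** 2 for v in x) / (n - 1))
    s = [0.0] + list(accumulate(x))  # partial sums, s[k] = x_1 + ... + x_k
    best, k1_hat, k2_hat = -1.0, 0, 0
    for k1 in range(n):
        for k2 in range(k1 + 1, n + 1):
            dev = abs(s[k2] - s[k1] - (k2 - k1) * mean)
            if dev > best:
                best, k1_hat, k2_hat = dev, k1, k2
    return best / (sd * math.sqrt(n)), k1_hat, k2_hat


# Noise-free toy series: the mean shifts from 0 to 1 on observations 41..70.
toy = [0.0] * 40 + [1.0] * 30 + [0.0] * 30
stat, k1_hat, k2_hat = epidemic_mean_statistic(toy)
```

The double loop costs O(n^2) operations, which mirrors the fact that an epidemic statistic has to scan all candidate pairs of change-points.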
The following theorem provides the asymptotic behavior of the statistic under the null hypothesis. In the condition (3.6) in this theorem, we make the convention that if A holds, then for all and if A holds, then for all .
Under H with , assume that D, Id(), Var() (for the class ), A, A (or A) hold with
where is a -dimensional Brownian bridge.
For any , denote the -quantile of the distribution of . Therefore, at a nominal level , the critical region of the test is ; which leads to a procedure with correct size asymptotically. Table 1 of Diop and Kengne (2021) provides the values of for and .
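Quantiles of this kind can also be approximated by simulation. The sketch below assumes that the limit is the supremum, over pairs of time points, of the squared Euclidean norm of the difference of a d-dimensional Brownian bridge at the two points; the text above only states that the limit is built from a difference between two Brownian bridges, so this exact functional is an assumption. The function names, the grid size and the number of replications are illustrative choices, and the table of Diop and Kengne (2021) remains the reference for the critical values.

```python
import math
import random


def sup_bridge_difference(d, n_grid, rng):
    """One draw of max_{i<j} ||B(t_i) - B(t_j)||^2 on a regular grid of [0, 1],
    where B is a d-dimensional Brownian bridge (random-walk approximation)."""
    scale = 1.0 / math.sqrt(n_grid)
    paths = []
    for _coord in range(d):
        w = [0.0]
        for _step in range(n_grid):
            w.append(w[-1] + scale * rng.gauss(0.0, 1.0))
        # Turn the Brownian motion into a bridge: B(t) = W(t) - t * W(1).
        paths.append([w[i] - (i / n_grid) * w[-1] for i in range(n_grid + 1)])
    best = 0.0
    for i in range(n_grid):
        for j in range(i + 1, n_grid + 1):
            dist2 = sum((p[i] - p[j]) ** 2 for p in paths)
            if dist2 > best:
                best = dist2
    return best


def monte_carlo_quantile(d, alpha, n_rep=500, n_grid=100, seed=0):
    """Empirical (1 - alpha)-quantile of the simulated supremum."""
    rng = random.Random(seed)
    draws = sorted(sup_bridge_difference(d, n_grid, rng) for _ in range(n_rep))
    idx = max(0, min(n_rep - 1, math.ceil((1 - alpha) * n_rep) - 1))
    return draws[idx]
```

A call such as `monte_carlo_quantile(d=3, alpha=0.05)` would then give a rough critical value; the discretization of the supremum biases it slightly downwards, so a finer grid and more replications are advisable for serious use.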
For the asymptotic behavior under the epidemic alternative, the following additional condition is needed.
Assumption B: There exists such that (where denotes the integer part).
We have the following result.
Under with , assume that D, Id(), Var() (for the classes and ), A, A (or A) and (3.6) hold. Then,
This theorem shows that the proposed procedure is consistent in power. An estimator of the change-points under the epidemic alternative is given by
4 Some numerical results
This section presents some results of a simulation study and a real data example. For a sample size , the statistic is computed with and (see also Remark 1 in Kengne (2012)). The empirical levels and powers are obtained after 200 replications at the nominal level .
4.1 Simulation study
We consider the following models:
(i) ARMA(1,1) processes:
The parameter of the model is , where is a compact subset of such that: for all , . Since we can write for all ,
the model (4.1) belongs to the class with and for all . For this model, the Lipschitz-type conditions A () as well as D are automatically satisfied. Moreover, if is a non-degenerate random variable, then the assumptions Id() and Var() hold; and for any such that , . In the sequel, we deal with an ARMA(1,1) with a non zero mean (), an ARMA(1,1) with mean zero () and an AR(1) with a non zero mean ().
We consider the change-point test with an epidemic alternative where the parameter of the model is under H, and , under H. Firstly, two trajectories of an ARMA(1,1) with mean zero are generated: a trajectory under H with and a trajectory under H with breaks at , , . Figure 1 displays the statistic . One can see that, for the scenario without change, the values of this statistic are below the horizontal triangle which represents the limit of the critical region (see Figure 1(a)). Under the epidemic alternative, is greater than the critical value of the test and is reached around the points where the changes occur (see the dotted lines in Figure 1(b)).
(ii) GARCH(1,1) processes:
the parameter , a compact subset of such that: for all , . For all , we get
Therefore, the model (4.2) belongs to the class with and for all . The Lipschitz-type conditions A () hold automatically and D is satisfied with . In addition, if is a non-degenerate random variable, then the assumptions Id() and Var() hold; and for any such that , . In the sequel, we consider a GARCH(1,1) () and an ARCH(1) ().
For both the ARMA and GARCH models, we carry out the change-point test with an epidemic alternative where the parameter of the model is under H, and , under H with change-points at for sample size . The empirical levels and powers are displayed in Table 1. The AR example is related to the real data application, see subsection 4.2. The results in this table show that the empirical level approaches the nominal one when increases, and that the empirical power increases with and is overall close to one when . These findings are consistent with the asymptotic results of Theorems 3.1 and 3.2.
|ARMA(1,1) with zero mean||Empirical levels:||0.045||0.050|
|ARMA(1,1) with non zero mean||Empirical levels:||0.065||0.060|
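Trajectories with an epidemic change, as used in this simulation study, can be generated along the following lines. This is a minimal Python sketch under assumed parametrizations: an ARMA(1,1) written as X_t = c + a X_{t-1} + xi_t + b xi_{t-1} and a GARCH(1,1) written as X_t = sigma_t xi_t with sigma_t^2 = a0 + a1 X_{t-1}^2 + b1 sigma_{t-1}^2, both with standard Gaussian innovations. The function names, the burn-in length and the parameter ordering are hypothetical; the parameter `theta_epi` applies on the epidemic period k1 < t <= k2 and `theta_base` elsewhere.

```python
import math
import random


def simulate_arma11_epidemic(n, theta_base, theta_epi, k1, k2, burn_in=200, seed=0):
    """ARMA(1,1) path of length n; theta = (c, a, b) (assumed parametrization)."""
    rng = random.Random(seed)
    x_prev, xi_prev = 0.0, 0.0
    out = []
    for t in range(1 - burn_in, n + 1):
        c, a, b = theta_epi if k1 < t <= k2 else theta_base
        xi = rng.gauss(0.0, 1.0)
        x = c + a * x_prev + xi + b * xi_prev
        if t >= 1:  # discard the burn-in period
            out.append(x)
        x_prev, xi_prev = x, xi
    return out


def simulate_garch11_epidemic(n, theta_base, theta_epi, k1, k2, burn_in=200, seed=0):
    """GARCH(1,1) path of length n; theta = (a0, a1, b1) (assumed parametrization)."""
    rng = random.Random(seed)
    x_prev, h_prev = 0.0, theta_base[0]
    out = []
    for t in range(1 - burn_in, n + 1):
        a0, a1, b1 = theta_epi if k1 < t <= k2 else theta_base
        h = a0 + a1 * x_prev ** 2 + b1 * h_prev  # conditional variance
        x = math.sqrt(h) * rng.gauss(0.0, 1.0)
        if t >= 1:
            out.append(x)
        x_prev, h_prev = x, h
    return out
```

Setting `theta_epi` equal to `theta_base` gives a trajectory under the null hypothesis, so the same two functions can serve for both empirical levels and empirical powers.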
4.2 Real data example
We consider the daily concentrations of carbon monoxide in the Vitória metropolitan area. These daily levels are obtained from the State Environment and Water Resources Institute, where the data were collected at eight monitoring stations. There are available observations that represent the average concentrations from September 11, 2009 through December 09, 2010 (see Figure 2(a)). The data are part of a large dataset (available at https://rss.onlinelibrary.wiley.com/pb-assets/hub-assets/rss/Datasets/RSSC%2067.2/C1239deSouza-1531120585220.zip) which was analyzed by Souza et al. (2018) to quantify the association between respiratory disease and air pollution concentrations.
To test for the presence of an epidemic change in this series, we apply our detection procedure with the ARMA() model. We applied the test with several values of and ; the results after change-point detection show a preference (in the sense of AIC and BIC) for an AR(1). Figure 2(b) displays the values of for all . The critical value at the nominal level is and the resulting test statistic is ; which implies that the null hypothesis H is rejected (i.e., an epidemic change-point is detected). The estimated vector of break-points is ; i.e., the point where the peak in the graph is reached (see Figure 2(b)). The locations of the changes correspond to the dates January 31 and August 06, 2010. This corresponds to the period of weaker winds and the austral winter; these meteorological factors are known to increase the concentration of carbon monoxide. The estimated model on each regime is given by:
where in parentheses are the standard errors of the estimators. From (4.3), one can see that the parameter of the first regime is close to that of the third regime, which strengthens the hypothesis of the existence of an epidemic change-point.
5 Proofs of the main results
To simplify the expressions, in this section, we will use the conditional Gaussian quasi-log-likelihood up to multiplication by 1/2, given by and .
5.1 Proof of Theorem 3.1
Let , where and are the matrices defined in (3.2). Define the statistic
Consider the following lemma. Part (i) can be shown along similar lines as in the proof of Lemma 6.3 in Diop and Kengne (2021). Part (ii) is established in Bardet and Wintenberger (2009).
Suppose that the assumptions of Theorem 3.1 hold. Then,
is a stationary ergodic, square integrable martingale difference sequence with covariance matrix .
Let two integers , and . Applying the mean value theorem to , there exists between and such that