# Comments on the presence of serial correlation in the random coefficients of an autoregressive process

Through this note, we intend to show that the presence of serial correlation in the random coefficients of an autoregressive process, although likely in a chronological context, may lead to inappropriate conclusions. To this aim, we consider an RCAR(p) process and we establish that the standard estimation lacks consistency as soon as there exists a nonzero serial correlation in the coefficients. We give the correct asymptotic behavior of the statistic and some simulations come to strengthen our point.

## 1. Introduction and Motivations

The estimation of linear chronological models seems to be an essentially complete field, and research nowadays focuses rather on algorithmic procedures or model selection. By contrast, non-linear time series remain in the foreground. This note is devoted to the estimation issue in the random coefficients non-linear generalization of the autoregressive process of order $p$, which is usually defined as

$$X_t = \sum_{k=1}^{p} (\theta_k + b_{k,t})\, X_{t-k} + \varepsilon_t$$

where $(b_{k,t})$ acts as an uncorrelated randomness included in the coefficients, independent of $(\varepsilon_t)$. We refer the reader to the monograph of Nicholls and Quinn [13] for a rich introduction to the topic. The conditions of existence of a stationary solution to this equation have been widely studied since the seminal works of Anděl [1] or Nicholls and Quinn [12] in the 1980s, see also the more recent paper of Aue, Horváth and Steinebach [3]. The estimation of $\theta$ has been extensively developed to this day as well. Nicholls and Quinn [11] suggest using the OLS, which turns out to be the same whether or not there is randomness in the coefficients. Under the stationarity conditions, the OLS is known to be strongly consistent and asymptotically normal in both cases (but with different variances). Looking for a unified theory, that is, irrespective of stationarity issues, Aue and Horváth [2] and Berkes, Horváth and Ling [5] in the 2000s show that the QMLE is consistent and asymptotically normal as soon as there is some randomness in the coefficients, the variance of which makes it possible to circumvent the well-known unit root issues. Later in 2014, Hill and Peng [9] develop an empirical likelihood estimation that is asymptotically normal even when the coefficients are non-random. Let us also mention the WLS approach of Schick [15] and the M-estimation of Koul and Schick [10] adapted to the stationary framework, in the late 1990s, which succeed in getting asymptotic normality and variance optimality. Yet the OLS estimation, which is easy to compute and does not require numerical optimization or additional parametrization, is still very popular. We shall note that all these works, except Nicholls and Quinn [11, 12], are related to the first-order process. By contrast, studies on general multivariate RCAR processes do not seem widespread in the literature, to the best of our knowledge.

In this paper, we are interested in the implications of serial correlation in the random coefficients of a $p$-order RCAR process. Indeed, our main statement is that, in a chronological context, while serial correlation is assumed between consecutive values of the process, it is unlikely that the coefficients, from the moment they are considered as random, behave like white noises, which is however part of the fundamental hypotheses of the usual theory of the RCAR. From that point of view, we intend to show that the presence of serial correlation in the random coefficients may lead to inappropriate conclusions through the OLS. This work can be seen as a partial generalization of Proïa and Soltane [14] where in particular, for the first-order RCAR process, the lack of consistency is established together with the correct behavior of the statistic. To this aim, consider the RCAR process $(X_t)$ of order $p$ generated by the autoregression

$$X_t = \sum_{k=1}^{p} (\theta_k + \alpha_k\, \eta_{k,t-1} + \eta_{k,t})\, X_{t-k} + \varepsilon_t \tag{1.1}$$

where $(\varepsilon_t)$ and $(\eta_{k,t})$ are uncorrelated strong white noises with $q$th-order moments $\sigma_q$ and $\tau_{k,q}$ respectively. The main argument supporting this model is Prop. 3.2.1 of [7], which states that any stationary process with finite memory can be expressed by a moving average structure. In other words, we introduce a lag-one memory in the random coefficients as a simple case study meant to illustrate the effect on the estimation. In Section 2, we introduce the hypotheses that we must retain and we detail the asymptotic behavior of the standard OLS in the presence of serial correlation, under the stationarity conditions. In particular, we highlight a disturbing consequence when testing the significance of $\theta$. Some simulations support our point in Section 3, while Section 4 contains the proofs of our results. Finally, we postpone the purely computational steps to the Appendix, for the sake of readability.
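As an illustration of the recursion (1.1), the following sketch simulates a trajectory of the process. It is only a minimal example under assumptions that are not in the text: the noises are taken Gaussian, the function name `simulate_rcar` and its arguments are hypothetical, `tau` holds standard deviations (so that the second-order moment of $\eta_{k,t}$ is `tau[k]**2`), and a burn-in period is used to approximate the stationary regime.

```python
import numpy as np

def simulate_rcar(theta, alpha, tau, sigma, n, burn=500, seed=0):
    """Simulate n values of the recursion (1.1) with Gaussian noises (assumption).

    theta, alpha : length-p sequences (mean coefficients, lag-one weights)
    tau          : length-p sequence, standard deviations of eta_{k,t}
    sigma        : standard deviation of eps_t
    """
    theta = np.asarray(theta, dtype=float)
    alpha = np.asarray(alpha, dtype=float)
    tau = np.asarray(tau, dtype=float)
    p = theta.size
    rng = np.random.default_rng(seed)
    total = n + burn
    eps = sigma * rng.standard_normal(total)
    # eta[t] plays the role of eta_{., t-1} and eta[t + 1] of eta_{., t}
    eta = tau * rng.standard_normal((total + 1, p))
    x = np.zeros(total + p)  # p leading zeros serve as initial values
    for t in range(total):
        coef = theta + alpha * eta[t] + eta[t + 1]
        past = x[t:t + p][::-1]  # (X_{t-1}, ..., X_{t-p})
        x[t + p] = coef @ past + eps[t]
    return x[-n:]  # drop the initial values and the burn-in
```

For instance, `simulate_rcar([0.5, 0.0], [0.3, 0.0], [0.2, 0.0], 1.0, n=1000)` draws a second-order trajectory in which only the first coefficient is random and serially correlated.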

## 2. Influence of serial correlation in the coefficients

It will be convenient to write (1.1) in the vector form

$$\Phi_t = (C_\theta + N_{t-1} D_\alpha + N_t)\, \Phi_{t-1} + E_t \tag{2.1}$$

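where $\Phi_t = (X_t, \ldots, X_{t-p+1})^T$, $E_t = (\varepsilon_t, 0, \ldots, 0)^T$, $D_\alpha = \mathrm{diag}(\alpha_1, \ldots, \alpha_p)$, $N_t$ is a matrix with $\eta_{1,t}, \ldots, \eta_{p,t}$ in the first row and 0 elsewhere, and $C_\theta$ is the companion matrix of the underlying AR($p$) process. We already make the assumption that any odd moment of $(\varepsilon_t)$ and $(\eta_{k,t})$ is zero as soon as it exists, and that the distribution of the noises guarantees

$$\mathbb{E}\big[\ln \|C_\theta + N_0 D_\alpha + N_1\|\big] < 0 \quad \text{and} \quad \mathbb{E}\big[\ln^+ |\varepsilon_0|\big] < +\infty. \tag{2.2}$$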

These moment conditions are assumed to hold throughout the study. In addition, numerous configurations of the parameters are pathological cases. To keep things as clear as possible for the reader, they are gathered in a pathological set that will be updated whenever necessary during the reasoning. As for the moments of the process, the hypotheses that shall be retained are related to the space where the parameters of the model live.


• .

• .

We give in Appendix B some leads to express the moments of the process in terms of the parameters only. We will see that, while the first of these two parameter spaces is quite easy to describe, the second is frighteningly long to dissect and would deserve many more calculations than we can afford in this paper. First, we have the following causal representation showing in particular that the process is adapted to the filtration

$$\mathcal{F}_t = \sigma\big((\varepsilon_s, \eta_{1,s}, \ldots, \eta_{p,s}),\ s \leqslant t\big). \tag{2.3}$$

###### Proposition 2.1.

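For all $t \in \mathbb{Z}$,

$$\Phi_t = E_t + \sum_{k=1}^{\infty} E_{t-k} \prod_{\ell=0}^{k-1} (C_\theta + N_{t-\ell-1} D_\alpha + N_{t-\ell}) \quad \text{a.s.} \tag{2.4}$$

Consequently, the process $(X_t)$ is strictly stationary and ergodic.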

###### Proof.

See Section 4.1. ∎
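The causal representation (2.4) can be checked numerically on a simulated path. The sketch below is only indicative and rests on assumptions that are not in the text: Gaussian noises, hypothetical function names, `tau` given as standard deviations, and the column-vector convention in which each product of matrices acts on $E_{t-k}$ from the left (our reading of the display). It compares the truncated series with the recursion (2.1) started from zero.

```python
import numpy as np

def random_matrices(theta, alpha, tau, sigma, length, seed=0):
    """Draw A_s = C_theta + N_{s-1} D_alpha + N_s and E_s for s = 0, ..., length - 1."""
    theta = np.asarray(theta, dtype=float)
    alpha = np.asarray(alpha, dtype=float)
    tau = np.asarray(tau, dtype=float)
    p = theta.size
    rng = np.random.default_rng(seed)
    eta = tau * rng.standard_normal((length + 1, p))  # Gaussian noises (assumption)
    eps = sigma * rng.standard_normal(length)
    companion = np.zeros((p, p))
    companion[0] = theta
    companion[1:, :-1] = np.eye(p - 1)
    a = np.tile(companion, (length, 1, 1))
    a[:, 0, :] += alpha * eta[:-1] + eta[1:]  # randomness only enters the first row
    e = np.zeros((length, p))
    e[:, 0] = eps
    return a, e

def truncated_phi(a, e, n):
    """Truncation of (2.4) at order n, for the last time index t = len(e) - 1."""
    t = len(e) - 1
    phi = e[t].copy()
    prod = np.eye(e.shape[1])
    for k in range(1, n + 1):
        prod = prod @ a[t - k + 1]  # A_t A_{t-1} ... A_{t-k+1}
        phi += prod @ e[t - k]
    return phi

def recursion_phi(a, e, n):
    """Recursion (2.1) run over the last n + 1 time indices, started from zero."""
    t = len(e) - 1
    phi = np.zeros(e.shape[1])
    for s in range(t - n, t + 1):
        phi = a[s] @ phi + e[s]
    return phi
```

With a contracting configuration, increasing the truncation order beyond a few hundred terms should no longer change the value, which is the numerical counterpart of the almost sure convergence of the series.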

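Suppose now that $(X_{-p+1}, \ldots, X_n)$ is an available trajectory from which we deduce the standard OLS estimator of $\theta$, the mean value of the coefficients,

$$\hat{\theta}_n = S_{n-1}^{-1} \sum_{t=1}^{n} \Phi_{t-1} X_t \quad \text{where} \quad S_{n-1} = \sum_{t=1}^{n} \Phi_{t-1} \Phi_{t-1}^T. \tag{2.5}$$

To simplify, the initial vector $\Phi_0$ is assumed to follow the stationary distribution of the process. It is important to note that (2.5) is the OLS of $\theta$ with respect to

$$S_n(\theta) = \sum_{t=1}^{n} (X_t - \theta^T \Phi_{t-1})^2$$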

even when $\alpha \neq 0$, as is done in Sec. 3 of [11], because our interest is precisely to show that the usual estimation may lead to inappropriate conclusions in case of misspecification of the coefficients. It is also noteworthy that a nice consequence of the ergodicity for our reasoning is that, from the causal representation,
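The least-squares criterion above can be minimized in closed form through the normal equations. The following sketch is a minimal illustration with a hypothetical function name; it uses the first $p$ values of the supplied array as initial values, which is a convention of this example.

```python
import numpy as np

def ols_rcar(x, p):
    """Least-squares estimate of theta from a trajectory x, by the normal equations."""
    x = np.asarray(x, dtype=float)
    n = x.size - p
    # Row r of `lagged` holds (X_{t-1}, ..., X_{t-p}) for the response X_t = x[p + r]
    lagged = np.column_stack([x[p - k:p - k + n] for k in range(1, p + 1)])
    target = x[p:]
    gram = lagged.T @ lagged  # sum of Phi_{t-1} Phi_{t-1}^T
    return np.linalg.solve(gram, lagged.T @ target)
```

Solving the linear system with `np.linalg.solve` avoids forming the inverse of the Gram matrix explicitly, which is usually preferable numerically.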

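$$\frac{1}{n} \sum_{t=1}^{n} Y_t \ \overset{\text{a.s.}}{\longrightarrow}\ \mathbb{E}[Y] \quad \text{for any} \quad Y_t = \prod_{i=1}^{m} X_{t-d_i}\, \eta_{t-d'_i} \tag{2.6}$$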

provided that (). Hence, the isolated terms must also satisfy a.s. In all the study, the isolated terms will be referred to as “second-order” when they take the form of with . Thus, any second-order isolated term is such that a.s. when . The autocovariances of the stationary process are denoted by and, by ergodicity, the sample covariances are strongly consistent estimators, that is

$$\frac{1}{n} \sum_{t=1}^{n} X_t X_{t-i} \ \overset{\text{a.s.}}{\longrightarrow}\ \ell_i. \tag{2.7}$$

Based on the autocovariances, we build

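$$\Lambda_0 = \begin{pmatrix} \ell_0 & \ell_1 & \cdots & \ell_{p-1} \\ \ell_1 & \ell_0 & \cdots & \ell_{p-2} \\ \vdots & \vdots & & \vdots \\ \ell_{p-1} & \ell_{p-2} & \cdots & \ell_0 \end{pmatrix} \quad \text{and} \quad L_1 = \begin{pmatrix} \ell_1 \\ \ell_2 \\ \vdots \\ \ell_p \end{pmatrix}. \tag{2.8}$$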

The matrix $\Lambda_0$ is clearly positive semi-definite, but the case where it would be non-invertible is now part of the pathological set. This is the multivariate extension of the condition in [14]; it can also be compared to assumption (v) in [11]. The asymptotic behavior of (2.5) is now going to be studied in terms of convergence, normality and rate.
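As a small numerical companion to (2.8), the sketch below builds $\Lambda_0$ and $L_1$ from a vector of autocovariances and solves the linear system $\Lambda_0\, x = L_1$. The function name is hypothetical and the autocovariances are taken as given.

```python
import numpy as np

def solve_lambda0_l1(autocov):
    """Return Lambda_0^{-1} L_1 from the autocovariances (l_0, l_1, ..., l_p)."""
    ell = np.asarray(autocov, dtype=float)
    p = ell.size - 1
    # Entry (i, j) of Lambda_0 is l_{|i - j|}: a symmetric Toeplitz matrix
    idx = np.abs(np.subtract.outer(np.arange(p), np.arange(p)))
    lambda0 = ell[idx]
    l1 = ell[1:]
    return np.linalg.solve(lambda0, l1)
```

For a standard AR(2) process with fixed coefficients $(0.5, 0.2)$, the autocorrelations are $(1, 0.625, 0.5125)$ and the function returns the coefficients themselves, which is the usual Yule-Walker identity.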

###### Theorem 2.1.

Assume that . Then, we have the almost sure convergence

$$\hat{\theta}_n \ \overset{\text{a.s.}}{\longrightarrow}\ \theta^* = \Lambda_0^{-1} L_1.$$

###### Proof.

The proof is immediate from the ergodic theorem, provided that . ∎

We will see in Remark B.1 of Appendix B.1 that it is possible to give an expression of $\theta^*$ which, although not explicit, only depends on the parameters of the model. In particular, this expression clearly confirms that $\theta^* = \theta$ as soon as $\alpha = 0$, as already established in Thm. 4.1 of [11]. However, it is possible to show that, except for $p = 1$ (see [14]), we generally do not have $\theta^*_k = 0$ when $\theta_k = 0$, and we will use this disturbing fact in the short example of Section 3. Note that fourth-order and second-order moments are involved in the hypothesis of the theorem, to reach the almost sure convergence.

###### Theorem 2.2.

Assume that . Then, there exists a limit matrix such that we have the asymptotic normality

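$$\sqrt{n}\, (\hat{\theta}_n - \theta^*) \ \overset{\mathcal{D}}{\longrightarrow}\ \mathcal{N}\big(0,\ \Lambda_0^{-1} L\, \Lambda_0^{-1}\big).$$
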
###### Proof.

See Section 4.2. ∎

The proof of the theorem and Appendix B.2 highlight the need to retain eighth-order moments for and fourth-order moments for , to build .

###### Theorem 2.3.

Assume that . Then, we have the rate of convergence

$$\limsup_{n \to +\infty}\ \frac{n}{2 \ln \ln n}\, \|\hat{\theta}_n - \theta^*\|^2 < +\infty \quad \text{a.s.}$$

###### Proof.

See Section 4.3. ∎

In particular, we can see that despite the correlation in the noise of the coefficients, the hypotheses are sufficient to ensure that the estimator reaches the usual rate of convergence in stable autoregressions, i.e. $\|\hat{\theta}_n - \theta^*\| = O\big(\sqrt{\ln \ln n / n}\big)$ a.s.

## 3. Illustration and Perspectives

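To conclude this short note, let us illustrate a disturbing consequence of the presence of correlation in the coefficients when testing the significance of $\theta$. For the sake of simplicity, take $p = 2$ and consider the test of $\mathcal{H}_0 : \theta_2 = 0$ against $\mathcal{H}_1 : \theta_2 \neq 0$ in an autoregressive setting. Thanks to Remark B.1, it is possible to show that, under $\mathcal{H}_0$,

$$\theta^*_1 = \frac{\theta_1 \big(\theta_1^2 + \beta_2 - (1 - \beta_1)^2 - \beta_1 (\beta_1 + \beta_2 - 1)\big)}{(\theta_1 - 2\beta_1 - \beta_2 + 1)(\theta_1 + 2\beta_1 + \beta_2 - 1)}$$

and

$$\theta^*_2 = \frac{\theta_1^2\, (2\beta_1 + \beta_2) - \beta_1 \big((1 - \beta_2)^2 + 4\beta_1 (\beta_1 + \beta_2 - 1)\big)}{(\theta_1 - 2\beta_1 - \beta_2 + 1)(\theta_1 + 2\beta_1 + \beta_2 - 1)}$$

where $\beta_1 = \alpha_1 \tau_{1,2}$ and $\beta_2 = \alpha_2 \tau_{2,2}$. As a consequence, $\sqrt{n}\, |\hat{\theta}_{2,n}|$ is generally almost surely divergent and, even if $\theta_2 = 0$, we may detect a non-zero-mean coefficient where there is in fact a zero-mean autocorrelated one. Worse, suppose also that the coefficient of $X_{t-2}$ carries no randomness, so that there is no direct influence of $X_{t-2}$ on $X_t$. Then, we still generally have $\theta^*_2 \neq 0$. In other words, a correlation in the random coefficient associated with $X_{t-1}$, i.e. $\alpha_1 \neq 0$, generates a spurious detection of a direct influence of $X_{t-2}$ on $X_t$. This phenomenon can be observed on some simple but representative examples. To test $\mathcal{H}_0$, we infer two statistics from Thm. 4.1 of [11] and the procedure of estimation given by the authors.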

The first one takes into account random coefficients (which means that and are estimated to get ) and the second one is built assuming fixed coefficients ( and are not estimated but set to 0). Note that we make sure that when we vary the settings. Unsurprisingly, is left behind since it does not model the random effect owing to and the corrected statistic behaves as expected when .
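The spurious limit discussed in this section can be explored numerically for $p = 2$. The sketch below is only an illustration with hypothetical function names: it takes limits in the decompositions of Lemmas A.3 and A.4 to obtain the first two autocorrelations when the mean of the second coefficient is zero (this step is our reading of the lemmas, not a formula quoted from the text), then solves the $2 \times 2$ system $\Lambda_0\, \theta^* = L_1$. The closed form for the first component is the one displayed above.

```python
import numpy as np

def theta_star_p2(theta1, beta1, beta2):
    """Limit of the OLS for p = 2 when the mean of the second coefficient is zero."""
    d = 1.0 - 2.0 * beta1 - beta2
    rho1 = theta1 / d             # limit of Lemma A.4, normalised by l_0
    rho2 = theta1 * rho1 + beta1  # limit of Lemma A.3 with i = 2, normalised by l_0
    lambda0 = np.array([[1.0, rho1], [rho1, 1.0]])
    return np.linalg.solve(lambda0, np.array([rho1, rho2]))

def theta1_star_closed_form(theta1, beta1, beta2):
    """Closed-form expression of the first component, as displayed in this section."""
    num = theta1 * (theta1**2 + beta2 - (1 - beta1)**2 - beta1 * (beta1 + beta2 - 1))
    den = (theta1 - 2 * beta1 - beta2 + 1) * (theta1 + 2 * beta1 + beta2 - 1)
    return num / den
```

Without serial correlation (`beta1 = beta2 = 0`) the function returns `(theta1, 0)`, whereas a non-zero `beta1` alone is enough to make the second component non-zero.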

However, as is visible in Figure 1 and confirmed by Figure 2, when $\alpha_1 \neq 0$ both tests reject $\mathcal{H}_0$ with a rate growing well above the 5% threshold that we decided to retain in this experiment even if, for the same reason as before, the statistic accounting for random coefficients appears slightly more robust. This is indeed what theory predicts, as we tried to show throughout the paper. Theorem 2.2 also confirms that, once correctly recentered, both statistics must remain asymptotically normal. This example clearly raises the question of whether we can build a consistent estimate for $\theta$ of course, but also for the covariance in the coefficients. In the previous work [14], this is done (see Sec. 4), mainly due to the fact that calculations are feasible when $p = 1$, and a test for serial correlation followed. We can see through this note that this is not as easy in the general case and that this must be a direction for future studies. Besides, we only considered the OLS but it would be worth working either with a QMLE or with a two-stage procedure to be able to exhibit a reliable estimate despite a possible serial correlation. On the whole, testing for serial correlation should appear as a logical consequence of the tests for randomness in the coefficients that already exist in the literature, especially when the main hypothesis is the existence of temporal correlations in the phenomenon being modeled.

Acknowledgements. The authors thank the research program PANORisk of the Région Pays de la Loire in which this article participates.

## 4. Technical proofs

### 4.1. Proof of Proposition 2.1

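From our hypotheses on the noises $(\varepsilon_t)$ and $(\eta_{k,t})$, the matrix-valued process $(C_\theta + N_{t-1} D_\alpha + N_t)$ is strictly stationary and ergodic. Thus, it follows from (2.2) that we can find $\delta < 0$ and a random $k_0$ such that, as soon as $k \geqslant k_0$,

$$\frac{1}{k} \sum_{\ell=0}^{k-1} \ln \|C_\theta + N_{t-\ell-1} D_\alpha + N_{t-\ell}\| < \delta \quad \text{a.s.}$$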

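See also Lem. 1.1 of [6] for a similar reasoning. For $n \geqslant 1$, consider the truncation

$$\Phi_{t,n} = E_t + \sum_{k=1}^{n} E_{t-k} \prod_{\ell=0}^{k-1} (C_\theta + N_{t-\ell-1} D_\alpha + N_{t-\ell}).$$

Then, by the triangle inequality, for $n$ large enough we have

$$\begin{aligned} \|\Phi_{t,n}\| &\leqslant \|E_t\| + \sum_{k=1}^{n} \|E_{t-k}\| \prod_{\ell=0}^{k-1} \|C_\theta + N_{t-\ell-1} D_\alpha + N_{t-\ell}\| \\ &\leqslant |\varepsilon_t| + \sum_{k=1}^{k_0-1} |\varepsilon_{t-k}| \prod_{\ell=0}^{k-1} \|C_\theta + N_{t-\ell-1} D_\alpha + N_{t-\ell}\| + \sum_{k=k_0}^{n} |\varepsilon_{t-k}|\, \mathrm{e}^{\delta k} \end{aligned}$$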

and Lem. 2.2 of [4] ensures the a.s. convergence of the last term under (2.2). Thus,

$$\limsup_{n \to +\infty}\ \|\Phi_{t,n}\| < +\infty \quad \text{a.s.}$$

so that (2.4) is finite with probability 1. Moreover, it is easy to check that this is a solution to the recurrence (2.1). Finally, the strict stationarity and ergodicity of $(\Phi_t)$ may be obtained following the same reasoning as in [11]. Indeed, one can see from (2.4) that $\Phi_t$ is a fixed measurable function, independent of $t$, of the sequence $((E_s, N_s),\ s \leqslant t)$, so that the strict stationarity and ergodicity of the noises are passed to $(\Phi_t)$. That also implies the $\mathcal{F}_t$-measurability of the process. ∎

### 4.2. Proof of Theorem 2.2

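Let $(\mathcal{F}^*_n)$ be the filtration generated by $\mathcal{F}^*_0 = \sigma(\Phi_0, \eta_{1,0}, \ldots, \eta_{p,0})$ and, for $n \geqslant 1$, by

$$\mathcal{F}^*_n = \sigma\big(\Phi_0, \eta_{1,0}, \ldots, \eta_{p,0}, (\varepsilon_1, \eta_{1,1}, \ldots, \eta_{p,1}), \ldots, (\varepsilon_n, \eta_{1,n}, \ldots, \eta_{p,n})\big). \tag{4.1}$$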

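In the sequel, to avoid a huge amount of different notations, $M_n$ and $R_n$ will be generic terms, not necessarily identical from one line to another, designating vector $\mathcal{F}^*_n$-martingales (see e.g. [8]) and isolated terms, respectively. We make use of Appendix A for some computational results and we start with two fundamental propositions showing that the recentered empirical covariances are $\mathcal{F}^*_n$-martingales, except for residual terms. To this aim, we need to build a matrix $K$ for which we detail the steps below. First, consider

$$M_\theta = \begin{pmatrix} \theta_1 & 0 & \cdots & \cdots & 0 \\ \theta_2 & \ddots & \ddots & & \vdots \\ \vdots & \ddots & \ddots & \ddots & \vdots \\ \vdots & & \ddots & \ddots & 0 \\ \theta_p & \cdots & \cdots & \theta_2 & \theta_1 \end{pmatrix} + \begin{pmatrix} 0 & \theta_2 & \theta_3 & \cdots & \theta_p \\ 0 & \theta_3 & \cdots & \theta_p & 0 \\ \vdots & \vdots & & & \vdots \\ 0 & \theta_p & 0 & \cdots & 0 \\ 0 & 0 & \cdots & \cdots & 0 \end{pmatrix}$$

and combine

$$M_{\alpha,\beta} = \begin{pmatrix} \frac{1}{1 - 2\beta_1} & 0 & \cdots & \cdots & 0 \\ 0 & 1 & \ddots & & \vdots \\ \vdots & \ddots & \ddots & \ddots & \vdots \\ \vdots & & \ddots & \ddots & 0 \\ 0 & \cdots & \cdots & 0 & 1 \end{pmatrix} \left[ M_\theta + \begin{pmatrix} 0 & \beta_2 & 0 & \cdots & 0 \\ \beta_1 & 0 & 0 & & \vdots \\ 0 & \ddots & \ddots & \ddots & \vdots \\ \vdots & \ddots & \ddots & \ddots & 0 \\ 0 & \cdots & 0 & \beta_1 & 0 \end{pmatrix} \right]$$

where $\beta_1 = \alpha_1 \tau_{1,2}$ and $\beta_2 = \sum_{k=2}^{p} \alpha_k \tau_{k,2}$. Then, set $U_0$ as the first column of $M_{\alpha,\beta}$ and set $K$ as the remaining part of $M_{\alpha,\beta}$ to which we add a zero vector on the right. Using the notations of the reasoning below, the pathological set is enlarged with the situations in which the inverses involved are not defined, such as $2\beta_1 = 1$ or $I_p - K$ non-invertible.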
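The construction of $U_0$ and $K$ can be sketched in a few lines. This is only an illustration under assumptions: the function names are hypothetical, the entries follow our reading of the two displays above (row $i$ is associated with lag $i$ and column $j$ with lag $j - 1$), and the last function uses the relation $L_1 = \ell_0\, (I_p - K)^{-1} U_0$ obtained by taking limits in Proposition 4.1 below.

```python
import numpy as np

def build_u0_and_k(theta, beta1, beta2):
    """Build U_0 (first column) and K (remaining columns plus a zero column)."""
    theta = np.asarray(theta, dtype=float)
    p = theta.size
    m_theta = np.zeros((p, p))
    for i in range(1, p + 1):      # row i corresponds to lag i
        for j in range(p):         # column index j corresponds to lag j
            if i - j >= 1:
                m_theta[i - 1, j] += theta[i - j - 1]   # lower-triangular Toeplitz part
            if j >= 1 and i + j <= p:
                m_theta[i - 1, j] += theta[i + j - 1]   # Hankel-type part
    b = np.zeros((p, p))
    if p >= 2:
        b[0, 1] = beta2
        for i in range(1, p):
            b[i, i - 1] = beta1
    d = np.eye(p)
    d[0, 0] = 1.0 / (1.0 - 2.0 * beta1)
    m = d @ (m_theta + b)
    u0 = m[:, 0]
    k = np.hstack([m[:, 1:], np.zeros((p, 1))])
    return u0, k

def autocorrelations(theta, beta1, beta2):
    """Return (l_1 / l_0, ..., l_p / l_0) as the solution of (I_p - K) x = U_0."""
    u0, k = build_u0_and_k(theta, beta1, beta2)
    return np.linalg.solve(np.eye(u0.size) - k, u0)
```

When `beta1 = beta2 = 0`, the output reduces to the usual Yule-Walker autocorrelations of a fixed-coefficient autoregression, which provides a simple sanity check of the construction.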

###### Proposition 4.1.

Assume that . Then, we have the decomposition

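$$\sum_{t=1}^{n} \Phi_{t-1} X_t - n L_1 = (I_p - K)^{-1} U_0 \left( \sum_{t=1}^{n} X_t^2 - n \ell_0 \right) + M_n + R_n$$

where $M_n$ is a vector $\mathcal{F}^*_n$-martingale and the remainder satisfies $R_n = o(\sqrt{n})$ a.s.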

###### Proof.

The proposition is established through Lemmas A.3 and A.4. Indeed, these statements show that, outside the pathological cases, there is a decomposition of the form

$$\frac{1}{n} \sum_{t=1}^{n} \Phi_{t-1} X_t = (I_p - K)^{-1} U_0\, \frac{1}{n} \sum_{t=1}^{n} X_t^2 + \frac{M_n}{n} + \frac{R_n}{n}.$$

By ergodicity, taking the limit on both sides gives $L_1 = (I_p - K)^{-1} U_0\, \ell_0$. The remainder is made of second-order isolated terms so, by the remark that follows (2.6) and under our moment hypotheses, we must have $R_n = o(\sqrt{n})$ a.s. ∎

###### Proposition 4.2.

Assume that . Then, we have the decomposition

$$\sum_{t=1}^{n} X_t^2 - n \ell_0 = M_n + R_n$$

where $M_n$ is a scalar $\mathcal{F}^*_n$-martingale and the remainder satisfies $R_n = o(\sqrt{n})$ a.s.

###### Proof.

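This is a consequence of Lemma A.7. Outside the pathological cases, it will follow from this lemma, picking up the same notations, that

$$\sum_{t=1}^{n} X_t^2 - n \ell_0 = \frac{U^T}{1 - s_0} \left( \sum_{t=1}^{n} \Phi_{t-1} X_t - n L_1 \right) + M_n + R_n.$$

Together with Proposition 4.1, we obtain

$$\left( \sum_{t=1}^{n} X_t^2 - n \ell_0 \right) \left[ 1 - \frac{U^T (I_p - K)^{-1} U_0}{1 - s_0} \right] = \frac{U^T M_n + R_n}{1 - s_0}$$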

which concludes the proof since, likewise, the remainder is a linear combination of second-order isolated terms. ∎

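We now come back to the proof of Theorem 2.2. The keystone of the reasoning consists in noting that there is a matrix $A$ such that

$$\sum_{t=1}^{n} \Phi_{t-1} X_t - S_{n-1} \theta^* = A \sum_{t=1}^{n} \Phi_{t-1} X_t - \theta^* \sum_{t=1}^{n} X_t^2 + R_n.$$

Dividing by $n$ and taking the limit on both sides gives $A L_1 = \ell_0\, \theta^*$, since $L_1 - \Lambda_0 \theta^* = 0$. Thus,

$$\sum_{t=1}^{n} \Phi_{t-1} X_t - S_{n-1} \theta^* = A \left( \sum_{t=1}^{n} \Phi_{t-1} X_t - n L_1 \right) - \theta^* \left( \sum_{t=1}^{n} X_t^2 - n \ell_0 \right) + R_n.$$

The combination of Propositions 4.1 and 4.2 makes it possible to obtain the decomposition

$$\sum_{t=1}^{n} \Phi_{t-1} X_t - S_{n-1} \theta^* = M_n + R_n \tag{4.2}$$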

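where, as in the previous proofs, $M_n$ is a vector $\mathcal{F}^*_n$-martingale and the remainder satisfies $R_n = o(\sqrt{n})$ a.s. Let us call $m_n$ a generic element of $M_n$. As can be seen from the details of Appendix A, it always takes the form of

$$m_n = \sum_{t=1}^{n} X_{t-d_1}^{a_1} X_{t-d_2}^{a_2}\, \eta_{k,t-d_3}^{a_3}\, \eta_{\ell,t-d_4}^{a_4}\, \nu_t \tag{4.3}$$

where the exponents $a_i$ and the lags $d_i$ are fixed integers, and the zero-mean random variable $\nu_t$ is identically distributed and independent of $\mathcal{F}^*_{t-1}$. For the case in hand, the predictable quadratic variation would be

$$\langle m \rangle_n = v_\nu \sum_{t=1}^{n} X_{t-d_1}^{2a_1} X_{t-d_2}^{2a_2}\, \eta_{k,t-d_3}^{2a_3}\, \eta_{\ell,t-d_4}^{2a_4} \tag{4.4}$$

where $v_\nu = \mathbb{E}[\nu_1^2]$, provided that the moments involved are finite. The ergodicity arguments (2.6) together with our moment hypotheses show that $\langle m \rangle_n / n$ has an almost sure limit. Generalizing to $M_n$, we can explicitly build a limit matrix $L$ such that

$$\frac{\langle M \rangle_n}{n} \ \overset{\text{a.s.}}{\longrightarrow}\ L. \tag{4.5}$$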

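From the causal expression of Proposition 2.1 and the hypothesis on $\nu_t$, the increments of $(m_n)$ are also strictly stationary and ergodic. Thus, it follows that for any $M > 0$,

$$\frac{1}{n} \sum_{t=1}^{n} \mathbb{E}\big[ (\Delta m_t)^2\, \mathbb{1}_{\{|\Delta m_t| \geqslant M\}} \,\big|\, \mathcal{F}^*_{t-1} \big] \ \overset{\text{a.s.}}{\longrightarrow}\ \mathbb{E}\big[ (\Delta m_1)^2\, \mathbb{1}_{\{|\Delta m_1| \geqslant M\}} \big].$$

Since $\mathbb{E}[(\Delta m_1)^2] < +\infty$, the right-hand side can be made arbitrarily small by taking $M$ large enough. Once again generalizing to $M_n$, we obtain via the same arguments that the Lindeberg condition

$$\frac{1}{n} \sum_{t=1}^{n} \mathbb{E}\big[ \|\Delta M_t\|^2\, \mathbb{1}_{\{\|\Delta M_t\| \geqslant \varepsilon \sqrt{n}\}} \,\big|\, \mathcal{F}^*_{t-1} \big] \ \overset{\mathbb{P}}{\longrightarrow}\ 0 \tag{4.6}$$

is satisfied for any $\varepsilon > 0$. From (4.5) and (4.6), we are now ready to apply the central limit theorem for vector martingales, given e.g. by Cor. 2.1.10 of [8], and get

$$\frac{M_n}{\sqrt{n}} \ \overset{\mathcal{D}}{\longrightarrow}\ \mathcal{N}(0, L). \tag{4.7}$$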

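The combination of (4.2), the remark that follows and (4.7) leads to

$$\frac{1}{\sqrt{n}} \left( \sum_{t=1}^{n} \Phi_{t-1} X_t - S_{n-1} \theta^* \right) \ \overset{\mathcal{D}}{\longrightarrow}\ \mathcal{N}(0, L).$$

Finally, the a.s. convergence of $S_{n-1}/n$ to $\Lambda_0$ and Slutsky's lemma conclude the proof. ∎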

### 4.3. Proof of Theorem 2.3

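Let us come back to the $\mathcal{F}^*_n$-martingale given in (4.3) treated as a generic component of $M_n$. By the Hartman-Wintner law of the iterated logarithm for martingales expounded e.g. in [16], and because our hypothesis on $\nu_t$ guarantees that $(m_n)$ has strictly stationary and ergodic increments,

$$\limsup_{n \to +\infty}\ \frac{m_n}{\sqrt{2 n \ln \ln n}} = v_m \quad \text{and} \quad \liminf_{n \to +\infty}\ \frac{m_n}{\sqrt{2 n \ln \ln n}} = -v_m \quad \text{a.s.}$$

where $v_m^2 = \mathbb{E}[(\Delta m_1)^2]$, provided that this variance is finite. The inferior limit was reached by replacing $m_n$ by $-m_n$, which share the same variance and martingale properties. Exploiting the latter bounds and generalizing to $M_n$, we can deduce that

$$\limsup_{n \to +\infty}\ \frac{\|M_n\|}{\sqrt{2 n \ln \ln n}} < +\infty \quad \text{a.s.} \tag{4.8}$$

Because decomposition (4.2) implies $\hat{\theta}_n - \theta^* = S_{n-1}^{-1} (M_n + R_n)$, it is now easy to conclude the proof via (4.8). Indeed, we recall that $n\, S_{n-1}^{-1}$ is convergent and that $R_n = o(\sqrt{n})$, obviously implying that it is also $O(\sqrt{n \ln \ln n})$ a.s. ∎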

## Appendix A Some martingales decompositions

For easier reading, let us gather in this appendix the proofs only based on calculations. Like in Section 4, $M_n$, $\delta_t$ and $R_n$ will be generic terms, not necessarily identical from one line to another, designating $\mathcal{F}^*_n$-martingales, differences of $\mathcal{F}^*_n$-martingales and isolated terms, respectively, where $\mathcal{F}^*_n$ is the filtration defined in (4.1). To save space, we deliberately skip the proofs for most of them because they only consist of calculations. We assume in all the Appendix that .

###### Lemma A.1.

For all , we have the following equivalences.


• For and , .

• For and , .

• .

• For , .

• .

###### Lemma A.2.

We have the decomposition

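$$\sum_{t=1}^{n} X_t^2\, \eta_{1,t} = \frac{2 \tau_{1,2}}{1 - 2\beta_1} \left[ \sum_{k=1}^{p} \theta_k \sum_{t=1}^{n} X_t X_{t-k+1} + \beta_2 \sum_{t=1}^{n} X_t X_{t-1} \right] + M_n + R_n$$

where $\beta_1 = \alpha_1 \tau_{1,2}$ and $\beta_2 = \sum_{k=2}^{p} \alpha_k \tau_{k,2}$.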

###### Proof.

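Develop $X_t^2\, \eta_{1,t}$ using (1.1) and then exploit the relations of Lemma A.1 to get that, for all $t$ and after simplifications,

$$X_t^2\, \eta_{1,t} = 2 \tau_{1,2} \left( \sum_{k=1}^{p} \theta_k X_{t-1} X_{t-k} + \sum_{k=2}^{p} \alpha_k \tau_{k,2}\, X_{t-k} X_{t-k-1} + \alpha_1 X_{t-1}^2\, \eta_{1,t-1} \right) + \delta_t.$$

It remains to sum over $t$ and to gather all equivalent terms. Note that . ∎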

###### Lemma A.3.

Let . Then, we have the decomposition

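$$\sum_{t=1}^{n} X_t X_{t-i} = \sum_{k=1}^{p} \theta_k \sum_{t=1}^{n} X_t X_{t-|i-k|} + \beta_1 \sum_{t=1}^{n} X_t X_{t-|i-2|} + M_n + R_n$$

where $\beta_1 = \alpha_1 \tau_{1,2}$.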

###### Proof.

Similarly, develop $X_t X_{t-i}$ using (1.1) and use Lemma A.1 to get that, for all $t$,

$$X_t X_{t-i} = \sum_{k=1}^{p} \theta_k X_{t-i} X_{t-k} + \alpha_1 \tau_{1,2}\, X_{t-2} X_{t-i} + \delta_t.$$

Finally, sum over $t$. ∎

###### Lemma A.4.

We have the decomposition

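$$\sum_{t=1}^{n} X_t X_{t-1} = \frac{1}{1 - 2\beta_1} \left[ \sum_{k=1}^{p} \theta_k \sum_{t=1}^{n} X_t X_{t-k+1} + \beta_2 \sum_{t=1}^{n} X_t X_{t-1} \right] + M_n + R_n$$

where $\beta_1 = \alpha_1 \tau_{1,2}$ and $\beta_2 = \sum_{k=2}^{p} \alpha_k \tau_{k,2}$.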

###### Proof.

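Once again developing $X_t X_{t-1}$, it follows from Lemma A.1 that, for all $t$,

$$X_t X_{t-1} = \sum_{k=1}^{p} \theta_k X_{t-1} X_{t-k} + \alpha_1 X_{t-1}^2\, \eta_{1,t-1} + \sum_{k=2}^{p} \alpha_k \tau_{k,2}\, X_{t-k} X_{t-k-1} + \delta_t.$$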

Then, we use Lemma A.2 to get the summation. Note that . ∎

###### Lemma A.5.

For all , we have the following equivalences.


• For , .

• For , .

• .

• For , .