# Model-Free Tests for Series Correlation in Multivariate Linear Regression

Testing for series correlation among error terms is a basic problem in linear regression model diagnostics. The famous Durbin-Watson test and Durbin's h-test rely on certain model assumptions about the response and regressor variables. The present paper proposes simple tests for series correlation that are applicable in both fixed and random design linear regression models. The test statistics are based on the regression residuals and design matrix. The test procedures are robust under different distributions of random errors. The asymptotic distributions of the proposed statistics are derived via a newly established joint central limit theorem for several general quadratic forms and the delta method. Good performance of the proposed tests is demonstrated by simulation results.


## 1. Introduction

Linear regression is an important topic in statistics and is useful in almost all areas of data science, especially in business, economics and biostatistics. Consider the following multivariate linear regression model

$$Y = X'\beta + \varepsilon, \tag{1.1}$$

where $Y$ is the response variable, $X$ is a $p$-dimensional vector of regressors, $\beta$ is a $p$-dimensional regression coefficient vector and $\varepsilon$ is a random error with zero mean. Suppose we obtain $n$ samples $(x_i, y_i)$, $i = 1, \ldots, n$, from this model with design matrix $X_n = (x_1, \ldots, x_n)'$, where for $i = 1, \ldots, n$, $y_i = x_i'\beta + \varepsilon_i$. The first task in a regression problem is to make statistical inference about the regression coefficient vector. By applying the ordinary least squares (OLS) method, we obtain the estimate

$$\hat{\beta} = (X_n' X_n)^{-1} X_n' y$$

for the coefficient vector $\beta$, where $y = (y_1, \ldots, y_n)'$. In most applications of linear regression models, we need the assumption that the random errors are uncorrelated and homoscedastic. That is to say, we assume

$$\operatorname{Cov}(\varepsilon_i, \varepsilon_j) = \begin{cases} \sigma^2 & \text{for } i = j, \\ 0 & \text{for } i \neq j, \end{cases}$$

where $\sigma^2$ is unknown. Under this assumption, the Gauss-Markov theorem states that the ordinary least squares estimate (OLSE) $\hat{\beta}$ is the best linear unbiased estimator (BLUE). When this assumption does not hold, we suffer a loss of efficiency and, even worse, may draw wrong inferences in using OLS. For example, positive serial correlation in the regression error terms will typically lead to artificially small standard errors for the regression coefficients when we apply the classic linear regression method, which will cause the estimated t-statistics to be inflated, indicating significance even when there is in fact none. Therefore, tests for heteroscedasticity and series correlation are important when applying linear regression.
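This inflation is easy to see in a small simulation sketch (not from the paper; all sizes and coefficients are illustrative): with a positively autocorrelated regressor and positively autocorrelated AR(1) errors, the classic homoscedastic OLS standard error understates the true sampling variability of the slope estimate.

```python
import numpy as np

# Illustrative sketch (not from the paper): with an autocorrelated regressor
# and positively autocorrelated errors, the classic OLS standard error
# understates the true sampling variability of the slope estimate.
rng = np.random.default_rng(0)
n, reps, rho = 200, 500, 0.8

# One fixed, positively autocorrelated regressor
x = np.empty(n)
x[0] = rng.standard_normal()
for t in range(1, n):
    x[t] = 0.8 * x[t - 1] + rng.standard_normal()
X = np.column_stack([np.ones(n), x])
XtX_inv = np.linalg.inv(X.T @ X)

slopes, reported_se = [], []
for _ in range(reps):
    # AR(1) errors: eps_t = rho * eps_{t-1} + innovation_t
    innov = rng.standard_normal(n)
    eps = np.empty(n)
    eps[0] = innov[0]
    for t in range(1, n):
        eps[t] = rho * eps[t - 1] + innov[t]
    y = 1.0 + 2.0 * x + eps
    beta = XtX_inv @ X.T @ y
    resid = y - X @ beta
    s2 = resid @ resid / (n - 2)            # classic homoscedastic estimate
    slopes.append(beta[1])
    reported_se.append(np.sqrt(s2 * XtX_inv[1, 1]))

true_sd, avg_se = np.std(slopes), np.mean(reported_se)
print(true_sd > avg_se)   # the reported SE is too small
```

With uncorrelated regressors the bias largely cancels; it is the combination of autocorrelated regressors and autocorrelated errors that makes the classic standard error misleading.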

For detecting heteroscedasticity, in one of the most cited papers in econometrics, White [White(1980)] proposed a test based on comparing the Huber-White covariance estimator to the usual covariance estimator under homoscedasticity. Many other researchers have considered this problem, for example, Breusch and Pagan [Breusch and Pagan(1979)], Dette and Munk [Dette and Munk(1998)], Glejser [Glejser(1969)], Harrison and McCabe [Harrison and McCabe(1979)], Cook and Weisberg [Cook and Weisberg(1983)], and Azzalini and Bowman [Azzalini and Bowman(1993)]. Recently, Li and Yao [Li and Yao(2015)] and Bai, Pan and Yin [Bai et al.(2018)Bai, Pan, and Yin] proposed tests for heteroscedasticity that are valid in both low- and high-dimensional regressions. Their tests were shown by simulations to perform better than some classic tests.

The most famous test for series correlation, the Durbin-Watson test, was proposed in [Durbin and Watson(1950), Durbin and Watson(1951), Durbin and Watson(1971)]. The Durbin-Watson test statistic is based on the residuals from linear regression. The researchers considered the statistic

$$d = \frac{\sum_{i=2}^{n}(e_i - e_{i-1})^2}{\sum_{i=1}^{n} e_i^2},$$

whose small-sample distribution was derived by John von Neumann. In the original papers, Durbin and Watson investigated the distribution of this statistic under the classic independent framework, described the test procedures and provided tables of the bounds of significance. However, the asymptotic results were derived under the normality assumption on the error term, and as noted by Nerlove and Wallis [Nerlove and Wallis(1966)], although the Durbin-Watson test appeared to work well in an independent observations framework, it may be asymptotically biased and lead to inadequate conclusions for linear regression models containing lagged dependent random variables. New alternative test procedures, for instance, Durbin's h-test and t-test [Durbin(1970)], were proposed to address this problem; see also Inder [Inder(1986)], King and Wu [King and Wu(1991)], Stocker [Stocker(2007)], Bercu and Proïa [Bercu and Proia(2013)], Gençay and Signori [Gençay and Signori(2015)], Li and Gençay [Li and Gençay(2017)] and references therein. However, all these tests were proposed under some model assumptions on the regressors and/or the response variable. Moreover, Durbin's h-test requires a Gaussian distribution of the error term. Thus, some common models are excluded. In fact, since it is difficult to assess whether the regressors and/or the response are lag dependent, tests that are model-free with respect to the regressors and response variable appear to be appropriate.
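For reference, the Durbin-Watson statistic $d$ above is straightforward to compute from OLS residuals; a minimal sketch (sample size illustrative):

```python
import numpy as np

# Minimal sketch of the Durbin-Watson statistic computed from residuals.
# d is near 2 for uncorrelated errors, near 0 under strong positive
# first-order correlation, and near 4 under strong negative correlation.
def durbin_watson(e):
    """d = sum_{i=2}^n (e_i - e_{i-1})^2 / sum_{i=1}^n e_i^2."""
    e = np.asarray(e, dtype=float)
    return float(np.sum(np.diff(e) ** 2) / np.sum(e ** 2))

rng = np.random.default_rng(1)
e_iid = rng.standard_normal(5000)       # uncorrelated residuals
print(round(durbin_watson(e_iid)))      # prints 2
```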

The present paper proposes a simple test procedure without assumptions on the response variable and regressors that is valid in both low- and high-dimensional multivariate linear regression. The main idea, which is simple but proves to be useful, is to express the mean and variance of the test statistic by making use of the residual maker matrix. In addition to a general joint central limit theorem for several quadratic forms, which is proved in this paper and may have its own interest, we consider a Box-Pierce-type test for series correlation. Monte Carlo simulations show that our test procedures perform well in situations where some classic test procedures are inapplicable.

## 2. Test for series correlation in linear regression model

### 2.1. Notation

Let $X$ be the $n \times p$ design matrix, and let $R = I_n - H$ be the residual maker matrix, where $H = X(X'X)^{-1}X'$ is the hat matrix (also known as the projection matrix). We assume that the noise vector $\varepsilon = \Sigma^{1/2}\epsilon$, where $\epsilon = (\epsilon_1, \ldots, \epsilon_n)'$ is an $n$-dimensional random vector whose entries $\epsilon_i$ are independent with zero means, unit variances and the same finite fourth-order moments, so that the fourth-order cumulant $\nu_4 = \mathbb{E}\epsilon_i^4 - 3$ is common to all entries, and $\Sigma$ is an $n \times n$ nonnegative definite nonrandom matrix with bounded spectral norm. Then, the vector of OLS residuals is $e = (e_1, \ldots, e_n)' = R\varepsilon$. We will use $\circ$ to indicate the Hadamard product of two matrices in the rest of this paper.
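The objects in this subsection are simple to form numerically; a small sketch (sizes illustrative) checking the basic properties of $H$ and $R$:

```python
import numpy as np

# Sketch of the notation above: hat matrix H, residual maker R = I - H,
# and OLS residuals e = R @ y (sizes are illustrative).
rng = np.random.default_rng(2)
n, p = 50, 3
X = rng.standard_normal((n, p))

H = X @ np.linalg.inv(X.T @ X) @ X.T    # hat (projection) matrix
R = np.eye(n) - H                       # residual maker matrix

beta = np.array([1.0, -2.0, 0.5])
eps = rng.standard_normal(n)
y = X @ beta + eps
e = R @ y                               # OLS residuals; equals R @ eps since R @ X = 0

# R is symmetric and idempotent with tr(R) = n - p
print(np.allclose(R, R.T), np.allclose(R @ R, R), np.isclose(np.trace(R), n - p))
```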

### 2.2. Test for a given order series correlation

To test for a series correlation of a given order, for any integer $0 \le \tau < n$, denote

$$\gamma_\tau = \sum_{i=\tau+1}^{n} e_i e_{i-\tau} = e' P_\tau e = \epsilon' \Sigma^{1/2} R P_\tau R \Sigma^{1/2} \epsilon,$$

where $P_0 = I_n$ and, for $\tau \ge 1$, $P_\tau$ is the symmetric matrix with $(i, j)$ entry $1/2$ if $|i - j| = \tau$ and $0$ otherwise.

First, for $\tau \ge 0$, we have

$$\mathbb{E}\gamma_\tau = \mathbb{E}\operatorname{tr}\epsilon'\Sigma^{1/2} R P_\tau R \Sigma^{1/2}\epsilon = \operatorname{tr}\Sigma^{1/2} R P_\tau R \Sigma^{1/2} = \operatorname{tr} R P_\tau R \Sigma. \tag{2.1}$$

Denote $A = \Sigma^{1/2} R P_{\tau_1} R \Sigma^{1/2} = (a_{i,j})$ and $B = \Sigma^{1/2} R P_{\tau_2} R \Sigma^{1/2} = (b_{i,j})$; we then have, for $\tau_1, \tau_2 \ge 0$,

$$\begin{aligned} \operatorname{Cov}(\gamma_{\tau_1}, \gamma_{\tau_2}) &= \mathbb{E}\left(\epsilon' A \epsilon - \operatorname{tr} A\right)\left(\epsilon' B \epsilon - \operatorname{tr} B\right) \\ &= \mathbb{E}\left(\sum_i a_{i,i}(\epsilon_i^2 - 1) + \sum_{i \neq j} a_{i,j}\epsilon_i\epsilon_j\right)\left(\sum_i b_{i,i}(\epsilon_i^2 - 1) + \sum_{i \neq j} b_{i,j}\epsilon_i\epsilon_j\right) \\ &= \nu_4\sum_{i=1}^{n} a_{i,i} b_{i,i} + \operatorname{tr}(AB') + \operatorname{tr}(AB) \\ &= \nu_4\sum_{i=1}^{n} a_{i,i} b_{i,i} + 2\operatorname{tr}\left(R P_{\tau_1} R \Sigma R P_{\tau_2} R \Sigma\right), \tag{2.2} \end{aligned}$$

where the last equality uses the symmetry of $A$ and $B$.

Note that $\operatorname{tr} R = n - p$. For a given order $\tau \ge 1$, we want to test the hypothesis

$$H_0: \Sigma = \sigma^2 I_n, \quad \text{where } 0 < \sigma^2 < \infty,$$

against

$$H_{1,\tau}: \operatorname{Cov}(\varepsilon_i, \varepsilon_{i-\tau}) = \rho \neq 0.$$

Under the null hypothesis, due to (2.1) and (2.2), we obtain

$$\mathbb{E}\gamma_\tau = \sigma^2\operatorname{tr} P_\tau R, \tag{2.3}$$

and

$$\operatorname{Cov}(\gamma_{\tau_1}, \gamma_{\tau_2}) = \sigma^4\left(\nu_4\operatorname{tr}\left((R P_{\tau_1} R) \circ (R P_{\tau_2} R)\right) + 2\operatorname{tr}\left(R P_{\tau_1} R P_{\tau_2} R\right)\right). \tag{2.4}$$

Specifically, we have $\mathbb{E}\gamma_0 = \sigma^2(n - p)$ and

$$\operatorname{Var}(\gamma_0) = \sigma^4\left(\nu_4\operatorname{tr}(R \circ R) + 2(n - p)\right). \tag{2.5}$$

The validity of our test procedure requires the following mild assumptions.

(1): Assumption on $p$ and $n$:

The number of regressors $p$ and the sample size $n$ satisfy $p/n \to c \in [0, 1)$ as $n \to \infty$.

(2): Assumption on errors:

The fourth-order cumulant of the error distribution satisfies $\nu_4 > -2$.

Assumption (2) excludes only the rare case where the random errors are drawn from a two-point distribution with the same masses at $-1$ and $1$, for which $\nu_4 = -2$. However, if this situation occurs, our test remains valid if the design matrix satisfies the mild condition that

$$\limsup_{n \to \infty}\frac{\operatorname{tr}(R \circ R)}{n - p} = \limsup_{n \to \infty}\frac{\sum_{i=1}^{n} r_{i,i}^2}{\sum_{i=1}^{n} r_{i,i}} < 1.$$

These assumptions ensure that $\operatorname{Var}(\gamma_\tau)$ has the same order as $n$ as $n \to \infty$, thus satisfying the condition assumed in Theorem 4.1.
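The design condition above is easy to verify numerically for a given design matrix; a sketch with an illustrative Gaussian design, for which the ratio stays near $1 - p/n$:

```python
import numpy as np

# Numerical check (illustrative design) of the condition
#   limsup tr(R o R) / (n - p) < 1,  i.e.  sum_i r_ii^2 / sum_i r_ii < 1.
rng = np.random.default_rng(3)
for n, p in [(100, 5), (400, 20), (1600, 80)]:
    X = rng.standard_normal((n, p))
    R = np.eye(n) - X @ np.linalg.inv(X.T @ X) @ X.T
    ratio = np.sum(np.diag(R) ** 2) / (n - p)
    print(n, p, ratio < 1)
```

Since $0 \le r_{i,i} \le 1$, we always have $\sum_i r_{i,i}^2 \le \sum_i r_{i,i}$; the condition only rules out designs where this bound is asymptotically attained.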

Define

$$m_\tau = \frac{\mathbb{E}\gamma_\tau}{\sigma^2} = \operatorname{tr} P_\tau R, \qquad v_{\tau_1\tau_2} = \frac{\operatorname{Cov}(\gamma_{\tau_1}, \gamma_{\tau_2})}{n\sigma^4}.$$

By applying Theorem 4.1 presented in Section 4, we obtain that, for $\tau \ge 1$, as $n \to \infty$,

$$\frac{1}{\sqrt{n}}\left(\begin{pmatrix} \gamma_\tau \\ \gamma_0 \end{pmatrix} - \sigma^2\begin{pmatrix} m_\tau \\ m_0 \end{pmatrix}\right) \sim N\left(0, \begin{pmatrix} v_{\tau\tau} & v_{\tau 0} \\ v_{0\tau} & v_{00} \end{pmatrix}\right).$$

Then, by the delta method, we obtain, as $n \to \infty$,

$$T_\tau = \frac{\sqrt{n}\left(\frac{\gamma_\tau}{\gamma_0} - \mu_{\tau 0}\right)}{\sigma_{\tau 0}} \sim N(0, 1),$$

where $\mu_{\tau 0} = m_\tau/m_0$ and

$$\begin{aligned} \sigma^2_{\tau 0} &= \begin{pmatrix} \dfrac{1}{m_0/n} & -\dfrac{m_\tau/n}{(m_0/n)^2} \end{pmatrix} \begin{pmatrix} v_{\tau\tau} & v_{\tau 0} \\ v_{0\tau} & v_{00} \end{pmatrix} \begin{pmatrix} \dfrac{1}{m_0/n} \\ -\dfrac{m_\tau/n}{(m_0/n)^2} \end{pmatrix} \\ &= n^2\begin{pmatrix} \dfrac{1}{n - p} & -\dfrac{m_\tau}{(n - p)^2} \end{pmatrix} \begin{pmatrix} v_{\tau\tau} & v_{\tau 0} \\ v_{0\tau} & v_{00} \end{pmatrix} \begin{pmatrix} \dfrac{1}{n - p} \\ -\dfrac{m_\tau}{(n - p)^2} \end{pmatrix}. \tag{2.6} \end{aligned}$$

We reject $H_0$ in favor of $H_{1,\tau}$ if a large $|T_\tau|$ is observed.
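The building blocks of $T_\tau$ are straightforward to compute; the sketch below (illustrative sizes, $\tau = 1$; the studentizing term $\sigma_{\tau 0}$ from (2.6) is omitted for brevity) forms $\gamma_\tau$ and $m_\tau$ and checks that $\gamma_\tau/\gamma_0$ fluctuates around $\mu_{\tau 0}$ under $H_0$.

```python
import numpy as np

# Illustrative computation of the building blocks of T_tau for tau = 1:
# gamma_tau = e' P_tau e and m_tau = tr(P_tau R).  The studentizing term
# sigma_tau0 from (2.6) is omitted here for brevity.
rng = np.random.default_rng(4)
n, p, tau = 500, 10, 1

X = rng.standard_normal((n, p))
R = np.eye(n) - X @ np.linalg.inv(X.T @ X) @ X.T
e = R @ rng.standard_normal(n)          # residuals under H0 (sigma = 1)

# P_tau: symmetric matrix with entries 1/2 where |i - j| = tau
P = np.zeros((n, n))
idx = np.arange(n - tau)
P[idx, idx + tau] = 0.5
P[idx + tau, idx] = 0.5

gamma_tau = e @ P @ e                   # equals sum_{i > tau} e_i e_{i - tau}
gamma_0 = e @ e
m_tau = np.trace(P @ R)
m_0 = np.trace(R)                       # equals n - p

# Under H0, gamma_tau / gamma_0 fluctuates around mu_tau0 = m_tau / m_0
print(abs(gamma_tau / gamma_0 - m_tau / m_0) < 0.3)
```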

### 2.3. A portmanteau test for series correlation

In time series analysis, the Box-Pierce test proposed in [Box and Pierce(1970)] and the Ljung-Box statistic proposed in [Ljung and Box(1978)] are two portmanteau tests of whether any of a group of autocorrelations of a time series are different from zero. For a linear regression model, consider the following hypothesis

$$H_0: \Sigma = \sigma^2 I_n,$$

against

$$H_1: \text{there exists } 1 \le \tau \le q \text{ such that } \operatorname{Cov}(\varepsilon_i, \varepsilon_{i-\tau}) = \rho \neq 0.$$

Applying Theorem 4.1 and the delta method, we shall now consider the following asymptotically standard normally distributed statistic:

$$T(q) = \frac{\sqrt{n}\left(\sum_{\tau=1}^{q}\left(2 - \frac{\gamma_\tau}{\gamma_0}\right)^2 - \sum_{\tau=1}^{q}\left(2 - \mu_{\tau 0}\right)^2\right)}{\sigma_T} \sim N(0, 1)$$

as $n \to \infty$, where $\mu_{\tau 0} = m_\tau/m_0$ and $\sigma_T^2 = \nabla'\Sigma_T\nabla$ with

$$\nabla = \left(\frac{2n\sum_{\tau=1}^{q} m_\tau\left(2 - \frac{m_\tau}{n-p}\right)}{(n-p)^2},\ -\frac{2n\left(2 - \frac{m_1}{n-p}\right)}{n-p},\ -\frac{2n\left(2 - \frac{m_2}{n-p}\right)}{n-p},\ \cdots,\ -\frac{2n\left(2 - \frac{m_q}{n-p}\right)}{n-p}\right)',$$

and

$$\Sigma_T = (v_{i,j})_{(q+1)\times(q+1)}, \quad i, j = 0, \cdots, q.$$

Then, we reject $H_0$ in favor of $H_1$ if a large $|T(q)|$ is observed.

### 2.4. Discussion of the statistics

In the present subsection, we discuss the asymptotic parameters of the two proposed statistics.

If the entries in the design matrix $X$ are assumed to be i.i.d. standard normal, then we know that as $n \to \infty$, the diagonal entries in the symmetric and idempotent matrices $H$ and $R$ are of constant order while the off-diagonal entries are of order $n^{-1/2}$. Then, the order of $m_\tau = \operatorname{tr} P_\tau R$ for a given $\tau \ge 1$ is at most $\sqrt{n}$, since it is exactly the summation of the $\tau$-th off-diagonal entries of $R$. Thus, elementary analysis shows that $\mu_{\tau 0} = m_\tau/(n - p) = O(n^{-1/2})$.

For a fixed design or a more general random design, it becomes almost impossible to study the matrices $H$ and $R$ beyond some of their elementary properties. Thus, for the purpose of obtaining accurate statistical inference, we suggest the use of the original parameters $m_\tau$ and $v_{\tau_1\tau_2}$, since we have little information on the distribution of the regressors in a fixed design, and the calculation of those parameters is not excessively complex.

## 3. Simulation studies

In this section, Monte Carlo simulations are conducted to investigate the performance of our proposed tests.

### 3.1. Performance of test for first-order series correlation

First, we consider the test for first-order series correlation of the error terms in the multivariate linear regression model (1.1). Note that although our theoretical results were derived by treating the design matrix as a constant matrix, we also need to obtain a design matrix under a certain random model in the simulations. We thus consider the situation where the regressors are lagged dependent. Formally, for a given constant $r$, we set

$$x_{t,j} = r x_{t-1,j} + u_{t,j}, \quad j = 1, \cdots, p,\ t = 1, \cdots, n,$$

where the initial values $x_{0,j}$ and the innovations $u_{t,j}$ are independently drawn from N(0,1), while the components of the coefficient vector $\beta$ are independently chosen from a Student's t-distribution with 5 degrees of freedom. The random errors obey (1) the normal distribution N(0,1) and (2) the uniform distribution U(-1,1). The significance level is set to $\alpha$.

Table 1 and Table 2 show the empirical size of our test (denoted as $T_1$) for different values of $n$ and $p$ under the two error distributions. To investigate the power of our test, we randomly choose a $\rho \neq 0$ and consider the following AR(1) model:

$$\varepsilon_t = \rho\varepsilon_{t-1} + \vartheta_t, \quad t = 1, \cdots, n,$$

where the innovations $\vartheta_t$ are independently drawn from (1) N(0,1) and (2) U(-1,1). Tables 3 and 4 show the empirical power of our proposed test for different values of $n$ and $p$ under the two error distributions.

These simulation results show that our test always has good size and power when $n$ is large and is thus applicable under the framework that $p/n \to c$ as $n \to \infty$.
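The simulation design above can be sketched as follows ($r$, $\rho$ and the sizes are illustrative choices, not the paper's exact settings):

```python
import numpy as np

# Sketch of the simulation design: lagged-dependent regressors
# x_{t,j} = r * x_{t-1,j} + u_{t,j} and AR(1) errors
# eps_t = rho * eps_{t-1} + theta_t.  r, rho and sizes are illustrative.
def make_design(n, p, r, rng):
    X = np.zeros((n, p))
    X[0] = rng.standard_normal(p)
    for t in range(1, n):
        X[t] = r * X[t - 1] + rng.standard_normal(p)
    return X

def ar1_errors(n, rho, rng):
    eps = np.zeros(n)
    eps[0] = rng.standard_normal()
    for t in range(1, n):
        eps[t] = rho * eps[t - 1] + rng.standard_normal()
    return eps

rng = np.random.default_rng(5)
X = make_design(200, 5, 0.5, rng)
eps = ar1_errors(200, 0.4, rng)
lag1 = np.mean(eps[1:] * eps[:-1]) / np.var(eps)
print(0.1 < lag1 < 0.7)   # sample lag-1 autocorrelation is near rho = 0.4
```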

### 3.2. Performance of the Box-Pierce type test

This subsection investigates the performance of our proposed Box-Pierce type test statistic $T(q)$ from subsection 2.3. The design matrix is obtained in the same way as in the last subsection, and the random error terms are assumed to obey (1) the normal distribution N(0,1) and (2) the gamma distribution with parameters 4 and 1/2. Table 5 and Table 6 show the empirical size of our test for different values of $n$ and $p$ under the two error distributions. We consider the following AR(2) model to assess the power:

$$\varepsilon_t = \rho_1\varepsilon_{t-1} + \rho_2\varepsilon_{t-2} + \vartheta_t, \quad t = 1, \cdots, n,$$

where the innovations $\vartheta_t$ are independently drawn from (1) N(0,1) and (2) Gamma(4, 1/2). The design matrix is obtained in the same way as before. Tables 7 and 8 show the empirical power of our proposed test for different values of $n$ and $p$ under the two error distributions.

As shown by these simulation results, the empirical size and empirical power of the portmanteau test improve as $n$ tends to infinity.

### 3.3. Parameter estimation under the null hypothesis

In practice, if the error terms are not Gaussian, we need to estimate the fourth-order cumulant $\nu_4$ to perform the test. We now give a suggested estimate under the additional assumption that the error terms are independent under the null hypothesis. Note that an unbiased estimate of the variance $\sigma^2$ under the null hypothesis is

$$\hat{\sigma}^2_n = \frac{\gamma_0}{n - p},$$

and

$$\mathbb{E}\sum_{i=1}^{n} e_i^4 = 3\sigma^4\sum_{i=1}^{n}\sum_{j_1, j_2} r_{i,j_1}^2 r_{i,j_2}^2 + \nu_4\sigma^4\sum_{i=1}^{n}\sum_{j=1}^{n} r_{i,j}^4 = 3\sigma^4\operatorname{tr}(R \circ R) + \nu_4\sigma^4\operatorname{tr}\left((R \circ R)^2\right).$$

Then, $\nu_4$ can be estimated by the consistent estimator

$$\hat{\nu}_4 = \frac{\sum_{i=1}^{n} e_i^4 - 3\hat{\sigma}^4_n\operatorname{tr}(R \circ R)}{\hat{\sigma}^4_n\operatorname{tr}\left((R \circ R)^2\right)}.$$
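A sketch of these two estimators (illustrative sizes; Gaussian errors, for which the fourth-order cumulant is 0):

```python
import numpy as np

# Sketch of the suggested estimators under H0: sigma2_hat = gamma_0 / (n - p)
# and the plug-in estimator nu4_hat of the fourth-order cumulant.
# Sizes are illustrative; errors are Gaussian, so nu_4 = 0.
rng = np.random.default_rng(6)
n, p = 2000, 5
X = rng.standard_normal((n, p))
R = np.eye(n) - X @ np.linalg.inv(X.T @ X) @ X.T
e = R @ rng.standard_normal(n)          # residuals under H0 with sigma = 1

sigma2_hat = (e @ e) / (n - p)          # unbiased for sigma^2 under H0
RR = R * R                              # Hadamard product R o R
nu4_hat = (np.sum(e ** 4) - 3 * sigma2_hat ** 2 * np.trace(RR)) / (
    sigma2_hat ** 2 * np.trace(RR @ RR))
print(abs(sigma2_hat - 1.0) < 0.15, abs(nu4_hat) < 1.0)
```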

## 4. A general joint CLT for several general quadratic forms

In this section, we establish a general joint CLT for several general quadratic forms, which helps us to find the asymptotic distribution of the statistics for testing the series correlations. We believe that the result presented below may have its own interest.

### 4.1. A brief review of random quadratic forms

Quadratic forms play an important role not only in mathematical statistics but also in many other branches of mathematics, such as number theory, differential geometry, linear algebra and differential topology. Suppose $\varepsilon = (\varepsilon_1, \cdots, \varepsilon_n)'$, where $\varepsilon_1, \cdots, \varepsilon_n$ is a sample of size $n$ drawn from a certain standardized population. Let $A$ be an $n \times n$ matrix. Then, $\varepsilon' A \varepsilon$ is called a random quadratic form in $\varepsilon$. The random quadratic forms of normal variables, especially when $A$ is symmetric, have been considered by many authors, who have achieved fruitful results. We refer the reader to [Bartlett et al.(1960)Bartlett, Gower, and Leslie, Darroch(1961), Gart(1970), Hsu et al.(1999)Hsu, Prentice, Zhao, and Fan, Forchini(2002), Dik and De Gunst(2010), Al-Naffouri et al.(2016)Al-Naffouri, Moinuddin, Ajeeb, Hassibi, and Moustakas]. Furthermore, many authors have considered the more general situation, where the entries of $\varepsilon$ follow a non-Gaussian distribution. For the properties of those types of random quadratic forms, we refer the reader to [Fox and Taqqu(1985), Cambanis et al.(1985)Cambanis, Rosinski, and Woyczynski, de Jong(1987), Gregory and Hughes(1995), Gotze and Tikhomirov(1999), Liu et al.(2009)Liu, Tang, and Zhang, Deya and Nourdin(2014), Oliveira(2016)] and the references therein.

However, few studies have considered the joint distribution of several quadratic forms. Thus, in this paper, we want to establish a general joint CLT for several random quadratic forms with general distributions.

### 4.2. Assumptions and results

To this end, suppose

$$\Xi = \begin{pmatrix} \varepsilon^{(1)}_1 & \varepsilon^{(1)}_2 & \cdots & \varepsilon^{(1)}_n \\ \varepsilon^{(2)}_1 & \varepsilon^{(2)}_2 & \cdots & \varepsilon^{(2)}_n \\ \vdots & & \ddots & \vdots \\ \varepsilon^{(q)}_1 & \varepsilon^{(q)}_2 & \cdots & \varepsilon^{(q)}_n \end{pmatrix}$$

is a $q \times n$ random matrix. Let $A_1, \cdots, A_q$ be nonrandom $n \times n$ matrices. Define $Q_l = \varepsilon'_{(l)} A_l \varepsilon_{(l)}$ for $l = 1, \cdots, q$, where $\varepsilon_{(l)} = (\varepsilon^{(l)}_1, \cdots, \varepsilon^{(l)}_n)'$ denotes the $l$-th row of $\Xi$. We are interested in the asymptotic distribution, as $n \to \infty$, of the random vector $(Q_1, \cdots, Q_q)'$, which consists of $q$ random quadratic forms. Now, we make the following assumptions.

• (a) The entries $\varepsilon^{(l)}_i$ of $\Xi$ are standard random variables (mean zero and variance one) with uniformly bounded fourth-order moments $M^{(l)}_{4,i} = \mathbb{E}(\varepsilon^{(l)}_i)^4$.

• (b) The columns of $\Xi$ are independent.

• (c) The spectral norms of the square matrices $A_1, \cdots, A_q$ are uniformly bounded in $n$.

Clearly, for $l = 1, \cdots, q$, we have $\mathbb{E}Q_l = \operatorname{tr} A_l$, and for $l_1, l_2 \in \{1, \cdots, q\}$, writing $A_l = (a^{(l)}_{i,j})$, we obtain

$$\begin{aligned} \operatorname{Cov}(Q_{l_1}, Q_{l_2}) &= \mathbb{E}(Q_{l_1} - \mathbb{E}Q_{l_1})(Q_{l_2} - \mathbb{E}Q_{l_2}) \\ &= \mathbb{E}\left(\sum_{i=1}^{n} a^{(l_1)}_{i,i}\left((\varepsilon^{(l_1)}_i)^2 - 1\right) + \sum_{i \neq j} a^{(l_1)}_{i,j}\varepsilon^{(l_1)}_i\varepsilon^{(l_1)}_j\right)\left(\sum_{i=1}^{n} a^{(l_2)}_{i,i}\left((\varepsilon^{(l_2)}_i)^2 - 1\right) + \sum_{i \neq j} a^{(l_2)}_{i,j}\varepsilon^{(l_2)}_i\varepsilon^{(l_2)}_j\right) \\ &= \sum_{i=1}^{n} a^{(l_1)}_{i,i} a^{(l_2)}_{i,i}\mathbb{E}\left((\varepsilon^{(l_1)}_i)^2 - 1\right)\left((\varepsilon^{(l_2)}_i)^2 - 1\right) + \sum_{i \neq j} a^{(l_1)}_{i,j} a^{(l_2)}_{i,j}\mathbb{E}\varepsilon^{(l_1)}_i\varepsilon^{(l_2)}_i\,\mathbb{E}\varepsilon^{(l_1)}_j\varepsilon^{(l_2)}_j \\ &\quad + \sum_{i \neq j} a^{(l_1)}_{i,j} a^{(l_2)}_{j,i}\mathbb{E}\varepsilon^{(l_1)}_i\varepsilon^{(l_2)}_i\,\mathbb{E}\varepsilon^{(l_1)}_j\varepsilon^{(l_2)}_j. \tag{4.1} \end{aligned}$$

Let $M^{(l)}_{4,i} = \mathbb{E}(\varepsilon^{(l)}_i)^4$; then, we have

$$\operatorname{Var}(Q_l) = \sum_{i=1}^{n}\left(M^{(l)}_{4,i} - 3\right)\left(a^{(l)}_{i,i}\right)^2 + \operatorname{tr} A_l A_l' + \operatorname{tr} A_l^2.$$

Thus, according to assumptions (a)-(c), for any $l$, $\operatorname{Var}(Q_l)$ at most has the same order as $n$. This result also holds for any $\operatorname{Cov}(Q_{l_1}, Q_{l_2})$ by applying the Cauchy-Schwarz inequality. We then have the following theorem.
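The mean and variance formulas above are easy to check by Monte Carlo; a sketch with an illustrative non-symmetric $A$ and uniform (non-Gaussian) entries:

```python
import numpy as np

# Monte Carlo check (illustrative) of E(Q) = tr(A) and
# Var(Q) = sum_i (M4_i - 3) a_ii^2 + tr(A A') + tr(A^2)
# for Q = eps' A eps with iid standardized entries.
rng = np.random.default_rng(7)
n, reps = 30, 20000
A = rng.standard_normal((n, n)) / np.sqrt(n)

# Uniform(-sqrt(3), sqrt(3)): mean 0, variance 1, fourth moment M4 = 9/5
eps = rng.uniform(-np.sqrt(3), np.sqrt(3), size=(reps, n))
Q = np.einsum('ri,ij,rj->r', eps, A, eps)

M4 = 9 / 5
var_theory = ((M4 - 3) * np.sum(np.diag(A) ** 2)
              + np.trace(A @ A.T) + np.trace(A @ A))
print(abs(np.mean(Q) - np.trace(A)) < 0.2,
      abs(np.var(Q) / var_theory - 1) < 0.1)
```

Note that the formula holds for a general (not necessarily symmetric) $A$, since the third-moment cross terms vanish when the entries have zero mean.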

###### Theorem 4.1.

In addition to assumptions (a)-(c), suppose that there exists a constant $c > 0$ such that, for each $l$, $\operatorname{Var}(Q_l) \ge cn$ when $n$ is large, i.e., $\operatorname{Var}(Q_l)$ has the same order as $n$. Then, the distribution of the random vector $\frac{1}{\sqrt{n}}(Q_1 - \mathbb{E}Q_1, \cdots, Q_q - \mathbb{E}Q_q)'$ is asymptotically $q$-dimensional normal.

### 4.3. Proof of Theorem 4.1

We are now in position to present the proof of the joint CLT via the method of moments. The procedure of the proof is similar to that in [Bai et al.(2018)Bai, Pan, and Yin] but is more complex, since we need to establish the CLT for a $q$-dimensional, rather than 2-dimensional, random vector. Moreover, we do not assume the underlying distributions to be symmetric and identically distributed. The proof is separated into three steps.

#### 4.3.1. Step 1: Truncation

Noting that $\mathbb{E}(\varepsilon^{(i)}_j)^4 \le M < \infty$, for any $\delta > 0$, we have $nP(|\varepsilon^{(i)}_j| \ge \delta n^{1/4}) \le \delta^{-4}\mathbb{E}(\varepsilon^{(i)}_j)^4 I(|\varepsilon^{(i)}_j| \ge \delta n^{1/4}) \to 0$. Thus, we may select a sequence $\delta_n \downarrow 0$ such that $nP(|\varepsilon^{(i)}_j| \ge \delta_n n^{1/4}) \to 0$. The convergence rate of $\delta_n$ to 0 can be made arbitrarily slow. Define $\tilde{Q}_l$ to be the analogue of $Q_l$ with $\varepsilon^{(l)}_j$ replaced by $\tilde{\varepsilon}^{(l)}_j$, where $\tilde{\varepsilon}^{(l)}_j = \varepsilon^{(l)}_j I(|\varepsilon^{(l)}_j| < \delta_n n^{1/4})$. Then,

$$P\left((\tilde{Q}_1, \tilde{Q}_2, \cdots, \tilde{Q}_q) \neq (Q_1, Q_2, \cdots, Q_q)\right) \le \sum_{i=1}^{q}\sum_{j=1}^{n} P\left(\varepsilon^{(i)}_j \neq \tilde{\varepsilon}^{(i)}_j\right) \le qn\max_{i,j} P\left(|\varepsilon^{(i)}_j| \ge \delta_n n^{1/4}\right) \to 0.$$

Therefore, we need only to investigate the limiting distribution of the vector $(\tilde{Q}_1, \tilde{Q}_2, \cdots, \tilde{Q}_q)$.

#### 4.3.2. Step 2: Centralization and Rescaling

Define $\breve{Q}_l$ to be the analogue of $\tilde{Q}_l$ with $\tilde{\varepsilon}^{(l)}_j$ replaced by the standardized variables $\breve{\varepsilon}^{(l)}_j = (\tilde{\varepsilon}^{(l)}_j - \mathbb{E}\tilde{\varepsilon}^{(l)}_j)/\sqrt{\operatorname{Var}\tilde{\varepsilon}^{(l)}_j}$. Denote by $d(X, Y) = (\mathbb{E}|X - Y|^2)^{1/2}$ the distance between two random variables $X$ and $Y$. Additionally, denote $\Psi^{(l)} = \operatorname{diag}\left(\sqrt{\operatorname{Var}\tilde{\varepsilon}^{(l)}_1}, \cdots, \sqrt{\operatorname{Var}\tilde{\varepsilon}^{(l)}_n}\right)$, $\tilde{\varepsilon}_{(l)} = (\tilde{\varepsilon}^{(l)}_1, \cdots, \tilde{\varepsilon}^{(l)}_n)'$ and $\breve{\varepsilon}_{(l)} = (\breve{\varepsilon}^{(l)}_1, \cdots, \breve{\varepsilon}^{(l)}_n)'$. We obtain that for any $l$,

$$\begin{aligned} d^2(\tilde{Q}_l, \breve{Q}_l) &= \mathbb{E}\left|\breve{\varepsilon}'_{(l)} A_l \breve{\varepsilon}_{(l)} - \tilde{\varepsilon}'_{(l)} A_l \tilde{\varepsilon}_{(l)}\right|^2 \\ &= \mathbb{E}\left|\breve{\varepsilon}'_{(l)} A_l \breve{\varepsilon}_{(l)} - \breve{\varepsilon}'_{(l)}\Psi^{(l)} A_l \Psi^{(l)}\breve{\varepsilon}_{(l)} - \mathbb{E}\tilde{\varepsilon}'_{(l)}\Psi^{(l)} A_l \Psi^{(l)}\mathbb{E}\tilde{\varepsilon}_{(l)}\right|^2 \\ &\le 2\left(\mathbb{E}\left|\breve{\varepsilon}'_{(l)} A_l \breve{\varepsilon}_{(l)} - \breve{\varepsilon}'_{(l)}\Psi^{(l)} A_l \Psi^{(l)}\breve{\varepsilon}_{(l)}\right|^2 + \left|\mathbb{E}\tilde{\varepsilon}'_{(l)}\Psi^{(l)} A_l \Psi^{(l)}\mathbb{E}\tilde{\varepsilon}_{(l)}\right|^2\right) \\ &\triangleq 2\left(\Upsilon_{l,1} + \Upsilon_{l,2}\right). \tag{4.2} \end{aligned}$$

Noting that the $\breve{\varepsilon}^{(l)}_j$'s are independent random variables with zero means and unit variances, it follows that

$$\begin{aligned} \Upsilon_{l,1} &= \mathbb{E}\left(\breve{\varepsilon}'_{(l)} A_l \breve{\varepsilon}_{(l)} - \breve{\varepsilon}'_{(l)}\Psi^{(l)} A_l \Psi^{(l)}\breve{\varepsilon}_{(l)}\right)^2 = \mathbb{E}\left(\breve{\varepsilon}'_{(l)}\left(A_l - \Psi^{(l)} A_l \Psi^{(l)}\right)\breve{\varepsilon}_{(l)}\right)^2 \\ &= \nu_4\operatorname{tr}\left(\left(A_l - \Psi^{(l)} A_l \Psi^{(l)}\right) \circ \left(A_l - \Psi^{(l)} A_l \Psi^{(l)}\right)\right) + \operatorname{tr}\left(A_l - \Psi^{(l)} A_l \Psi^{(l)}\right)^2 \\ &\quad + \operatorname{tr}\left(\left(A_l - \Psi^{(l)} A_l \Psi^{(l)}\right)\left(A_l - \Psi^{(l)} A_l \Psi^{(l)}\right)'\right). \end{aligned}$$

Since $\left|\mathbb{E}\tilde{\varepsilon}^{(l)}_j\right| = \left|\mathbb{E}\varepsilon^{(l)}_j I\left(|\varepsilon^{(l)}_j| \ge \delta_n n^{1/4}\right)\right| \le C\delta_n^{-3} n^{-3/4}$ and

$$1 - \mathbb{E}\left(\tilde{\varepsilon}^{(l)}_j\right)^2 = \mathbb{E}\left(\varepsilon^{(l)}_j\right)^2 I\left(|\varepsilon^{(l)}_j| \ge \delta_n n^{1/4}\right) \le C\delta_n^{-2} n^{-1/2},$$

we know that

$$\left\|I - \Psi^{(l)}\right\| = \max_{j=1,\cdots,n}\left|1 - \sqrt{\operatorname{Var}\tilde{\varepsilon}^{(l)}_j}\right| \le \max_{j=1,\cdots,n}\sqrt{\left|1 - \operatorname{Var}\tilde{\varepsilon}^{(l)}_j\right|} \le C\delta_n^{-1} n^{-1/4}. \tag{4.3}$$

Then, we have

$$\left\|A_l - \Psi^{(l)} A_l \Psi^{(l)}\right\| \le \|A_l\|\left\|I - \Psi^{(l)}\right\| + \left\|I - \Psi^{(l)}\right\|\|A_l\|\left\|\Psi^{(l)}\right\| = O\left(\delta_n^{-1} n^{-1/4}\right).$$

It follows that $\Upsilon_{l,1} = o(n)$ and $\Upsilon_{l,2} = o(n)$. By combining the above estimates, we obtain that, for $l = 1, \cdots, q$, $d^2(\tilde{Q}_l, \breve{Q}_l) = o(n)$.

Noting that the entries in the covariance matrix of the random vector $(\breve{Q}_1, \cdots, \breve{Q}_q)'$ have at most the same order as $n$, we conclude that