## 1 Introduction

In recent decades, techniques using high-order statistics (HOS) (Nikias and Petropulu, 1993; Haykin, 2000; Cichocki and Amari, 2002; Comon and Jutten, 2010) have grown considerably. The reason is that some problems were not solved by using simple techniques based on first and second order statistics only. Therefore, it is important to know whether or not HOS are providing information in a given data set. On the other hand, normality tests are also of interest in their own right; they can, for example, be used to detect abrupt changes in dynamical systems (Basseville and Nikiforov, 1993).

Let $x(n)$ be a real $p$-variate stochastic process. We assume that a sample of finite size $N$ is observed, $X = \{x(1), \dots, x(N)\}$. Our goal is to implement the following test *without alternative*, in such a way that it can be executed in real time, e.g. over a sliding window.

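Problem P1: Given a finite sample of size $N$, $X = \{x(1), \dots, x(N)\}$:

$$H: \; X \text{ is Gaussian} \quad \text{versus} \quad \bar{H}: \; X \text{ is non-Gaussian}, \tag{1}$$

where variables $x(n)$ are identically distributed, but not statistically independent.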

This well-known problem is twofold: (i) define a test variable, and (ii) determine its asymptotic distribution (often itself normal) in order to assess the power of the test, that is, the probability to decide that the sample is non-Gaussian (hypothesis $\bar{H}$) whereas the Gaussian hypothesis $H$ is true.

One should distinguish between *scalar* and *multivariate* tests, the latter addressing joint normality of several variables. Since the so-called Chi-squared test proposed by Fisher and improved in (Moore, 1971), the most popular scalar test is probably the omnibus test based on skewness and kurtosis by (Bowman and Shenton, 1975). This test combines estimated skewness $b_1$ and kurtosis $b_2$, weighted by the inverse of their respective asymptotic variance:

$$N \left( \frac{b_1^2}{6} + \frac{(b_2 - 3)^2}{24} \right), \tag{2}$$

where $b_1 = m_3 / m_2^{3/2}$, $b_2 = m_4 / m_2^{2}$, $\bar{x}$ is the sample mean and $m_k = \frac{1}{N} \sum_{n=1}^{N} (x(n) - \bar{x})^k$, for $k \in \{2, 3, 4\}$. The asymptotic variance of $b_1$ and $b_2$ is indeed $6/N$ and $24/N$ under the assumption that samples are independently and identically distributed (i.i.d.) and normal; see (Mardia, 1974), (Kotz and Johnson, 1982). The asymptotic distribution of the test is $\chi^2_2$ when samples are i.i.d. normal. However, as pointed out by (Moore, 1982), the Chi-square test is very sensitive to the dependence between samples; the process color yields a loss in apparent normality (Gasser, 1975).
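The skewness/kurtosis omnibus statistic discussed above can be sketched in a few lines of Python. This is only an illustration under the i.i.d. assumption: the function name `omnibus_statistic` and the use of NumPy are choices made here, not part of the original text, and the weights $6/N$ and $24/N$ are the classical asymptotic variances of the sample skewness and kurtosis.

```python
import numpy as np

def omnibus_statistic(x):
    """Skewness/kurtosis omnibus statistic of a 1-D sample (i.i.d. assumption).

    Hypothetical helper written for illustration only.
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    d = x - x.mean()                 # centered sample
    m2 = np.mean(d**2)               # central moments of order 2, 3, 4
    m3 = np.mean(d**3)
    m4 = np.mean(d**4)
    b1 = m3 / m2**1.5                # sample skewness (0 for a Gaussian)
    b2 = m4 / m2**2                  # sample kurtosis (3 for a Gaussian)
    stat = n * (b1**2 / 6.0 + (b2 - 3.0)**2 / 24.0)
    return stat, b1, b2

rng = np.random.default_rng(0)
stat_gauss, _, _ = omnibus_statistic(rng.standard_normal(5000))
stat_unif, _, _ = omnibus_statistic(rng.uniform(-1.0, 1.0, 5000))
print(stat_gauss, stat_unif)
```

For i.i.d. Gaussian data the statistic should behave like a chi-square variable with 2 degrees of freedom, whereas the uniform sample (kurtosis 1.8) should produce a much larger value. When samples are correlated, this calibration is no longer valid, which is the issue the rest of the paper addresses.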

Actually, most of the tests proposed in the literature assume that observations are i.i.d., see (Shapiro et al., 1968) or (Pearson et al., 1977). This is also true for multivariate tests (Mardia, 1970; Andrews et al., 1973); see the survey of (Henze, 2002).

Although samples are often correlated in practice, few tests are dedicated to colored processes. For instance, the linearity test of (Hinich, 1982) can serve as a normality test: it is based on the bispectrum, which is constant if the process is linear, and that constant is zero in the Gaussian case. One could build a similar test based on the trispectrum, since estimated multispectra of higher order are also asymptotically normal (Brillinger, 1981).

In practice, nonlinear functions applied to $x(n)$ can go beyond monomials of degree 3 or 4 (Moulines et al., 1992). For instance, some tests are based on the characteristic function (Epps, 1987; Moulines et al., 1992) and others on entropy (Steinberg and Zeitouni, 1992). These tests are complex to implement in practice.

Except for tests based on arbitrary 1D projections (Mardia, 1970; Malkovich and Afifi, 1973; Nieto-Reyes et al., 2014), which we shall discuss later, all the tests reviewed above can hardly be executed in real time on a light processor as soon as they are valid for statistically dependent samples. For this reason, we shall focus on the multivariate kurtosis proposed in (Mardia, 1970), and derive its mean and variance when samples are assumed to be statistically dependent.

When deriving theoretical properties in the remainder, it is supposed that $x(n)$ is a zero-mean stationary process, with finite moments up to order 16. Its covariance matrix function is denoted by

$$S(\tau) = \mathbb{E}\left\{ x(n)\, x(n-\tau)^{T} \right\}. \tag{3}$$

For the sake of conciseness, $S(0)$ will be merely denoted by $S$. In addition, we assume the following mixing condition upon $x(n)$: $\sum_{\tau} |S_{ab}(\tau)|$ converges to a finite limit $\Omega_{ab}$, $\forall (a, b)$, where $S_{ab}$ denote the entries of matrix $S$.

##### Contribution

Our main contributions are the following.
- We provide a multivariate test for Gaussianity that can be implemented in real time, whereas most conventional tests are univariate. One could apply univariate tests to each component, but this would not test for *joint normality* and can lead to misdetections; this fact is subsequently illustrated with copulas.
- A general procedure is provided to compute the asymptotic mean and variance of Mardia's multivariate kurtosis when samples are *statistically dependent*.
- We then provide the complete expressions of the mean and variance of the test variable in the general case when $x$ is of dimension $p = 2$, which makes it possible to test joint normality if two arbitrary projections are performed in a first stage, in the same spirit as in (Malkovich and Afifi, 1973) for the i.i.d. case. These results are summarized in Section 7.
- Additionally, we address the particular case where $x(n)$ is constructed from a single scalar colored process.

This article is organized as follows. Section 2 contains the definition of the test statistic, followed by Section 3 where necessary tools are introduced to conduct the calculations. The moments involved in deriving both the mean and variance of the test statistic are given in Sections 4-5, then their exact expressions for various cases are given in Sections 6-8. Section 9 reports some computer experiments. We defer the expressions of the moments and details about the computation to appendices in Section 11.

## 2 Multivariate kurtosis

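The test proposed in (Mardia, 1970) takes the form:

$$B_p = \mathbb{E}\left\{ \left( x^{T} S^{-1} x \right)^2 \right\}. \tag{4}$$

For $x \sim \mathcal{N}(0, S)$, one can show that $B_p = p(p+2)$. Its sample counterpart for a sample of size $N$ is:

$$B_p(N) = \frac{1}{N} \sum_{n=1}^{N} \left( x(n)^{T} S^{-1} x(n) \right)^2. \tag{5}$$

One advantage of this test variable is that it is invariant with respect to linear transformations, i.e., $y = A x$ with $A$ invertible. In practice, the covariance matrix $S$ is unknown and is replaced by its sample estimate, $\hat{S}$, so that we end up with the following test variable:

$$\hat{B}_p(N) = \frac{1}{N} \sum_{n=1}^{N} \left( x(n)^{T} \hat{S}^{-1} x(n) \right)^2, \tag{6}$$

with

$$\hat{S} = \frac{1}{N} \sum_{n=1}^{N} x(n)\, x(n)^{T}.$$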

The multivariate normality test can be formulated in terms of the multivariate kurtosis: the variable $x$ is said to be normal if $|\hat{B}_p(N) - p(p+2)| \le \eta$, where $\eta$ is a threshold to be determined as a function of the power of the test. Whether or not $\hat{B}_p(N)$ is a good estimate of $B_p$ is irrelevant; what is important is to have a sufficiently accurate estimation of the power of the test. In order to do that, we need to assess the mean and variance of $\hat{B}_p(N)$ under $H$. Under the assumption that $x(n)$ are i.i.d. realizations of variable $x$, the mean and variance of $\hat{B}_p(N)$ have been calculated:

###### Theorem 2.1

(Mardia, 1970) Let $x(n)$ be i.i.d. of dimension $p$. Then under the null hypothesis $H$, $\hat{B}_p(N)$ is asymptotically normal, with mean $p(p+2)$ and variance $8p(p+2)/N$.
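A minimal Python sketch of the i.i.d. version of the test may help fix ideas. It assumes zero-mean observations stored row-wise in an `(N, p)` NumPy array, and it standardizes the sample multivariate kurtosis with the classical i.i.d. values $p(p+2)$ and $8p(p+2)/N$; the function names `mardia_kurtosis` and `iid_zscore` are hypothetical and introduced here for illustration only.

```python
import numpy as np

def mardia_kurtosis(X):
    """Sample multivariate kurtosis of an (N, p) array of zero-mean observations."""
    X = np.asarray(X, dtype=float)
    N = X.shape[0]
    S_hat = X.T @ X / N                          # sample covariance (zero-mean assumption)
    # quadratic forms x(n)^T inv(S_hat) x(n), computed without forming the inverse
    q = np.sum(X * np.linalg.solve(S_hat, X.T).T, axis=1)
    return np.mean(q**2)

def iid_zscore(X):
    """Standardized statistic, valid only if the rows of X are i.i.d."""
    N, p = X.shape
    mean_iid = p * (p + 2)
    var_iid = 8.0 * p * (p + 2) / N
    return (mardia_kurtosis(X) - mean_iid) / np.sqrt(var_iid)

rng = np.random.default_rng(3)
X = rng.standard_normal((20000, 3))              # i.i.d. Gaussian sample, p = 3
z = iid_zscore(X)
print(mardia_kurtosis(X), z)
```

Under the i.i.d. Gaussian hypothesis, `z` should be roughly standard normal, so comparing `abs(z)` to a quantile of the normal distribution gives a decision rule. With colored samples, the mean and variance used in `iid_zscore` are no longer appropriate.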

Our purpose is now to state a similar theorem when the samples $x(n)$ are not independent. Since this involves heavy calculations, we need to introduce some tools to make them possible.

## 3 Statistical and combinatorial tools

### 3.1 Lemmas

The estimated multivariate kurtosis (6) is a rational function of degree 4. Since we wish to calculate its asymptotic first and second order moments, when $N$ tends to infinity, we may expand this rational function about its mean. The first step is to expand the estimated covariance $\hat{S}$. Let $\hat{S} = S + \Delta$, where $\Delta$ is small compared to $S$; in fact:

###### Lemma 3.1

The entries of matrix $\Delta$ are of order $O(1/\sqrt{N})$.

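Proof. Under Hypothesis $H$, the covariance of the entries $\Delta_{ab}$ takes the form below, where $S_{ab}(\tau)$ denotes the $(a,b)$ entry of the covariance function at lag $\tau$:

$$\mathrm{Cov}(\Delta_{ab}, \Delta_{cd}) = \frac{1}{N^2} \sum_{n=1}^{N} \sum_{m=1}^{N} \left[ S_{ac}(n-m)\, S_{bd}(n-m) + S_{ad}(n-m)\, S_{bc}(n-m) \right],$$

and letting $\tau = n - m$, we have after some manipulation:

$$\mathrm{Cov}(\Delta_{ab}, \Delta_{cd}) = \frac{1}{N} \sum_{\tau=-(N-1)}^{N-1} \left( 1 - \frac{|\tau|}{N} \right) \left[ S_{ac}(\tau)\, S_{bd}(\tau) + S_{ad}(\tau)\, S_{bc}(\tau) \right].$$

Next, using the inequalities $0 \le 1 - |\tau|/N \le 1$ and $|S_{bd}(\tau)| \le \sqrt{S_{bb} S_{dd}}$, we have:

$$\left| \mathrm{Cov}(\Delta_{ab}, \Delta_{cd}) \right| \le \frac{1}{N} \sum_{|\tau| < N} \left[ \sqrt{S_{bb} S_{dd}}\; |S_{ac}(\tau)| + \sqrt{S_{bb} S_{cc}}\; |S_{ad}(\tau)| \right].$$

Now using the mixing condition, $\sum_{\tau} |S_{ab}(\tau)| < \infty$, we eventually obtain:

$$\left| \mathrm{Cov}(\Delta_{ab}, \Delta_{cd}) \right| = O(1/N), \tag{7}$$

which shows that $\Delta_{ab} = O(1/\sqrt{N})$.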
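The $1/\sqrt{N}$ scaling of the covariance estimation error can be checked numerically on a simple colored process. The sketch below is an illustration under assumptions that are not in the original text: a scalar ($p = 1$) stationary Gaussian AR(1) process with unit variance and coefficient $a = 0.5$, for which the covariance function is $a^{|\tau|}$ and is absolutely summable. The helper names are hypothetical.

```python
import numpy as np

def ar1_paths(a, N, trials, rng):
    """Simulate `trials` independent stationary Gaussian AR(1) paths of length N, unit variance."""
    x = np.empty((trials, N))
    x[:, 0] = rng.standard_normal(trials)        # stationary initial condition
    scale = np.sqrt(1.0 - a**2)                  # keeps the marginal variance equal to 1
    for n in range(1, N):
        x[:, n] = a * x[:, n - 1] + scale * rng.standard_normal(trials)
    return x

def delta_std(a, N, trials, rng):
    """Empirical standard deviation of Delta = S_hat - S over independent trials (S = 1)."""
    x = ar1_paths(a, N, trials, rng)
    delta = np.mean(x**2, axis=1) - 1.0
    return delta.std()

rng = np.random.default_rng(1)
s_small = delta_std(0.5, 250, 2000, rng)         # window of N = 250 samples
s_large = delta_std(0.5, 1000, 2000, rng)        # window of N = 1000 samples
ratio = s_small / s_large                        # close to 2 if Delta scales as 1/sqrt(N)
print(s_small, s_large, ratio)
```

Multiplying the window length by 4 should roughly halve the standard deviation of $\Delta$. For this particular process the variance of $\Delta$ is approximately $\frac{2}{N} \frac{1 + a^2}{1 - a^2}$, which is larger than the i.i.d. value $2/N$ and illustrates why sample dependence matters.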

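If we denote by $G$ and $\hat{G}$ the inverse of $S$ and $\hat{S}$, respectively, we have the lemma below.

###### Lemma 3.2

The inverse of $\hat{S}$ can be approximated by

$$\hat{G} = G - G \Delta G + G \Delta G \Delta G + o(1/N). \tag{8}$$

Proof. Let $E$ be the symmetric matrix $E = S^{-1/2} \Delta S^{-1/2}$. Then with this definition, $\hat{S} = S^{1/2} (I + E) S^{1/2}$. Now we know that for any matrix $E$ with spectral radius smaller than 1 the series $\sum_{k \ge 0} (-E)^k$ converges to $(I + E)^{-1}$. If we plug this series in the expression of $\hat{G}$ we get $\hat{G} = S^{-1/2} \left( I - E + E^2 + o(1/N) \right) S^{-1/2}$. Replacing $E$ by its definition eventually yields (8).

Now it is desirable to express $\hat{G}$ as a function of $\hat{S}$. If we replace $\Delta$ by $\hat{S} - S$ in (8), we obtain:

$$\hat{G} = 3 G - 3\, G \hat{S} G + G \hat{S} G \hat{S} G + o(1/N). \tag{9}$$

With this approximation, $\hat{G}$ is now a polynomial function of $\hat{S}$ of degree 2, and hence of degree 4 in $x$. We shall show that the mean of $\hat{B}_p(N)$ involves moments of $x$ up to order 8, whereas its variance involves moments up to order 16.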
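The second-order expansion of the inverse covariance can be verified numerically. The sketch below assumes the truncated Neumann series $G - G\Delta G + G\Delta G\Delta G$, with $G = S^{-1}$ and $\Delta = \hat{S} - S$, and its equivalent polynomial form $3G - 3G\hat{S}G + G\hat{S}G\hat{S}G$ obtained by substituting $\Delta = \hat{S} - S$. The matrices, the perturbation direction `D0` and the function names are arbitrary choices made for this illustration.

```python
import numpy as np

def second_order_inverse(S_hat, G):
    """Second-order approximation 3G - 3 G S_hat G + G S_hat G S_hat G of inv(S_hat)."""
    GS = G @ S_hat
    return 3.0 * G - 3.0 * GS @ G + GS @ GS @ G

rng = np.random.default_rng(2)
p = 3
B = rng.standard_normal((p, p))
S = B @ B.T + p * np.eye(p)          # a well-conditioned "true" covariance
G = np.linalg.inv(S)
D0 = rng.standard_normal((p, p))
D0 = (D0 + D0.T) / 2.0               # fixed symmetric perturbation direction

def approximation_error(eps):
    """Frobenius error of the approximation when S_hat = S + eps * D0."""
    S_hat = S + eps * D0
    return np.linalg.norm(second_order_inverse(S_hat, G) - np.linalg.inv(S_hat))

err_coarse = approximation_error(1e-2)
err_fine = approximation_error(5e-3)
ratio = err_coarse / err_fine        # close to 8 if the error is cubic in the perturbation
print(err_coarse, err_fine, ratio)
```

Halving the perturbation should divide the error by about 8, which is consistent with a neglected term of third order in $\Delta$, hence of order $N^{-3/2}$ when the entries of $\Delta$ are $O(1/\sqrt{N})$.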

###### Lemma 3.3

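Denote $A_{ij} = x(i)^{T} G\, x(j)$. Then:

$$\hat{B}_p(N) = \frac{6}{N} \sum_{n} A_{nn}^2 - \frac{8}{N^2} \sum_{n} \sum_{i} A_{nn} A_{ni}^2 + \frac{1}{N^3} \sum_{n} \sum_{i} \sum_{j} A_{ni}^2 A_{nj}^2 + \frac{2}{N^3} \sum_{n} \sum_{i} \sum_{j} A_{nn} A_{ni} A_{ij} A_{jn} + o(1/N). \tag{10}$$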
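As a sanity check, the expansion of the estimated kurtosis can be compared with its direct computation. The sketch below assumes that $A_{ij} = x(i)^T G\, x(j)$ with $G = S^{-1}$, and uses the four-term form obtained by squaring the second-order expansion of $x(n)^T \hat{G} x(n)$ and discarding terms of third order in $\Delta$; the resulting coefficients $6$, $-8$, $1$ and $2$ are a reconstruction made here and should be checked against the paper. The sums over $i$ and $j$ are evaluated through $p \times p$ matrix products rather than explicit $N \times N$ arrays.

```python
import numpy as np

def kurtosis_direct(X):
    """Sample multivariate kurtosis using the exact inverse of the sample covariance."""
    N = X.shape[0]
    S_hat = X.T @ X / N
    q = np.sum(X * np.linalg.solve(S_hat, X.T).T, axis=1)
    return np.mean(q**2)

def kurtosis_expanded(X, G):
    """Four-term expansion using a fixed precision matrix G (assumed form, see lead-in)."""
    N = X.shape[0]
    S_hat = X.T @ X / N
    XG = X @ G                                              # rows x(n)^T G (G symmetric)
    a = np.sum(XG * X, axis=1)                              # A_nn
    r = np.sum((XG @ S_hat) * XG, axis=1)                   # (1/N) sum_i A_ni^2
    c = np.sum((XG @ S_hat @ G @ S_hat) * XG, axis=1)       # (1/N^2) sum_ij A_ni A_ij A_jn
    return np.mean(6.0 * a**2 - 8.0 * a * r + r**2 + 2.0 * a * c)

rng = np.random.default_rng(4)
S = np.array([[2.0, 0.6], [0.6, 1.0]])                      # true covariance (p = 2)
X = rng.multivariate_normal(np.zeros(2), S, size=20000)
G = np.linalg.inv(S)
direct = kurtosis_direct(X)
expanded = kurtosis_expanded(X, G)
print(direct, expanded)
```

The two values should agree closely for large $N$, since the neglected terms are of order $N^{-3/2}$. When `G` is replaced by the exact inverse of the sample covariance, the four terms collapse to the direct statistic, because $6 - 8 + 1 + 2 = 1$.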

### 3.2 Additional notations and computational issues

When computing the mean and variance of $\hat{B}_p(N)$ given in (10), higher order moments of the multivariate random variable $x$ will arise. Under the normal (null) hypothesis, these moments are expressed as functions of second order moments only. To keep notations reasonably concise, it is proposed to use McCullagh's bracket notation (McCullagh, 1987), briefly recalled in Appendix 11.1. Furthermore, for all moments of order higher than $p$, some components appear multiple times; counting the number of identical terms in the expansion of the higher moments is a tedious task. All the moment expansions that are necessary for the derivations presented in this paper are developed in Appendix 11.3.

In order to keep notations concise while making explicit the role of both coordinate (or space) indices and time indices, let the moments of $x$, whose components are $x_a$, be noted

$$\mu_{ab}^{ij} = \mathbb{E}\{ x_a(i)\, x_b(j) \}, \qquad \mu_{abc}^{ijk} = \mathbb{E}\{ x_a(i)\, x_b(j)\, x_c(k) \}, \tag{11}$$

and so forth for higher orders. It should be emphasized that different time and coordinate indices appear here, as the components are assumed to be colored (time correlated) and mutually dependent (spatially correlated).
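To illustrate how higher-order moments reduce to second-order moments under the Gaussian hypothesis, the following Python sketch applies Isserlis' theorem (sum over all pairings of the factors). It is an illustration only: the representation of each factor as a `(coordinate, time)` tuple, the helper names, and the example covariance $0.5^{|\tau|}$ are assumptions made here, and the brute-force enumeration is not the bookkeeping method used in the appendices.

```python
def pairings(items):
    """Yield all ways of splitting an even-length list into unordered pairs."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for k, partner in enumerate(rest):
        remaining = rest[:k] + rest[k + 1:]
        for tail in pairings(remaining):
            yield [(first, partner)] + tail

def gaussian_moment(factors, second_moment):
    """E{prod x_a(i)} for zero-mean jointly Gaussian factors, via Isserlis' theorem.

    `factors` is a list of (coordinate, time) tuples and `second_moment(f, g)`
    returns E{x_a(i) x_b(j)} for f = (a, i) and g = (b, j).
    """
    if len(factors) % 2:
        return 0.0                       # odd-order moments vanish
    total = 0.0
    for pairing in pairings(list(factors)):
        term = 1.0
        for f, g in pairing:
            term *= second_moment(f, g)
        total += term
    return total

def ar_cov(f, g):
    """Example second-order moment: scalar process with covariance 0.5**|tau| (assumption)."""
    return 0.5 ** abs(f[1] - g[1]) if f[0] == g[0] else 0.0

m4_same_time = gaussian_moment([(0, 0)] * 4, ar_cov)                     # E{x(0)^4}
m4_two_times = gaussian_moment([(0, 0), (0, 0), (0, 1), (0, 1)], ar_cov)  # E{x(0)^2 x(1)^2}
n_terms_order8 = sum(1 for _ in pairings(list(range(8))))
print(m4_same_time, m4_two_times, n_terms_order8)
```

A moment of order $2k$ expands into $(2k-1)!!$ products of second-order moments, that is 3 terms at order 4, 105 at order 8 and 2,027,025 at order 16, which is why grouping identical terms with a compact notation becomes necessary.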

Computation of the mean and variance of defined by equation (10) involves the computation of moments of order noted whose generic expression is

or equivalently

(12) |

In the above equation, the -order moment has superscripts indicating the time indices involved, whereas the subscripts indicate the coordinate (or space) indices.

Although general, the above formulation may take simpler or more explicit forms in practice. The detailed methodology for computing the expressions of the mean and variance of the test statistic as functions of second order moments is deferred to Appendix 11.2. The resulting expressions of Mardia's statistics are given and discussed in the sections to come.

## 4 Expression of the mean of the test statistic

According to Equation (10), we have four types of terms. The goal of this section is to provide the expectation of each of these terms.

###### Lemma 4.1

With the definition of given in Lemma 3.3, we have:

(13) | |||||

(14) | |||||

(15) | |||||

(16) |

###### Proposition 4.1

The mean of the test statistic then follows from (10).

## 5 Expression of the variance of the test statistic

From Lemma 3.3, we can also state which moments of $x$ will be required in the expression of the variance of the test statistic.

###### Lemma 5.1

Then, as in Proposition 4.1, by using the results of Appendix 11.3, the moments could in turn be expressed as a function of second order moments. For readability, we do not substitute these values here.

###### Proposition 5.1

(31) | |||||

## 6 Mean and variance of the test statistic in the scalar case

## 7 Mean and variance of the test statistic in the bivariate case

In the bivariate case, expressions become immediately more complicated, but we can still write them explicitly, as reported below. We remind that .

(34) |

with

(35) | |||||

< |
