Recent developments in massive data processing lead us to think differently about certain problems in Statistics. In particular, it is of interest to construct statistics as functions of data blocks and to study inference for them. On the other hand, in some applications (e.g., in extremes, see Davis, R.A. and Mikosch, T. (2008), and in astronomy, see Long, J.P. and De Sousa, R.S. (2018)) only a small part of the data is relevant for the estimates, and it is moreover hidden in a large mass of “raw data”. This leads to the idea of clusters of data deemed “relevant” (or of extremal type, in the context of extreme value theory), where we say that two relevant values belong to two different clusters if they belong to two different blocks. Moreover, these relevant values lie in the cores of the blocks, where the core of a block is defined as the smallest sub-block that contains all the relevant values of the block, if any exist.
In the context of this work, we consider functionals which act on these clusters of relevant values and we develop useful lemmas in order to simplify the essential step to establish a Lindeberg central limit theorem for these “cluster functionals” on stationary random fields, inspired by the works of Bardet, J-M., Doukhan, P., Lang, G. and Ragache, N. (2007), Drees, H. and Rootzén, H. (2010) and Gómez-García, J.G. (2018).
Precisely, let and let us denote , and , where . Let be a valued stationary random field and let be the corresponding normalized random observations from the random field , defined by for some measurable functions , such that
where is a non-degenerate distribution and is the relevance set. Here, denotes the usual indicator function of a subset and the tendency means that for all . In particular, the convergence (1) is fulfilled if the random vector is regularly varying. For details about regularly varying vectors one can refer to Resnick, S.I. (1986, 1987).
For each , let be an integer value such that and . We define the blocks (or simply blocks) of by
where . We have thus complete blocks , and no more than incomplete ones which we will ignore. Besides, as usual, denotes the Cartesian product and, by stationarity, we will denote as a generic block of .
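The block construction can be illustrated with a minimal sketch, assuming the observations are stored in a d-dimensional NumPy array. The function name `complete_blocks` and its arguments are hypothetical, and indices are 0-based here, whereas the paper indexes blocks from 1. As in the text, incomplete blocks at the boundary are ignored.

```python
import numpy as np
from itertools import product


def complete_blocks(field, block_shape):
    """Return the complete blocks of `field`, keyed by their block index.

    `field` is a d-dimensional array of observations and `block_shape`
    gives the block side length along each axis (an assumption of this
    sketch; the paper lets these lengths depend on the sample size).
    """
    field = np.asarray(field)
    if field.ndim != len(block_shape):
        raise ValueError("block_shape must have one entry per axis of field")
    # Number of complete blocks along each axis; the remainder is dropped.
    counts = [n // r for n, r in zip(field.shape, block_shape)]
    blocks = {}
    for j in product(*(range(m) for m in counts)):
        # Block j covers indices j_i * r_i, ..., (j_i + 1) * r_i - 1 on axis i.
        slices = tuple(
            slice(ji * r, (ji + 1) * r) for ji, r in zip(j, block_shape)
        )
        blocks[j] = field[slices]
    return blocks
```

For a 5 x 7 array and blocks of shape (2, 3), this returns 2 x 2 = 4 complete blocks and discards the last row and the last column.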
We are now going to formally define the core of a block, cluster functional and the empirical process of cluster functionals, which are generalizations of the definitions of Drees, H. and Rootzén, H. (2010) to blocks.
Let be a block. The core of the block (w.r.t. the relevance set ) is defined as
where, for each , and are defined as
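As an illustration of this definition, the following minimal sketch extracts the core of a block as the smallest sub-block containing all relevant values. It assumes scalar-valued observations and a user-supplied predicate `is_relevant` standing in for membership in the relevance set; both the function name `block_core` and the choice of returning `None` when no value is relevant are conventions of this sketch, not of the paper.

```python
import numpy as np


def block_core(block, is_relevant):
    """Return the smallest sub-block of `block` containing all relevant values.

    `is_relevant` maps an array to a boolean array of the same shape
    (a hypothetical stand-in for the indicator of the relevance set).
    Returns None when the block contains no relevant value.
    """
    block = np.asarray(block)
    mask = np.asarray(is_relevant(block), dtype=bool)
    if not mask.any():
        return None
    # Along each axis, keep the range from the first to the last index
    # at which a relevant value occurs.
    idx = np.nonzero(mask)
    slices = tuple(slice(int(ax.min()), int(ax.max()) + 1) for ax in idx)
    return block[slices]
```

Values inside the core that are not themselves relevant are kept, since the core is a sub-block and not the set of relevant values.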
Let be a measurable subspace of for some such that . Let be the set of valued blocks (or arrays) of size , with . Consider now the set
which is equipped with the σ-field induced by the Borel σ-fields on , for . A cluster functional is a measurable map such that
Let be a class of cluster functionals and let be the family of blocks of size defined in (2). The empirical process of cluster functionals in , is the process defined by
where and with denoting the relevance set.
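To fix ideas, here is a minimal sketch of how one value of such an empirical process could be computed. The defining display is not reproduced in this text, so the form used below, a centered sum over blocks divided by the square root of (total sample size times the probability that an observation is relevant), is an assumption modeled on the time-series definition of Drees, H. and Rootzén, H. (2010). The names `empirical_cluster_process`, `mean_f`, `n_total` and `v_n` are hypothetical, and the expectation of the functional is taken as a known input.

```python
import math


def empirical_cluster_process(blocks, f, mean_f, n_total, v_n):
    """Evaluate a centered, normalized sum of a cluster functional over blocks.

    blocks  : iterable of blocks
    f       : cluster functional, mapping a block to a real number
    mean_f  : assumed common expectation of f over a generic block
    n_total : total number of observations (product of the side lengths)
    v_n     : assumed probability that one observation is relevant
    """
    if n_total <= 0 or v_n <= 0:
        raise ValueError("n_total and v_n must be positive")
    # Center each block contribution by the (assumed known) expectation.
    centered_sum = sum(f(b) - mean_f for b in blocks)
    # Normalize by sqrt(n_total * v_n), an assumption of this sketch.
    return centered_sum / math.sqrt(n_total * v_n)
```

A simple functional to try is the number of relevant values in a block, which depends on the block only through its core.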
Under the Lindeberg condition and the convergence to zero of a sequence that summarizes the dependence between the blocks of values of the random field, we prove that the finite-dimensional marginal distributions (fidis) of the empirical process (4) converge to a Gaussian process. The proof basically consists of the “Lindeberg method” as in Bardet, J-M., Doukhan, P., Lang, G. and Ragache, N. (2007), but adapted here to stationary random fields.
Regarding the condition , as , this can be fulfilled if the random field has short-range dependence properties, e.g., if it is weakly dependent in the sense of Doukhan, P. and Louhichi, S. (1999), under suitable conditions on the decay rates of the weak-dependence coefficients. These rates are calculated in Gómez-García, J.G. (2018) in the context of extreme clusters of time series.
The rest of the paper consists of two sections. In Section 2, we provide useful lemmas in order to establish the central limit theorem for the fidis of the cluster functionals empirical process (4). In Section 3 we introduce the iso-extremogram (a correlogram for extreme values of space-time processes) and we use the CLT of Section 2 in order to show that, under additional suitable conditions, the iso-extremogram estimator has asymptotically a Gaussian behavior.
In this section we provide lemmas that notably simplify the essential step in establishing a central limit theorem for the fidis of the empirical process defined in (4). The proofs rely on the same techniques that Bardet, J-M., Doukhan, P., Lang, G. and Ragache, N. (2007) used in the proofs of their dependent and independent Lindeberg lemmas, generalized here to random fields.
In order to establish the CLT, firstly consider the following basic assumption:
The vector is such that for each .
Besides, denoting , and , as .
Secondly, consider the following essential convergence assumptions:
, , ;
Consider now the random blocks , with defined in (2). For each tuple of cluster functionals and each , we define the random vector:
Without loss of generality and in order to simplify writing, we will consider in the rest of this section.
Let be a sequence of zero-mean independent -valued random variables, independent of the sequence, such that , for all . Denote by the set of bounded functions with bounded and continuous partial derivatives up to order . For and , define
The following assumption will allow us to state the Lindeberg lemmas, under independence and under dependence, in a useful and simplified form.
There exists such that, for all , for all and every tuple of cluster functionals . Moreover, denote
Lemma 1 (Lindeberg under independence).
Suppose that the blocks are independent and that the random variables defined in (5) satisfy Assumption (Lin’). Then, for all :
Proof. First, notice that
Besides, we set the convention , if either or .
Now, we follow part of the proof of Lemma 1 in Bardet, J-M., Doukhan, P., Lang, G. and Ragache, N. (2007).
Let . From Taylor’s formula, there exist vectors such that:
where, for , stands for the value of the symmetric linear form from of at . Moreover, denote
Thus, for , there exist some suitable vectors such that
by using the Taylor approximation of order , and
by using the Taylor approximation of order .
where (2) is given by using the inequality , with and .
Substituting and for and in the preceding inequality (2) and taking expectations, we obtain a bound for . Indeed,
because is independent of and , and because and for all .
On the other hand, using Jensen’s inequality, we derive , and because is a Gaussian random variable with the same covariance as .
Besides, for ,
because, for all , .
As a consequence, from Assumption (Lin’), .
The proof of this remark for general independent random vectors is given in (Bardet, J-M., Doukhan, P., Lang, G. and Ragache, N., 2007, p.165).
Observe that the assumptions (Lin) and (Cov) imply that and that , respectively. Therefore, if the blocks are independent and if the assumptions (Lin) and (Cov) hold, then from Lemma 1 and Remark 2.1, the fidis of the empirical process of cluster functionals converge to the fidis of a Gaussian process with covariance function .
For the dependent case, we need some additional notation:
Let , for all . We set for any and any . For each , , and ; we define
Lemma 2 (Dependent Lindeberg lemma).
Suppose that the r.v.’s defined in (5) satisfy Assumption (Lin’). Consider the special case of complex exponential functions with . Then, for each and each tuple of cluster functionals, the following inequality holds:
Consider an array of independent random variables satisfying Assumption (Lin’) and such that is independent of and . Moreover, assume that has the same distribution as for .
Then, using the same decomposition (7) in the proof of the previous lemma, one can also write,
Then, from the previous lemma, the second term of the RHS of the inequality (14) is bounded by
For the first term of the RHS of the inequality (14), first notice that for a valued random vector independent from ,
because , where is the covariance matrix of the vector , for . For or , recall that . In this case, we also set .
The previous lemma, together with Remark 2.1, yields the following theorem.
Theorem 3 (CLT for cluster functionals on random fields).
Suppose that the basic assumption (Bas) holds and that the assumptions (Lin) and (Cov) are satisfied. Then, if for each , converges to zero as , for all and every tuple of cluster functionals, the fidis of the empirical process of cluster functionals converge to the fidis of a Gaussian process with covariance function defined in (Cov).
Proof. The assumptions (Lin) and (Cov) imply that, as , and