Let $X_1, X_2, \ldots$ be a sequence of identically distributed random variables (rvs), and denote by $F$ the common univariate marginal distribution function. For any , set . For simplicity, we set , that is . The rv is a record if . Such an event is coded by the indicator function . When are independent, many results on records are already known (e.g., gal87; arnbn98; resn08, Ch. 4; barakat2017; falkkp2018). In the multivariate case various definitions of records are possible and have been investigated both in the past and more recently; see, e.g., golres89, hashhue05, hwang2010, domfalzot18, to name a few. In this work we consider complete records; these are random vectors which are univariate records in each component. Precisely, let be a strictly stationary sequence of -dimensional random vectors (rvs) . Let be the common joint distribution function of with margins , . The rv is a complete record if
where the maximum is computed componentwise. We denote the rv coding the occurrence of a complete record at time by .
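As an illustration of the two definitions above, the following minimal Python sketch computes record indicators for a univariate sample and complete-record indicators for a sample of vectors. The function names `record_indicators` and `complete_record_indicators` are hypothetical, and the convention that the first observation always counts as a record is an assumption made here for simplicity.

```python
import numpy as np


def record_indicators(x):
    """Return 1 at position n if x[n] strictly exceeds all earlier values.

    Assumption: the first observation is counted as a record.
    """
    x = np.asarray(x, dtype=float)
    # Running maximum of the strictly earlier observations (-inf before the first).
    prev_max = np.concatenate(([-np.inf], np.maximum.accumulate(x)[:-1]))
    return (x > prev_max).astype(int)


def complete_record_indicators(X):
    """X has shape (n, d). Return 1 at row n if every component of X[n]
    strictly exceeds the componentwise maximum of the earlier rows."""
    X = np.asarray(X, dtype=float)
    per_component = np.column_stack(
        [record_indicators(X[:, j]) for j in range(X.shape[1])]
    )
    return per_component.all(axis=1).astype(int)


uni = record_indicators([1, 3, 2, 5, 5, 7])
multi = complete_record_indicators([[1, 1], [2, 0], [3, 2], [4, 5]])
```

A tie with the current maximum is not counted as a record here, which is immaterial when the marginal distribution is continuous.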
Except for haiman1987 and haiman1998, as far as we know, most of the available results on records concern sequences of independent random variables or vectors. In the present work we derive some new results on the records of a stationary sequence of dependent random variables and dependent random vectors, under appropriate conditions on the dependence structure.
First we consider a univariate second-order stationary Gaussian process with zero mean and unit variance. This means that for every , , and the autocovariance of the process is translation-invariant, depending only on the time difference, i.e. for every , , where is a function only of the separation and for every , . We derive the probability that a record at time , say , takes place, and the distribution of , given that it is a record. Furthermore, we derive the joint distribution of the arrival time process of records and, more specifically, the distribution of the increments between the first and second records and between the second and third records. We compute the expected number of records which, depending on the type of correlation structure of the Gaussian process, can be finite or infinite. We also focus on joint records and derive the probability that two consecutive or non-consecutive records at times and , say and , take place, as well as the joint distribution of , given that they are both records.
We highlight that many of our findings, such as the probability that the records and take place and the arrival time of the -th record, are independent of the marginal distribution function , provided that is continuous. As a consequence, the results actually hold for second-order stationary sequences with Gaussian copulas. In contrast, the distribution of a record (two records), conditional on it being a record (their being records), does depend on .
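The invariance with respect to a continuous marginal distribution can be checked numerically: a strictly increasing transformation of the observations changes the margins but not which observations are records. The sketch below is only an illustration under stated assumptions. It uses a Gaussian AR(1) sequence as one example of a stationary Gaussian sequence with unit variance, and the exponential map as one example of a strictly increasing transformation (giving lognormal margins). The helper name `record_indicators` and the parameter choices are hypothetical.

```python
import numpy as np


def record_indicators(x):
    """1 where x[n] strictly exceeds all earlier values (first value counts)."""
    x = np.asarray(x, dtype=float)
    prev_max = np.concatenate(([-np.inf], np.maximum.accumulate(x)[:-1]))
    return (x > prev_max).astype(int)


rng = np.random.default_rng(0)
n, rho = 200, 0.6  # illustrative length and lag-one correlation

# Stationary AR(1) with standard Gaussian margins.
x = np.empty(n)
x[0] = rng.standard_normal()
for t in range(1, n):
    x[t] = rho * x[t - 1] + np.sqrt(1 - rho**2) * rng.standard_normal()

# Same Gaussian copula, lognormal margins.
y = np.exp(x)

same_records = np.array_equal(record_indicators(x), record_indicators(y))
```

The values attained at the record times differ between `x` and `y`, in line with the remark that the distribution of a record does depend on the margin.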
Next we consider a strictly stationary process satisfying some mild conditions on the tail behavior of the common marginal distribution function and the long-range dependence of the extremes of the process. More specifically, it is assumed that is attracted by the so-called Generalized Extreme-Value family of distributions, and that maxima on separated enough intervals within the time span are approximately independent. Within this setting we derive the probability that is a record, the distribution of (being a record), and the expected number of records.
We complete the work by considering a zero-mean, unit-variance multivariate second-order stationary Gaussian process. We derive the probability that a complete record at time occurs, and we compute the distribution of (being a record), as well as the probability that two complete records at times and occur, and the joint distribution of (being records).
The paper is organized as follows. In Section 2.1 we introduce some notation used throughout the paper and we briefly review some basic concepts on the multivariate closed skew-normal distribution. In Section 2.2 we present our main results on records for a univariate second-order stationary Gaussian process. In Section 2.3 we provide the asymptotic probability and distribution function of a record at time for a strictly stationary process that satisfies some appropriate conditions. Finally, in Section 3 we extend some of the results derived in Section 2.3 to the case of multivariate second-order stationary Gaussian processes.
2 Univariate Case
2.1 Preliminary results and notation
Throughout the paper we use the following notation. The symbol , , means an -dimensional random vector that follows a multivariate Gaussian distribution with mean and positive-definite covariance matrix , and and with . When and , where is the identity matrix, we write for simplicity.
We indicate with () a matrix of dimension whose elements are all equal to one (zero). We omit the subscripts when the dimensions of the matrices are clear from the context.
We introduce the notion of a multivariate closed skew-normal (CSN) random vector and we do so by using the so-called conditioning representation (genton2004, Ch. 2). Let be independent of , where , and . Let , then
where . Define equal to , under the condition that , denoted by , where . The -dimensional random vector follows a multivariate closed skew-normal distribution, in symbols , whose pdf is, for all ,
We denote the cdf of by . When , and , we omit them among the parameters for simplicity and we write and instead. We recall that the closed skew-normal distribution is also known in the literature as the unified multivariate skew-normal distribution, which simply uses a different parametrization (e.g., Ch. 7.1.2 in azzalini2013skew). The exposition of our results benefits from the parametrization used by the closed skew-normal distribution.
We recall that if then
see azzalini2010. Furthermore, for and then,
where , and , (see Ch. 2 in genton2004 for details).
2.2 Records of dependent univariate Gaussian sequences
Let be a second-order stationary Gaussian sequence of dependent rvs. Without loss of generality, assume for simplicity that , for every . Throughout the paper we will refer to such a process as a stationary standard Gaussian (SSG) sequence. For any , let and identify the -dimensional and -dimensional subvector partition such that , with corresponding partition of the parameter . By we denote the number of elements of a set .
Our results rely on the following well-known result on the conditional distribution derived from a joint Gaussian distribution. Precisely, let with corresponding partition of the parameters and ; then ander84, Theorem 2.5.1, establishes that the conditional distribution of given that , is for all ,
Furthermore, we denote the related correlation matrix by
where . For any , when we simplify the notation writing and . When or we further simplify the notation by and .
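The conditional-distribution result recalled above (Theorem 2.5.1 in ander84) can be sketched numerically as follows. For a Gaussian vector partitioned into two blocks, the conditional law of the first block given the second is Gaussian with mean $\mu_1 + \Sigma_{12}\Sigma_{22}^{-1}(x_2 - \mu_2)$ and covariance $\Sigma_{11} - \Sigma_{12}\Sigma_{22}^{-1}\Sigma_{21}$. The symbols $\mu_1, \mu_2, \Sigma_{11}, \Sigma_{12}, \Sigma_{22}$ and the function name `conditional_gaussian` are generic notation assumed here for illustration and may differ from the notation used in the rest of the paper.

```python
import numpy as np


def conditional_gaussian(mu, Sigma, idx1, idx2, x2):
    """Mean and covariance of X[idx1] given X[idx2] = x2 for X ~ N(mu, Sigma)."""
    mu = np.asarray(mu, dtype=float)
    Sigma = np.asarray(Sigma, dtype=float)
    mu1, mu2 = mu[idx1], mu[idx2]
    S11 = Sigma[np.ix_(idx1, idx1)]
    S12 = Sigma[np.ix_(idx1, idx2)]
    S22 = Sigma[np.ix_(idx2, idx2)]
    # K = S12 @ inv(S22), computed with a linear solve (S22 is symmetric).
    K = np.linalg.solve(S22, S12.T).T
    cond_mean = mu1 + K @ (np.asarray(x2, dtype=float) - mu2)
    cond_cov = S11 - K @ S12.T
    return cond_mean, cond_cov


# Standard bivariate Gaussian with correlation 0.5, conditioning on X_2 = 2.
m, c = conditional_gaussian([0.0, 0.0], [[1.0, 0.5], [0.5, 1.0]], [0], [1], [2.0])
```

In the bivariate example the conditional mean is $\rho x_2$ and the conditional variance is $1 - \rho^2$.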
In our first result we compute the probability that is a record together with its distribution. It is well known that this probability equals $1/n$ for the $n$-th observation in the case of independent rvs with identical continuous df (see e.g., gal87), and that the distribution of , given that it is a record, equals that of the largest observation among the first $n$ (falkkp2018).
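The independent benchmark can be checked by simulation. The sketch below is a Monte Carlo illustration, not the closed-form result derived in this section. It estimates the probability that the last of $n$ observations is a record, first for independent standard Gaussian rvs and then for a stationary Gaussian AR(1) sequence with unit variance. The AR(1) model, the function names and the parameter values are assumptions chosen for illustration.

```python
import numpy as np


def simulate_ar1(n, rho, reps, rng):
    """Simulate `reps` stationary AR(1) paths of length n with N(0, 1) margins.

    rho = 0 gives independent standard Gaussian observations.
    """
    x = np.empty((reps, n))
    x[:, 0] = rng.standard_normal(reps)
    for t in range(1, n):
        x[:, t] = rho * x[:, t - 1] + np.sqrt(1 - rho**2) * rng.standard_normal(reps)
    return x


def record_prob(paths):
    """Fraction of paths whose last value strictly exceeds all earlier values."""
    return np.mean(paths[:, -1] > paths[:, :-1].max(axis=1))


rng = np.random.default_rng(1)
n, reps = 5, 20000
p_iid = record_prob(simulate_ar1(n, 0.0, reps, rng))  # close to 1/n in the iid case
p_dep = record_prob(simulate_ar1(n, 0.8, reps, rng))  # dependent case, no closed form assumed here
```

With 20000 replications the standard error of `p_iid` is about 0.003, so the estimate should fall close to $1/5$.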
Let be a SSG sequence of rvs. For every , let , . Then, the probability that is a record and the distribution of , given that it is a record, are equal to
where is a variance-covariance matrix whose entries of the associated correlation matrix are
and is a correlation matrix with entries
The probability that is a record is
To obtain the second line we used the formula in (5), which leads to , where , and this can be seen as independent of . From the third to the fourth line we used Lemma 7.1 in azzalini1996multivariate. With similar steps, we obtain the distribution for the record ,
Let be the arrival time of the -th record.
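For a concrete sample path, arrival times of records and the increments between them can be read off the record indicators. The following minimal sketch uses 1-based time indices and assumes that the first observation is the first record. The function names are hypothetical.

```python
import numpy as np


def record_times(x):
    """Return the 1-based times at which records occur in the sequence x.

    Assumption: the first observation is counted as the first record.
    """
    x = np.asarray(x, dtype=float)
    prev_max = np.concatenate(([-np.inf], np.maximum.accumulate(x)[:-1]))
    return np.flatnonzero(x > prev_max) + 1


def record_increments(x):
    """Waiting times between successive records."""
    return np.diff(record_times(x))


times = record_times([1, 3, 2, 5, 5, 7])
gaps = record_increments([1, 3, 2, 5, 5, 7])
```

In this example the records occur at times 1, 2, 4 and 6, so the increment between the second and third records is 2.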
Let be the arrival time process of records. Let where and . Set . Then,
where is given in (12). By standardizing the random vector , we obtain
In the next result we establish the distribution of the arrival time of the second record as well as that of the increment .
Let be a SSG sequence of rvs. Let with . Assume that for , as and as . For , the distribution of the arrival time of the second record is
where and are defined similarly to (7). Furthermore, for every , the distribution of the increment is
where is an -dimensional vector.
When we have
For we have
Let be a zero-mean, unit-variance Gaussian sequence with variance-covariance matrix . Set . Clearly . We recall that for every . By the Fréchet inequalities we have that
For we derive the following upper bound . Precisely,
where and where is a bivariate Gaussian cdf with correlation that is given in (6). In the third line we used Chebyshev's inequality. Setting , we rewrite as
Now, when we obtain and therefore and as a consequence the term as . We rewrite the term as