1. Introduction and Preliminaries
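Let $X = (X_1,\dots,X_d)$ be a random vector (rv), whose distribution function (df) $F$ is in the domain of attraction of a multivariate non-degenerate df $G$, denoted by $F \in \mathcal{D}(G)$, i.e., there exist vectors $a_n > 0 \in \mathbb{R}^d$, $b_n \in \mathbb{R}^d$, $n \in \mathbb{N}$, such that
$$F^n(a_n x + b_n) \to_{n\to\infty} G(x), \qquad x \in \mathbb{R}^d. \tag{1}$$
All operations on vectors $x, y$ such as $x + y$, $xy$, etc. are meant componentwise.
The limit df $G$ is necessarily max-stable, i.e., there exist vectors $a_n > 0 \in \mathbb{R}^d$, $b_n \in \mathbb{R}^d$, $n \in \mathbb{N}$, such that
$$G^n(a_n x + b_n) = G(x), \qquad x \in \mathbb{R}^d.$$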
A characterization of multivariate max-stable df was established by de Haan and Resnick (1977) and Vatan (1985); for an introduction to multivariate extreme value theory see, e.g., Falk et al. (2011, Chapter 4).
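The univariate margins $G_j$, $1 \le j \le d$, of a multivariate max-stable df $G$ belong necessarily to the family of univariate max-stable df, which is a parametric family $\{G_\alpha : \alpha \in \mathbb{R}\}$ with
$$G_\alpha(x) = \begin{cases} \exp(-(-x)^\alpha), \quad x \le 0, & \text{for } \alpha > 0, \\ \exp(-x^{\alpha}), \quad x > 0, & \text{for } \alpha < 0, \end{cases}$$
and
$$G_0(x) = \exp(-e^{-x}), \qquad x \in \mathbb{R}, \tag{2}$$
being the family of reverse Weibull, Fréchet and Gumbel distributions. Note that $G_1(x) = \exp(x)$, $x \le 0$, is the standard negative exponential df. We refer, e.g., to Galambos (1987, Section 2.3) or Resnick (1987, Chapter 1).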
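The parametric family can be evaluated directly. The following is a minimal sketch, assuming the usual parametrization $G_\alpha(x) = \exp(-(-x)^\alpha)$ for $\alpha > 0$, $G_\alpha(x) = \exp(-x^\alpha)$ for $\alpha < 0$ and $G_0(x) = \exp(-e^{-x})$; the function name `max_stable_cdf` is a hypothetical choice made for this illustration.

```python
import math

def max_stable_cdf(x: float, alpha: float) -> float:
    """G_alpha(x): reverse Weibull (alpha > 0), Frechet (alpha < 0), Gumbel (alpha == 0)."""
    if alpha > 0:
        return math.exp(-((-x) ** alpha)) if x <= 0 else 1.0
    if alpha < 0:
        return math.exp(-(x ** alpha)) if x > 0 else 0.0
    return math.exp(-math.exp(-x))

# G_1 is the standard negative exponential df exp(x), x <= 0.
print(max_stable_cdf(-0.5, 1.0), math.exp(-0.5))

# Max-stability of the reverse Weibull G_2: G^n(a_n x) = G(x) with a_n = n^(-1/2), b_n = 0.
n, x = 7, -0.8
print(max_stable_cdf(n ** -0.5 * x, 2.0) ** n, max_stable_cdf(x, 2.0))
```

Both printed pairs agree up to floating point rounding; the norming constant $a_n = n^{-1/2}$ is specific to the case $\alpha = 2$ chosen here.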
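By Sklar's theorem, there exists a rv $U = (U_1,\dots,U_d)$ with the property that each component $U_j$ follows the uniform distribution on $(0,1)$, such that
$$X =_D \big(F_1^{-1}(U_1),\dots,F_d^{-1}(U_d)\big),$$
where $F_j$ is the df of $X_j$ and
$$F_j^{-1}(u) = \inf\{t \in \mathbb{R} : F_j(t) \ge u\}, \qquad u \in (0,1),$$
is the common generalized inverse or quantile function of $F_j$, $1 \le j \le d$. By $=_D$ we denote equality in distribution.
The rv $U$, therefore, follows a copula, say $C_F$. If $F$ is continuous, then the copula $C_F$ is uniquely determined and given by $C_F(u) = F\big(F_1^{-1}(u_1),\dots,F_d^{-1}(u_d)\big)$, $u = (u_1,\dots,u_d) \in (0,1)^d$.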
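Theorem 1.1.
The df $F$ satisfies $F \in \mathcal{D}(G)$ iff this is true for the univariate margins of $F$ together with the convergence of the copulas:
$$C_F^n\big(u^{1/n}\big) \to_{n\to\infty} C_G(u) := G\Big(\big(G_j^{-1}(u_j)\big)_{j=1}^d\Big), \tag{3}$$
$u \in (0,1)^d$, where $G_j$ denotes the $j$-th margin of $G$, $1 \le j \le d$.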
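Let $U^{(1)}, U^{(2)}, \dots$ be independent copies of the rv $U$, which follows the copula $C_F$. Denote by $C_{M^{(n)}}$ the copula of
$$M^{(n)} := \max_{1 \le i \le n} U^{(i)},$$
where the maximum is also taken componentwise. The df of $M^{(n)}$ is $C_F^n$, with univariate margins $u \mapsto u^n$ on $[0,1]$, and, thus, we have
$$C_{M^{(n)}}(u) = C_F^n\big(u^{1/n}\big), \qquad u \in (0,1)^d.$$
Therefore, condition (3) actually means pointwise convergence of the copulas
$$C_{M^{(n)}}(u) \to_{n\to\infty} C_G(u), \qquad u \in (0,1)^d,$$
where $C_G(u) = G\big((G_j^{-1}(u_j))_{j=1}^d\big)$, $u \in (0,1)^d$, is the copula of $G$. This is an extreme value copula. Note that each margin $G_j$ of $G$ is continuous, which is equivalent with the continuity of $G$ (see, e.g., Reiss (1989, Lemma 2.2.6)).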
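The convergence of the copula of componentwise maxima can be watched numerically. The sketch below evaluates $C^n(u^{1/n})$ for growing $n$; the Farlie-Gumbel-Morgenstern copula is an illustrative assumption that does not appear in the text, chosen because its limit is the independence copula $uv$.

```python
def fgm_copula(u: float, v: float, theta: float = 0.5) -> float:
    """Farlie-Gumbel-Morgenstern copula with |theta| <= 1 (illustrative choice)."""
    return u * v * (1.0 + theta * (1.0 - u) * (1.0 - v))

def copula_of_maximum(copula, u: float, v: float, n: int) -> float:
    """Copula of the componentwise maximum of n iid copies: C^n(u^(1/n), v^(1/n))."""
    return copula(u ** (1.0 / n), v ** (1.0 / n)) ** n

u, v = 0.3, 0.6
for n in (1, 10, 100, 10_000):
    print(n, copula_of_maximum(fgm_copula, u, v, n))
print("independence copula:", u * v)
```

The printed values move from the FGM value at $n = 1$ towards $uv$, which is the extreme value copula attached to this particular example.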
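Elementary arguments imply that condition (3) is equivalent with the condition
$$C_F^n\Big(1 + \frac{y}{n}\Big) \to_{n\to\infty} G^*(y), \qquad y \le 0 \in \mathbb{R}^d, \tag{4}$$
where $1 = (1,\dots,1) \in \mathbb{R}^d$ and $G^*(y) := C_G(\exp(y))$, $y \le 0 \in \mathbb{R}^d$, defines a max-stable df with standard negative exponential margins $G_j^*(y) = \exp(y)$, $y \le 0$, $1 \le j \le d$. Such a max-stable df will be called a standard one, abbreviated by SMS (standard max-stable).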
While the condition on the univariate margins in Theorem 1.1 addresses univariate extreme value theory, condition (3) on the copula $C_F$ means by the equivalent condition (4) that the copula $C_F$ is in the domain of attraction of a multivariate SMS df: $C_F \in \mathcal{D}(G^*)$ with $G^*(y) = C_G(\exp(y))$, $y \le 0 \in \mathbb{R}^d$.
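Let $C$ be an arbitrary copula on $\mathbb{R}^d$. Then condition (1) becomes
$$C^n(a_n x + b_n) \to_{n\to\infty} G(x), \qquad x \in \mathbb{R}^d,$$
where the norming constants $a_n > 0$, $b_n$ are determined by the univariate margins of $C$, i.e., the uniform distribution on $(0,1)$: With $a_n = (1/n,\dots,1/n)$, $b_n = (1,\dots,1)$, we obtain for $x \le 0$ and large $n$
$$\Big(1 + \frac{x}{n}\Big)^n \to_{n\to\infty} \exp(x).$$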
We therefore obtain the conclusion: If a copula $C$ satisfies $C \in \mathcal{D}(G)$, then the limiting df $G$ has necessarily standard negative exponential margins:
$$G_j(x) = \exp(x), \qquad x \le 0, \quad 1 \le j \le d,$$
i.e., the limiting df $G$ is necessarily an SMS df.
As a consequence we obtain that multivariate extreme value theory actually means extreme value theory for copulas.
This paper is organized as follows. In the next section we introduce $D$-norms, which turn out to be a common thread in multivariate extreme value theory. Using the concept of $D$-norms, we introduce in Section 3 generalized Pareto copulas (GPC). The characteristic property of a GPC is its excursion or exceedance stability, established in Theorem 4.1. The family of GPC together with the well-known set of univariate generalized Pareto distributions (GPD) enables the definition of multivariate GPD in Section 5. As the set of univariate GPD equals the set of univariate non-degenerate exceedance stable distributions, its extension to higher dimensions via a GPC and GPD margins is an obvious idea. $\delta$-neighborhoods of a GPC are introduced in Section 6. The normal copula is a prominent example. Among other things, we show how to simulate data that follow a copula from such a $\delta$-neighborhood. In Section 7 we show how our findings on GPC can be used to estimate exceedance probabilities above high thresholds, including confidence intervals. A case study in Section 8 on joint exceedance probabilities for air pollutants such as ozone, nitrogen dioxide, nitrogen oxide, sulphur dioxide and particulate matter completes the paper.
2. Introducing D-Norms
Theorem 2.1 (Balkema and Resnick (1977), de Haan and Resnick (1977), Pickands (1981), Vatan (1985)).
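A df $G$ on $\mathbb{R}^d$ is an SMS df iff there exists a norm $\|\cdot\|$ on $\mathbb{R}^d$ such that
$$G(x) = \exp(-\|x\|), \qquad x \le 0 \in \mathbb{R}^d.$$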
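Elementary arguments imply the following consequence.
Corollary 2.2.
A copula $C$ satisfies $C \in \mathcal{D}(G)$ iff there exists a norm $\|\cdot\|$ on $\mathbb{R}^d$ such that
$$C(u) = 1 - \|1 - u\| + o(\|1 - u\|)$$
as $u \to 1 \in \mathbb{R}^d$, uniformly for $u \in [0,1]^d$.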
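Each norm that can occur in Theorem 2.1 or Corollary 2.2 is of the form $\|x\|_D = E\big(\max_{1 \le i \le d} |x_i| Z_i\big)$, $x \in \mathbb{R}^d$, where $Z = (Z_1,\dots,Z_d)$ is a rv with $Z_i \ge 0$ and $E(Z_i) = 1$, $1 \le i \le d$. Such a norm is called $D$-norm, with generator $Z$. The additional index $D$ means dependence. $D$-norms were first mentioned in Falk et al. (2004, equation (4.25)) and more elaborated in Falk et al. (2011, Section 4.4). Examples are: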
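$\|x\|_\infty = \max_{1 \le i \le d} |x_i|$, with generator $Z = (1,\dots,1)$,
$\|x\|_1 = \sum_{i=1}^d |x_i|$, with generator $Z$ being a random permutation of the vector $(d,0,\dots,0)$,
each logistic norm $\|x\|_p = \big(\sum_{i=1}^d |x_i|^p\big)^{1/p}$, $1 < p < \infty$, with generator $Z = (Y_1,\dots,Y_d)/\Gamma(1 - 1/p)$, $Y_1,\dots,Y_d$ iid Fréchet-distributed rv with parameter $p$, where $\Gamma$ denotes the usual gamma function.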
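The three examples can be checked against the representation $\|x\|_D = E(\max_i |x_i| Z_i)$. The sketch below computes the expectation exactly for the two generators with finitely many values and by Monte Carlo for the logistic generator; all function names, the test point `x`, the value $p = 4$ and the sample size are assumptions made for this illustration.

```python
import math
import random

def d_norm_discrete(x, atoms):
    """Exact E max_i |x_i| Z_i for a generator with finitely many atoms (prob, z)."""
    return sum(p * max(abs(xi) * zi for xi, zi in zip(x, z)) for p, z in atoms)

def d_norm_mc(x, draw, n, rng):
    """Monte Carlo estimate of E max_i |x_i| Z_i from n draws of the generator."""
    total = 0.0
    for _ in range(n):
        z = draw(rng)
        total += max(abs(xi) * zi for xi, zi in zip(x, z))
    return total / n

def logistic_generator(p, d):
    """Z_i = Y_i / Gamma(1 - 1/p) with Y_i iid Frechet(p), p > 1."""
    norming = math.gamma(1.0 - 1.0 / p)
    def draw(rng):
        # Frechet(p) by inversion: Y = E^(-1/p), E standard exponential (guarded against 0)
        return [max(rng.expovariate(1.0), 1e-300) ** (-1.0 / p) / norming for _ in range(d)]
    return draw

x = [1.0, 2.0, 0.5]
d = len(x)
sup_atoms = [(1.0, [1.0] * d)]
# A random permutation of (d, 0, ..., 0) takes the value d * e_j with probability 1/d.
l1_atoms = [(1.0 / d, [float(d) if i == j else 0.0 for i in range(d)]) for j in range(d)]

print(d_norm_discrete(x, sup_atoms), max(abs(xi) for xi in x))
print(d_norm_discrete(x, l1_atoms), sum(abs(xi) for xi in x))

p = 4.0
estimate = d_norm_mc(x, logistic_generator(p, d), 200_000, random.Random(0))
print(estimate, sum(abs(xi) ** p for xi in x) ** (1.0 / p))
```

The choice $p = 4$ keeps the variance of the Fréchet draws finite, so the Monte Carlo estimate settles quickly; for $p \le 2$ the estimator still converges but far more slowly.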
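Let the rv
$$X = (X_1,\dots,X_d)$$
follow a multivariate normal distribution with mean vector zero, i.e., $E(X_i) = 0$, $1 \le i \le d$, and covariance matrix $\Sigma = (\sigma_{ij})_{1 \le i,j \le d}$. Then
$\exp(X_i)$ follows a log-normal distribution with mean $\exp(\sigma_{ii}/2)$, $1 \le i \le d$, and, thus,
$$Z = \Big(\exp\Big(X_1 - \frac{\sigma_{11}}{2}\Big),\dots,\exp\Big(X_d - \frac{\sigma_{dd}}{2}\Big)\Big)$$
is the generator of a $D$-norm, called Hüsler-Reiss $D$-norm. This norm only depends on the covariance matrix $\Sigma$ and, therefore, it is denoted by $\|\cdot\|_{HR_\Sigma}$.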
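A bivariate Monte Carlo sketch of the Hüsler-Reiss generator follows. For comparison it uses the closed form $x\,\Phi(\lambda/2 + \log(x/y)/\lambda) + y\,\Phi(\lambda/2 + \log(y/x)/\lambda)$ with $\lambda^2 = \mathrm{Var}(X_1 - X_2)$; this formula is not stated in the text and is brought in here as an assumption, as are the function names and the covariance values.

```python
import math
import random

def std_normal_cdf(t: float) -> float:
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))

def hr_norm_closed_form(x: float, y: float, lam: float) -> float:
    """Bivariate Huesler-Reiss D-norm for x, y > 0, lam^2 = Var(X_1 - X_2) > 0."""
    return (x * std_normal_cdf(lam / 2.0 + math.log(x / y) / lam)
            + y * std_normal_cdf(lam / 2.0 + math.log(y / x) / lam))

def hr_norm_mc(x, y, s11, s22, s12, n, rng):
    """Estimate E max(x Z_1, y Z_2) with Z_i = exp(X_i - s_ii / 2), X ~ N(0, Sigma)."""
    a = math.sqrt(s11)
    b = s12 / a                              # Cholesky factor of Sigma
    c = math.sqrt(s22 - b * b)
    total = 0.0
    for _ in range(n):
        n1, n2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        x1, x2 = a * n1, b * n1 + c * n2
        total += max(x * math.exp(x1 - s11 / 2.0), y * math.exp(x2 - s22 / 2.0))
    return total / n

s11, s22, s12 = 1.0, 1.0, 0.5
lam = math.sqrt(s11 + s22 - 2.0 * s12)
closed = hr_norm_closed_form(1.0, 2.0, lam)
simulated = hr_norm_mc(1.0, 2.0, s11, s22, s12, 200_000, random.Random(0))
print(closed, simulated)
```

The two numbers agree up to simulation error, and both lie between $\max(x,y)$ and $x + y$, as every $D$-norm must.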
The generator of a $D$-norm is in general not uniquely determined, even its distribution is not. Take, for example, any rv $X \ge 0$ with $E(X) = 1$. Then $Z = (X,\dots,X)$ generates the sup-norm $\|\cdot\|_\infty$. An account of the theory of $D$-norms is provided by Falk (2019).
3. Generalized Pareto Copulas
Corollary 2.2 stimulates the following idea. Choose an arbitrary $D$-norm $\|\cdot\|_D$ on $\mathbb{R}^d$ and put with $1 := (1,\dots,1) \in \mathbb{R}^d$
$$C(u) := 1 - \|1 - u\|_D, \qquad u \in [0,1]^d.$$
Each univariate margin $C_i$ of $C$, defined this way, satisfies for $u \in [0,1]$, with a generator $Z$ of $\|\cdot\|_D$,
$$C_i(u) = C(1,\dots,1,u,1,\dots,1) = 1 - \|(0,\dots,0,1-u,0,\dots,0)\|_D = 1 - (1-u)E(Z_i) = u,$$
i.e., each $C_i$ is the uniform df on $(0,1)$. But $C$ does in general not define a df, see, e.g., Falk et al. (2011, Proposition 5.1.3). We require, therefore, the expansion
$$C(u) = 1 - \|1 - u\|_D$$
only for $u$ close to $1 \in \mathbb{R}^d$, i.e., for $u \in [u_0, 1] \subset \mathbb{R}^d$ with some $0 < u_0 < 1 \in \mathbb{R}^d$. A copula $C$ with this property will be called a generalized Pareto copula (GPC). These copulas were introduced in Aulbach et al. (2012); tests of whether data are generated by a copula in a $\delta$-neighborhood of a GPC were derived in Aulbach et al. (2018), see Section 6 for the precise definition of this neighborhood. The multivariate generalized Pareto distributions defined in Section 5 show that GPC actually exist for any $D$-norm $\|\cdot\|_D$. The corresponding construction of a generalized Pareto distributed rv also provides a way to simulate data from an arbitrary GPC.
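A short numerical look at the expression $1 - \|1 - u\|_D$ shows both why its margins are uniform and why it is required only near $1$. This is a sketch with the logistic norm ($p = 2$) and the norm $\|\cdot\|_1$ as example $D$-norms; the function names are hypothetical.

```python
def p_norm(x, p):
    return sum(abs(xi) ** p for xi in x) ** (1.0 / p)

def gpc_expansion(u, norm):
    """Evaluate 1 - ||1 - u||_D; a copula value only for u close to (1, ..., 1)."""
    return 1.0 - norm([1.0 - ui for ui in u])

def l1(x):
    return p_norm(x, 1.0)

def logistic2(x):
    return p_norm(x, 2.0)

# Margin: all but one argument equal to 1 returns that argument (up to rounding).
print(gpc_expansion([1.0, 0.3, 1.0], logistic2))
# A point in the upper tail, where the expansion is a genuine copula value.
print(gpc_expansion([0.95, 0.9], logistic2))
# Far from 1 the expression can turn negative, so it is not a df on the whole cube.
print(gpc_expansion([0.2, 0.2], l1))
```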
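As a consequence, an arbitrary copula $C$ satisfies the following equivalences:
$$C \in \mathcal{D}(G) \iff \text{there exists a } D\text{-norm } \|\cdot\|_D \text{ with } C(u) = 1 - \|1-u\|_D + o(\|1-u\|) \iff \text{there exists a GPC } C_D \text{ with } C(u) = C_D(u) + o(\|1-u\|),$$
as $u \to 1$, uniformly for $u \in [0,1]^d$.
In this case we have $G(x) = \exp(-\|x\|_D)$, $x \le 0 \in \mathbb{R}^d$.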
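Take an arbitrary Archimedean copula on $\mathbb{R}^d$
$$C_\varphi(u) = \varphi^{-1}\big(\varphi(u_1) + \dots + \varphi(u_d)\big),$$
where $\varphi$ is a continuous and strictly decreasing function from $(0,1]$ to $[0,\infty)$ such that $\varphi(1) = 0$ (see, e.g., McNeil and Nešlehová (2009, Theorem 2.2)). Suppose that
$$p := -\lim_{s \downarrow 0} \frac{s\,\varphi'(1-s)}{\varphi(1-s)} \quad \text{exists in } [1,\infty].$$
It follows from Charpentier and Segers (2009, Theorem 4.1) that $C_\varphi$ is in its upper tail close to the GPC with corresponding logistic $D$-norm $\|\cdot\|_p$.
Suppose that the generator function $\varphi$ satisfies with some $s_0 \in (0,1)$
$$-\frac{s\,\varphi'(1-s)}{\varphi(1-s)} = p, \qquad s \in (0,s_0], \tag{8}$$
with $p \in [1,\infty)$. Then $C_\varphi$ is a GPC, precisely,
$$C_\varphi(u) = 1 - \|1 - u\|_p \qquad \text{for } u \text{ close to } 1 \in \mathbb{R}^d.$$
This is readily seen as follows. Condition (8) is equivalent with the equation
$$\frac{d}{ds} \log\big(\varphi(1-s)\big) = \frac{p}{s}, \qquad s \in (0,s_0].$$
Integrating both sides implies
$$\log\big(\varphi(1-s)\big) = p \log(s) + c, \qquad s \in (0,s_0],$$
with $c \in \mathbb{R}$. But this yields
$$\varphi(1-s) = \exp(c)\, s^p, \qquad s \in (0,s_0],$$
and, hence, $C_\varphi(u) = 1 - \big(\sum_{i=1}^d (1-u_i)^p\big)^{1/p}$ for $u$ close to $1 \in \mathbb{R}^d$.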
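The argument can be illustrated with a concrete generator. The sketch below assumes the bivariate Archimedean generator $\varphi(u) = (1-u)^p$ with $p = 3$, which is an example chosen for this illustration and not taken from the text; it checks numerically that $-s\varphi'(1-s)/\varphi(1-s)$ equals $p$ and that the copula coincides with $1 - \|1-u\|_p$ near $1$.

```python
def phi(u: float, p: float) -> float:
    """Archimedean generator phi(u) = (1 - u)^p, so that phi(1 - s) = s^p."""
    return (1.0 - u) ** p

def phi_inverse(t: float, p: float) -> float:
    # Pseudo-inverse: phi(0) = 1 is finite, so values t > 1 are mapped to 0.
    return 1.0 - t ** (1.0 / p) if t <= 1.0 else 0.0

def archimedean(u, p: float) -> float:
    return phi_inverse(sum(phi(ui, p) for ui in u), p)

def tail_ratio(s: float, p: float, h: float = 1e-6) -> float:
    """-s * phi'(1 - s) / phi(1 - s), with phi' from a central difference."""
    derivative = (phi(1.0 - s + h, p) - phi(1.0 - s - h, p)) / (2.0 * h)
    return -s * derivative / phi(1.0 - s, p)

p = 3.0
u = [0.9, 0.95]
logistic = sum((1.0 - ui) ** p for ui in u) ** (1.0 / p)
print(tail_ratio(0.1, p))                  # close to p
print(archimedean(u, p), 1.0 - logistic)   # identical near (1, 1)
```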
4. Characterization of a GPC
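Next we derive the characteristic property of a GPC. Suppose the rv $U = (U_1,\dots,U_d)$ follows a GPC $C$ with corresponding $D$-norm $\|\cdot\|_D$. Then its survival function equals
$$P(U \ge u) = \wr\!\wr 1 - u \wr\!\wr_D \qquad \text{for } u \text{ close to } 1 \in \mathbb{R}^d, \quad \text{where} \quad \wr\!\wr x \wr\!\wr_D := E\Big(\min_{1 \le i \le d} |x_i| Z_i\Big), \quad x \in \mathbb{R}^d,$$
is the dual $D$-norm function pertaining to $\|\cdot\|_D$ with generator $Z$, see the proof of Theorem 4.1. Using the identities between maxima and minima in that proof, it is straightforward to prove that $\wr\!\wr \cdot \wr\!\wr_D$ does not depend on the particular choice of the generator $Z$ of $\|\cdot\|_D$. We have, for example,
$$\wr\!\wr x \wr\!\wr_\infty = \min_{1 \le i \le d} |x_i|, \qquad \wr\!\wr x \wr\!\wr_1 = 0 \quad (d \ge 2).$$
Note that the mapping $\|\cdot\|_D \mapsto \wr\!\wr \cdot \wr\!\wr_D$ is not one-to-one, i.e., two different $D$-norms can have identical dual $D$-norm functions.
The function $\wr\!\wr \cdot \wr\!\wr_D$ is obviously homogeneous:
$$\wr\!\wr t x \wr\!\wr_D = t\, \wr\!\wr x \wr\!\wr_D, \qquad t \ge 0.$$
As a consequence, a GPC is excursion stable:
$$P(U \ge 1 - tu \mid U \ge 1 - u) = \frac{\wr\!\wr tu \wr\!\wr_D}{\wr\!\wr u \wr\!\wr_D} = t, \qquad t \in [0,1],$$
for $u$ close to $0 \in \mathbb{R}^d$, provided $\wr\!\wr u \wr\!\wr_D > 0$.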
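For a generator with finitely many values the dual $D$-norm function $E(\min_i |x_i| Z_i)$ can be computed exactly, which makes homogeneity and the excursion stability ratio easy to inspect. In this sketch the atoms of the "mixed" generator are a hypothetical example with $E(Z_i) = 1$, and the function name is an assumption.

```python
def dual_d_norm(x, atoms):
    """Exact E min_i |x_i| Z_i for a generator given by atoms (prob, z)."""
    return sum(p * min(abs(xi) * zi for xi, zi in zip(x, z)) for p, z in atoms)

sup_atoms = [(1.0, [1.0, 1.0])]                        # generator of the sup-norm
l1_atoms = [(0.5, [2.0, 0.0]), (0.5, [0.0, 2.0])]      # generator of the L1-norm
mixed_atoms = [(0.5, [1.5, 0.5]), (0.5, [0.5, 1.5])]   # hypothetical generator

u = [0.1, 0.2]
t = 0.25
print(dual_d_norm(u, sup_atoms))    # min(|u_1|, |u_2|)
print(dual_d_norm(u, l1_atoms))     # zero: no joint exceedances
# Excursion stability: the conditional exceedance probability equals t.
scaled = [t * ui for ui in u]
print(dual_d_norm(scaled, mixed_atoms) / dual_d_norm(u, mixed_atoms), t)
```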
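Note that each marginal distribution of a GPC is a lower dimensional GPC as well: If the rv $U = (U_1,\dots,U_d)$ follows the GPC $C$ on $\mathbb{R}^d$, then the rv $U_T := (U_i)_{i \in T}$ follows a GPC on $\mathbb{R}^{|T|}$, for each nonempty subset $T \subset \{1,\dots,d\}$. We have
$$P(U_T \le u_T) = 1 - \Big\| \sum_{i \in T} (1 - u_i) e_i \Big\|_D$$
for $u_T = (u_i)_{i \in T}$ close to $1 \in \mathbb{R}^{|T|}$, where $e_i$ denotes the $i$-th unit vector in $\mathbb{R}^d$, $1 \le i \le d$.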
The characteristic property of a GPC is its excursion stability, as formulated in the next result.
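Theorem 4.1.
Let the rv $U = (U_1,\dots,U_d)$ follow a copula $C$. Then $C$ is a GPC iff for each nonempty subset $T$ of $\{1,\dots,d\}$ the rv $U_T := (U_i)_{i \in T}$ is exceedance stable, i.e.,
$$P(U_T \ge 1 - t u_T) = t\, P(U_T \ge 1 - u_T), \qquad t \in [0,1], \tag{9}$$
for $u_T$ close to $0 \in \mathbb{R}^{|T|}$.
The implication "$\Leftarrow$" in the preceding result is just a reformulation of Falk and Guillou (2008, Proposition 6). The conclusion "$\Rightarrow$" can be seen as follows. We can assume without loss of generality that $T = \{1,\dots,d\}$.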
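Using induction, it is easy to see that arbitrary numbers $a_1,\dots,a_d \in \mathbb{R}$ satisfy the equations
$$\max(a_1,\dots,a_d) = \sum_{\emptyset \ne T \subset \{1,\dots,d\}} (-1)^{|T|-1} \min_{i \in T} a_i, \qquad \min(a_1,\dots,a_d) = \sum_{\emptyset \ne T \subset \{1,\dots,d\}} (-1)^{|T|-1} \max_{i \in T} a_i.$$
By choosing $a_1 = \dots = a_d = 1$, the preceding equations imply in particular
$$\sum_{\emptyset \ne T \subset \{1,\dots,d\}} (-1)^{|T|-1} = 1.$$
The inclusion-exclusion principle implies for $u$ close to $1 \in \mathbb{R}^d$, the margins of $C$ being continuous,
$$P(U \ge u) = 1 - P\Big(\bigcup_{i=1}^d \{U_i \le u_i\}\Big) = 1 - \sum_{\emptyset \ne T \subset \{1,\dots,d\}} (-1)^{|T|-1} P(U_i \le u_i,\ i \in T) = \sum_{\emptyset \ne T \subset \{1,\dots,d\}} (-1)^{|T|-1} \Big\| \sum_{i \in T} (1 - u_i) e_i \Big\|_D.$$
Choose a generator $Z = (Z_1,\dots,Z_d)$ of $\|\cdot\|_D$. From the preceding identities between maxima and minima we obtain
$$P(U \ge u) = E\Big( \sum_{\emptyset \ne T \subset \{1,\dots,d\}} (-1)^{|T|-1} \max_{i \in T} \big((1 - u_i) Z_i\big) \Big) = E\Big( \min_{1 \le i \le d} \big((1 - u_i) Z_i\big) \Big) = \wr\!\wr 1 - u \wr\!\wr_D.$$
Replacing $u$ by $1 - tu$ yields the assertion, since $\wr\!\wr tu \wr\!\wr_D = t\, \wr\!\wr u \wr\!\wr_D$. ∎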
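The two identities between maxima and minima that drive the proof, $\max_i a_i = \sum_{T} (-1)^{|T|-1} \min_{i \in T} a_i$ and its counterpart with $\min$ and $\max$ interchanged, can be verified by brute force over all nonempty subsets $T$. This is a minimal sketch; the helper name `alternating_sum` and the test vector are assumptions.

```python
from itertools import combinations

def alternating_sum(values, inner):
    """Sum over nonempty index sets T of (-1)^(|T|-1) * inner(values[i], i in T)."""
    d = len(values)
    total = 0
    for size in range(1, d + 1):
        sign = (-1) ** (size - 1)
        for subset in combinations(range(d), size):
            total += sign * inner(values[i] for i in subset)
    return total

a = [3, -1, 4, 1, 5]
print(alternating_sum(a, min), max(a))   # the maximum from alternating minima
print(alternating_sum(a, max), min(a))   # the minimum from alternating maxima
print(alternating_sum([1] * 5, min))     # the signs alone sum to 1
```

Integer inputs keep the arithmetic exact, so the printed pairs coincide without any tolerance.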
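If $P(U_T \ge 1 - u_T) > 0$, then (9) clearly becomes
$$P(U_T \ge 1 - t u_T \mid U_T \ge 1 - u_T) = t, \qquad t \in [0,1].$$
But $P(U_T \ge 1 - u_T)$ can be equal to zero for all $u_T$ close to $0 \in \mathbb{R}^{|T|}$. This is for example the case, when the underlying $D$-norm is $\|\cdot\|_1$. Then $\wr\!\wr \cdot \wr\!\wr_1 = 0$, and, thus, $P(U_T \ge 1 - u_T) = 0$ for all $u_T$ close to $0 \in \mathbb{R}^{|T|}$, unless $|T| = 1$.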
While the characteristic property of a GPC is its excursion stability, the characteristic property of an extreme value copula $C_G(u) = G\big((G_j^{-1}(u_j))_{j=1}^d\big)$, $u \in (0,1)^d$, which corresponds to a max-stable df $G$, is its max-stability, defined below. By transforming the univariate margins to the standard negative exponential distribution, we can assume without loss of generality that $G$ is an SMS df. In this case we have $G_j^{-1}(u) = \log(u)$, $u \in (0,1]$, and, thus, we obtain the representation of the copula of an arbitrary max-stable df
$$C_G(u) = \exp\big(-\|(\log(u_1),\dots,\log(u_d))\|_D\big), \qquad u \in (0,1]^d, \tag{12}$$
with some $D$-norm $\|\cdot\|_D$. For a discussion of parametric families of extreme value copulas and their statistical analysis we refer to Genest and Nešlehová (2012).
Equation (12) obviously implies the max-stability of an extreme value copula $C_G$:
$$C_G^n\big(u^{1/n}\big) = C_G(u), \qquad u \in (0,1]^d, \quad n \in \mathbb{N}. \tag{13}$$
If, on the other hand, an arbitrary copula $C$ satisfies equation (13), then it is clearly the copula of an SMS df $G(x) := C(\exp(x))$, $x \le 0 \in \mathbb{R}^d$. As a consequence, we have two stabilities of copulas: max-stability and exceedance stability.
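Max-stability can be checked numerically for a concrete extreme value copula. The sketch below assumes the representation $C(u) = \exp(-\|(\log u_1,\dots,\log u_d)\|_D)$ with the logistic norm as example $D$-norm; the function names, the point `u` and the values of `p` and `n` are illustrative choices.

```python
import math

def logistic_norm(x, p: float) -> float:
    return sum(abs(xi) ** p for xi in x) ** (1.0 / p)

def ev_copula(u, p: float) -> float:
    """Extreme value copula exp(-||log u||_p) built from the logistic norm."""
    return math.exp(-logistic_norm([math.log(ui) for ui in u], p))

u = [0.3, 0.7, 0.9]
p, n = 2.5, 5
lhs = ev_copula([ui ** (1.0 / n) for ui in u], p) ** n   # C^n(u^(1/n))
rhs = ev_copula(u, p)                                    # C(u)
print(lhs, rhs)
print(ev_copula([0.3, 1.0, 1.0], p))                     # uniform margin
```

The identity holds for every $D$-norm because a norm is homogeneous; the logistic case is used only because it has a closed form.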
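Let $C$ be an arbitrary copula on $\mathbb{R}^d$, and let $U^{(1)}, U^{(2)}, \dots$ be independent copies of a rv $U$ that follows $C$. The considerations in this section show that the copula of the componentwise maximum $\max_{1 \le i \le n} U^{(i)}$ converges pointwise to a max-stable copula if, and only if, $C$ is in its upper tail close to that of an excursion stable copula, i.e., to that of a GPC.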
The message of the considerations in this section is: If one wants to model the copula of multivariate exceedances above high thresholds, then a GPC is a first option.
5. Multivariate Generalized Pareto Distributions
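Let $\{G_\alpha : \alpha \in \mathbb{R}\}$ be the set of univariate max-stable df as defined by the equations above and in (2). The family of univariate generalized Pareto distributions (GPD) is the family of univariate excursion stable distributions:
$$H_\alpha(x) := 1 + \log(G_\alpha(x)) = \begin{cases} 1 - (-x)^\alpha, \quad -1 \le x \le 0, & \text{if } \alpha > 0, \\ 1 - x^{\alpha}, \quad x \ge 1, & \text{if } \alpha < 0, \\ 1 - \exp(-x), \quad x \ge 0, & \text{if } \alpha = 0. \end{cases}$$
Suppose the rv $V$ follows the df $H_\alpha$. Then
$$P(V > tx \mid V > x) = t^\alpha \quad \begin{cases} t \in [0,1],\ -1 \le x < 0, & \text{if } \alpha > 0, \\ t \ge 1,\ x \ge 1, & \text{if } \alpha < 0, \end{cases} \qquad P(V > x + t \mid V > x) = \exp(-t), \quad t \ge 0,\ x \ge 0, \quad \text{if } \alpha = 0.$$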
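For a threshold $s \in \mathbb{R}$ and an $x > s$, the univariate GPD takes the form of the following scale and shape family of distributions
$$H(x) = 1 - \Big(1 + \xi\, \frac{x - s}{\sigma}\Big)^{-1/\xi}, \qquad 1 + \xi\, \frac{x - s}{\sigma} > 0,$$
where $\xi \in \mathbb{R}$ and $\sigma > 0$ (e.g. Falk et al., 2011, page 35); the case $\xi = 0$ is read as the limit $1 - \exp(-(x - s)/\sigma)$.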
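The scale and shape form lends itself to a direct check of excursion stability in one dimension. The sketch below assumes the survival function $(1 + \xi (x - s)/\sigma)^{-1/\xi}$ with shape $\xi$, scale $\sigma$ and threshold $s$; these parameter names, and the function name, are choices made for this illustration. It verifies that exceedances above a higher threshold $s'$ again follow a GPD with the same shape and scale $\sigma + \xi (s' - s)$.

```python
import math

def gpd_survival(x: float, threshold: float, scale: float, shape: float) -> float:
    """P(X > x) for a GPD above `threshold`; shape == 0 is the exponential case."""
    y = (x - threshold) / scale
    if y <= 0.0:
        return 1.0
    if shape == 0.0:
        return math.exp(-y)
    base = 1.0 + shape * y
    return base ** (-1.0 / shape) if base > 0.0 else 0.0

threshold, scale, shape = 0.0, 1.5, 0.2
higher = 2.0                                      # a higher threshold s'
x = 3.0
conditional = (gpd_survival(x, threshold, scale, shape)
               / gpd_survival(higher, threshold, scale, shape))
rescaled = gpd_survival(x, higher, scale + shape * (higher - threshold), shape)
print(conditional, rescaled)
```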
The definition of a multivariate GPD is, however, not unique in the literature. There are different approaches (Rootzén and Tajvidi (2006), Falk et al. (2011)), each one trying to catch the excursion stability of a multivariate rv. The following suggestion might conclude this debate. Clearly, the excursion stability of a rv should be satisfied by its margins and its copula. This is reflected in the following definition.
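Definition 5.1.
A rv $V = (V_1,\dots,V_d)$ follows a multivariate GPD, if each component $V_i$ follows a univariate GPD (at least in its upper tail), and if the copula $C$ corresponding to $V$ is a GPC, i.e., there exists a $D$-norm $\|\cdot\|_D$ on $\mathbb{R}^d$ and $u_0 \in (0,1)^d$ such that
$$C(u) = 1 - \|1 - u\|_D, \qquad u \in [u_0, 1].$$
As a consequence, each such rv $V$, which follows a multivariate GPD, is exceedance stable and vice versa.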
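The following construction extends the bivariate approach proposed by Buishand et al. (2008) to arbitrary dimension. It provides a rv, which follows an arbitrary multivariate GPD as in Definition 5.1. Let $Z = (Z_1,\dots,Z_d)$ be the generator of a $D$-norm $\|\cdot\|_D$, with the additional property that each $Z_i \le c$, for some $c \ge 1$. Note that such a generator exists for an arbitrary $D$-norm according to the normed generators theorem for $D$-norms (Falk (2019)). Let the rv $U$ be uniformly distributed on $(0,1)$ and independent of $Z$. Put
$$V = (V_1,\dots,V_d) := \frac{1}{U}(Z_1,\dots,Z_d) =: \frac{1}{U} Z. \tag{15}$$
Then, for each $1 \le i \le d$,
$$P\Big(\frac{Z_i}{U} \le x\Big) = E\Big(1 - \frac{Z_i}{x}\Big) = 1 - \frac{1}{x}, \qquad x \ge c,$$
i.e., $V_i$ follows in its upper tail a univariate standard Pareto distribution, and, by elementary computation, we have
$$P(V \le x) = P\Big(U \ge \max_{1 \le i \le d} \frac{Z_i}{x_i}\Big) = 1 - \Big\| \frac{1}{x} \Big\|_D, \qquad x \ge (c,\dots,c).$$
The preceding equation implies that the copula of $V$ is a GPC with corresponding $D$-norm $\|\cdot\|_D$. The rv $V$ can be seen as a prototype of a rv, which follows a multivariate GPD. This GPD is commonly called simple.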
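The construction of a simple multivariate GPD can be simulated directly whenever a bounded generator is available. The following sketch takes the generator of $\|\cdot\|_1$, a random permutation of $(d,0,\dots,0)$, which is bounded by $c = d$, forms $V = Z/U$ and compares the empirical df at one point with $1 - \|1/x\|_1$; the function names, the evaluation point and the sample size are assumptions.

```python
import random

def draw_simple_gpd_l1(d: int, rng: random.Random):
    """One draw of V = Z / U with Z a random permutation of (d, 0, ..., 0)."""
    u = 1.0 - rng.random()        # lies in (0, 1], which avoids a division by zero
    j = rng.randrange(d)          # position of the single nonzero entry of Z
    return [d / u if i == j else 0.0 for i in range(d)]

def empirical_cdf(x, n: int, rng: random.Random) -> float:
    """Fraction of n simulated vectors V with V <= x componentwise."""
    d = len(x)
    hits = 0
    for _ in range(n):
        v = draw_simple_gpd_l1(d, rng)
        hits += all(vi <= xi for vi, xi in zip(v, x))
    return hits / n

x = [5.0, 8.0, 10.0]              # every entry is at least the bound c = d = 3
estimate = empirical_cdf(x, 100_000, random.Random(1))
target = 1.0 - sum(1.0 / xi for xi in x)
print(estimate, target)
```

Other bounded generators can be plugged in by replacing `draw_simple_gpd_l1`; the comparison value then becomes $1 - \|1/x\|_D$ for the corresponding $D$-norm.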
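Choose $V = (V_1,\dots,V_d)$ as in equation (15) and numbers $\alpha_1,\dots,\alpha_d \in \mathbb{R}$. Then, with $H_\alpha = 1 + \log(G_\alpha)$ denoting the univariate GPD,
$$Y := \Big(H_{\alpha_1}^{-1}\Big(1 - \frac{1}{V_1}\Big),\dots,H_{\alpha_d}^{-1}\Big(1 - \frac{1}{V_d}\Big)\Big)$$
follows a general multivariate GPD with margins $H_{\alpha_1},\dots,H_{\alpha_d}$ in its univariate upper tails.
With the particular choice $\alpha_1 = \dots = \alpha_d = 1$ we obtain a standard multivariate GPD
$$Y = -\Big(\frac{1}{V_1},\dots,\frac{1}{V_d}\Big) = -U\Big(\frac{1}{Z_1},\dots,\frac{1}{Z_d}\Big).$$
Its df is
$$P(Y \le x) = 1 - \|x\|_D$$
for $x \le 0 \in \mathbb{R}^d$, close enough to zero.
With the particular choice $\alpha_1 = \dots = \alpha_d = 0$ we obtain a multivariate GPD with Gumbel margins in the upper tails
$$Y = (\log(V_1),\dots,\log(V_d)) = (\log(Z_1) - \log(U),\dots,\log(Z_d) - \log(U)),$$
where $-\log(U)$ follows the standard exponential distribution on $(0,\infty)$.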
6. δ-Neighborhoods of GPC
A major problem with the construction in (15) is the additional boundedness condition on the generator $Z$. This is, for example, not given in case of the logistic $D$-norm $\|\cdot\|_p$ with $p \in (1,\infty)$ or the Hüsler-Reiss $D$-norm. From the normed generators theorem in Falk (2019) we know that bounded generators exist, but, to the best of our knowledge, they are unknown in both cases.
In this section we drop this boundedness condition and show that the construction (15) provides a copula, which is in a particular neighborhood of a GPC, called $\delta$-neighborhood. We are going to define this neighborhood next.
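Denote by $R := \{t \in [0,1]^d : \|t\|_1 = 1\}$ the unit sphere in $[0,\infty)^d$ with respect to the norm $\|x\|_1 = \sum_{i=1}^d |x_i|$, $x \in \mathbb{R}^d$. Choose an arbitrary copula $C$ on $\mathbb{R}^d$ and put for $t \in R$
$$C_t(s) := C(1 + st), \qquad s \le 0.$$
Then $C_t$ is a univariate df on $(-\infty,0]$, and the copula $C$ is obviously determined by the family
$$\mathcal{P}(C) := \{C_t : t \in R\}$$
of univariate spectral df $C_t$. The family $\mathcal{P}(C)$ is the spectral decomposition of $C$; cf. Falk et al. (2011, Section 5.4). A copula $C$ is, consequently, in $\mathcal{D}(G)$ with corresponding $D$-norm $\|\cdot\|_D$ iff its spectral decomposition satisfies
$$C_t(s) = 1 + s \|t\|_D + o(s), \qquad t \in R,$$
as $s \uparrow 0$. The copula $C$ is by definition in the $\delta$-neighborhood of the GPC $C_D$ with $D$-norm $\|\cdot\|_D$ if their upper tails are close to one another, precisely, if with some $\delta > 0$
$$1 - C_t(s) = \big(1 - C_{D,t}(s)\big)\big(1 + O(|s|^\delta)\big) = |s|\, \|t\|_D \big(1 + O(|s|^\delta)\big)$$
as $s \uparrow 0$, uniformly for $t \in R$. In this case we know from Falk et al. (2011, Theorem 5.5.5) that
$$\sup_{x \in (-\infty,0]^d} \Big| C^n\Big(1 + \frac{x}{n}\Big) - G(x) \Big| = O(n^{-\delta}).$$
Under additional differentiability conditions on $C_t(s)$ with respect to $s$, also the reverse implication holds; cf. Falk et al. (2011, Theorem 5.5.5). Therefore, the $\delta$-neighborhood of a GPC, roughly, collects those copulas with a polynomial rate of convergence for maxima.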
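The condition defining the $\delta$-neighborhood can also be formulated in the following way:
$$1 - C(u) = \|1 - u\|_D \big(1 + O(\|1 - u\|^\delta)\big)$$
as $u \to 1 \in \mathbb{R}^d$, uniformly for $u \in [0,1]^d$, where $\|\cdot\|$ is an arbitrary norm on $\mathbb{R}^d$.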
Choose and put for
With , this is the fragility index, introduced by Geluk et al. (2007) to measure the stability of the stochastic system . The system is called stable if is close to one, otherwise it is called fragile. The asymptotic distribution of , given , was investigated in Falk and Tichy (2011); Falk and Tichy (2012).
If follows a GPC with corresponding -norm , we obtain for close enough to zero
implies that there is a least favorable direction with
A vector with , , maximizes the fragility index. For arbitrary and , , one obtains for example with constant entry and
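The fragility index can be made concrete under one explicit reading, which is an assumption of this sketch: the index is taken as $E(N \mid N > 0)$, where $N$ counts the components $U_i$ exceeding $1 - s c_i$ for weights $c_i \ge 0$ and small $s > 0$. If $U$ follows a GPC, then $P(U_i > 1 - s c_i) = s c_i$ and $P(N > 0) = 1 - C(1 - sc) = s \|c\|_D$, so the index becomes $\|c\|_1 / \|c\|_D$. The function names below are hypothetical.

```python
def p_norm(x, p: float) -> float:
    """Logistic norm; p = 1 gives the L1-norm, p = inf the sup-norm."""
    if p == float("inf"):
        return max(abs(xi) for xi in x)
    return sum(abs(xi) ** p for xi in x) ** (1.0 / p)

def fragility_index(c, d_norm) -> float:
    """||c||_1 / ||c||_D, the assumed GPC form of E(N | N > 0)."""
    return sum(abs(ci) for ci in c) / d_norm(c)

ones = [1.0] * 4
print(fragility_index(ones, lambda x: p_norm(x, 1.0)))            # 1: stable system
print(fragility_index(ones, lambda x: p_norm(x, 2.0)))            # between 1 and d
print(fragility_index(ones, lambda x: p_norm(x, float("inf"))))   # d: all exceed jointly
```

With equal weights the index is $d / \|1\|_D$, which ranges from $1$ for $\|\cdot\|_1$ to $d$ for the sup-norm.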
If $U$ follows a copula, which is in a $\delta$-neighborhood of a GPC with $D$-norm $\|\cdot\|_D$, then we obtain the representation
If we replace for example by , where , , is the standard Pareto df, then we obtain for the fragility index