 # Generalized Pareto Copulas: A Key to Multivariate Extremes

This paper reviews generalized Pareto copulas (GPC), which turn out to be a key to multivariate extreme value theory. Any GPC can be represented in a simple analytic way using a particular type of norm on $\mathbb{R}^d$, called a $D$-norm. The characteristic property of a GPC is its exceedance stability. GPC might help to end the debate: What is a multivariate generalized Pareto distribution? We present a simple way to simulate data from an arbitrary GPC and, thus, from an arbitrary generalized Pareto distribution. As an application we derive nonparametric estimates of the probability that a random vector, which follows a GPC, exceeds a high threshold, together with confidence intervals. A case study on joint exceedance probabilities for air pollutants completes the paper.


## 1. Introduction and Preliminaries

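Let $X=(X_1,\dots,X_d)$ be a random vector (rv) whose distribution function (df) $F$ is in the domain of attraction of a multivariate non-degenerate df $G$, denoted by $F\in\mathcal{D}(G)$, i.e., there exist vectors $a_n>0\in\mathbb{R}^d$, $b_n\in\mathbb{R}^d$, $n\in\mathbb{N}$, such that

$$F^n(a_nx+b_n)\to_{n\to\infty}G(x),\qquad x\in\mathbb{R}^d. \tag{1}$$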

All operations on vectors $x,y$ such as $x+y$, $xy$, etc. are meant componentwise.

The limit df $G$ is necessarily max-stable, i.e., there exist vectors $a_n>0\in\mathbb{R}^d$, $b_n\in\mathbb{R}^d$, $n\in\mathbb{N}$, such that

$$G^n(a_nx+b_n)=G(x),\qquad x\in\mathbb{R}^d.$$

A characterization of multivariate max-stable df was established by de Haan and Resnick (1977) and Vatan (1985); for an introduction to multivariate extreme value theory see, e.g., Falk et al. (2011, Chapter 4).

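The univariate margins $G_i$, $1\le i\le d$, of a multivariate max-stable df $G$ belong necessarily to the family of univariate max-stable df, which is a parametric family $\{G_\alpha:\alpha\in\mathbb{R}\}$ with

$$G_\alpha(x)=\begin{cases}\exp(-(-x)^\alpha),& x\le 0,\ \text{for }\alpha>0,\\ \exp(-x^{\alpha}),& x>0,\ \text{for }\alpha<0,\end{cases}$$

and

$$G_0(x):=\exp(-e^{-x}),\qquad x\in\mathbb{R}, \tag{2}$$

being the family of reverse Weibull, Fréchet and Gumbel distributions. Note that $G_1(x)=\exp(x)$, $x\le 0$, is the standard negative exponential df. We refer, e.g., to Galambos (1987, Section 2.3) or Resnick (1987, Chapter 1).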

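By Sklar's theorem (Sklar (1959, 1996)), there exists a rv $U=(U_1,\dots,U_d)$ with the property that each component $U_i$ follows the uniform distribution on $(0,1)$, such that

$$X=_D\left(F_1^{-1}(U_1),\dots,F_d^{-1}(U_d)\right),$$

where $F_i$ is the df of $X_i$ and $F_i^{-1}(u)=\inf\{t\in\mathbb{R}:F_i(t)\ge u\}$, $u\in(0,1)$, is the common generalized inverse or quantile function of $F_i$, $1\le i\le d$. By $=_D$ we denote equality in distribution.

The rv $U$, therefore, follows a copula, say $C_F$. If $F$ is continuous, then the copula $C_F$ is uniquely determined and given by $C_F(u)=F\left(F_1^{-1}(u_1),\dots,F_d^{-1}(u_d)\right)$, $u=(u_1,\dots,u_d)\in(0,1)^d$.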

Deheuvels (1984) and Galambos (1987) showed that $F\in\mathcal{D}(G)$ iff this is true for each univariate margin and for the copula $C_F$. Precisely, they established the following result.

###### Theorem 1.1 (Deheuvels (1984), Galambos (1987)).

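The df $F$ satisfies $F\in\mathcal{D}(G)$ iff this is true for the univariate margins of $F$ together with the convergence of the copulas:

$$C_F^n\left(u^{1/n}\right)\to_{n\to\infty}C_G(u)=G\left(\left(G_i^{-1}(u_i)\right)_{i=1}^d\right), \tag{3}$$

$u=(u_1,\dots,u_d)\in(0,1)^d$, where $G_i$ denotes the $i$-th margin of $G$, $1\le i\le d$.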

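Let $U^{(1)},U^{(2)},\dots$ be independent copies of the rv $U$, which follows the copula $C_F$. Then the copula of

$$M_n:=\max_{1\le i\le n}U^{(i)}$$

is denoted by $C_{M_n}$, where the maximum is also taken componentwise. The df of $M_n$ is $C_F^n$ and, thus, we have

$$C_F^n\left(u^{1/n}\right)=C_{M_n}(u)=C_{C_F^n}(u),\qquad u\in[0,1]^d.$$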

Therefore, condition (3) actually means pointwise convergence of the copulas

$$C_{M_n}(u)\to_{n\to\infty}C_G(u),$$

where $C_G(u)=G\left(\left(G_i^{-1}(u_i)\right)_{i=1}^d\right)$, $u\in(0,1)^d$, is the copula of $G$. This is an extreme value copula. Note that each margin $G_i$ of $G$ is continuous, which is equivalent to the continuity of $G$ (see, e.g., Reiss (1989, Lemma 2.2.6)).

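Elementary arguments imply that condition (3) is equivalent to the condition

$$C_F^n\left(\mathbf{1}+\frac{y}{n}\right)\to_{n\to\infty}G^*(y):=C_G(\exp(y)),\qquad y\le 0\in\mathbb{R}^d, \tag{4}$$

where $\mathbf{1}=(1,\dots,1)\in\mathbb{R}^d$ and $G^*(y)$, $y\le 0\in\mathbb{R}^d$, defines a max-stable df with standard negative exponential margins $G_i^*(y)=\exp(y)$, $y\le 0$, $1\le i\le d$. Such a max-stable df will be called a standard one, abbreviated by SMS (standard max-stable).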

While the condition on the univariate margins in Theorem 1.1 addresses univariate extreme value theory, condition (3) on the copula $C_F$ means by the equivalent condition (4) that the copula $C_F$ is in the domain of attraction of a multivariate SMS df:

$$C_F^n\left(\mathbf{1}+\frac{y}{n}\right)=P\left(n(M_n-\mathbf{1})\le y\right)\to_{n\to\infty}G^*(y),\qquad y\le 0\in\mathbb{R}^d.$$
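As a small numerical illustration of condition (4), the following sketch evaluates $C^n(\mathbf{1}+y/n)$ for the independence copula $C(u)=\prod_{i}u_i$, for which the limit is $\exp(\sum_i y_i)=\exp(-\|y\|_1)$. The choice of the independence copula and all function names are assumptions made for this illustration; they are not taken from the paper.

```python
import math

def independence_copula(u):
    """C(u) = u_1 * ... * u_d, the copula of independent uniform components."""
    return math.prod(u)

def power_copula_at_tail(copula, y, n):
    """Evaluate C^n(1 + y/n) for a vector y <= 0 (componentwise)."""
    point = [1.0 + yi / n for yi in y]
    return copula(point) ** n

def sms_limit_l1(y):
    """Limit df G*(y) = exp(-||y||_1), which equals exp(sum(y)) for y <= 0."""
    return math.exp(-sum(abs(yi) for yi in y))

# Illustrative point y <= 0; the values of n are arbitrary.
y = [-0.5, -1.0, -2.0]
for n in (10, 1_000, 100_000):
    print(n, power_copula_at_tail(independence_copula, y, n), sms_limit_l1(y))
```

The printed values should approach the limit as $n$ grows, in line with the elementary convergence $(1+x/n)^n\to\exp(x)$ in each coordinate.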

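Let $C$ be an arbitrary copula on $\mathbb{R}^d$. Then condition (1) becomes

$$C\in\mathcal{D}(G)\iff C^n(a_nx+b_n)\to_{n\to\infty}G(x),\qquad x\in\mathbb{R}^d,$$

where the norming constants $a_n>0$, $b_n$ are determined by the univariate margins $C_i$ of $C$, i.e., the uniform distribution on $(0,1)$: With $a_n=1/n$ and $b_n=1$ in each component, we obtain for large $n$

$$C_i\left(1+\frac{x}{n}\right)^n=\left(1+\frac{x}{n}\right)^n\to_{n\to\infty}\exp(x),\qquad x\le 0.$$

We therefore obtain the conclusion: If a copula $C$ satisfies $C\in\mathcal{D}(G)$, then the limiting df $G$ necessarily has standard negative exponential margins:

$$G_i(x)=\exp(x),\qquad x\le 0,\ 1\le i\le d,$$

i.e., the limiting df $G$ is necessarily an SMS df.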

As a consequence we obtain that multivariate extreme value theory actually means extreme value theory for copulas.

This paper is organized as follows. In the next section we introduce $D$-norms, which turn out to be a common thread in multivariate extreme value theory. Using the concept of $D$-norms, we introduce in Section 3 generalized Pareto copulas (GPC). The characteristic property of a GPC is its excursion or exceedance stability, established in Theorem 4.1. The family of GPC together with the well-known set of univariate generalized Pareto distributions (GPD) enables the definition of multivariate GPD in Section 5. As the set of univariate GPD equals the set of univariate non-degenerate exceedance stable distributions, its extension to higher dimensions via a GPC and GPD margins is an obvious idea. $\delta$-neighborhoods of a GPC are introduced in Section 6. The normal copula is a prominent example. Among other things, we show how to simulate data that follow a copula from such a $\delta$-neighborhood. In Section 7 we show how our findings on GPC can be used to estimate exceedance probabilities above high thresholds, including confidence intervals. A case study in Section 8 on joint exceedance probabilities for air pollutants such as ozone, nitrogen dioxide, nitrogen oxide, sulphur dioxide and particulate matter completes the paper.

## 2. Introducing D-Norms

A crucial characterization of SMS df due to Balkema and Resnick (1977), de Haan and Resnick (1977), Pickands (1981) and Vatan (1985) can be formulated as follows; see Falk et al. (2011, Section 4.4).

###### Theorem 2.1 (Balkema and Resnick (1977), de Haan and Resnick (1977), Pickands (1981), Vatan (1985)).

A df $G$ on $\mathbb{R}^d$ is an SMS df iff there exists a norm $\|\cdot\|$ on $\mathbb{R}^d$ such that

$$G(x)=\exp(-\|x\|),\qquad x\le 0\in\mathbb{R}^d. \tag{5}$$

Elementary arguments imply the following consequence.

###### Corollary 2.2.

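A copula $C$ satisfies $C\in\mathcal{D}(G)$ iff there exists a norm $\|\cdot\|$ on $\mathbb{R}^d$ such that

$$C(u)=1-\|\mathbf{1}-u\|+o(\|\mathbf{1}-u\|) \tag{6}$$

as $u\to\mathbf{1}\in\mathbb{R}^d$, uniformly for $u\in[0,1]^d$.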

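The norms that can appear in the preceding result can be characterized. Any norm $\|\cdot\|$ in equation (5) or (6) is necessarily of the following kind: There exists a rv $Z=(Z_1,\dots,Z_d)$, whose components satisfy

$$Z_i\ge 0,\quad E(Z_i)=1,\qquad 1\le i\le d,$$

with

$$\|x\|=E\left(\max_{1\le i\le d}\left(|x_i|Z_i\right)\right)=:\|x\|_D,$$

$x=(x_1,\dots,x_d)\in\mathbb{R}^d$.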

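Such a norm $\|\cdot\|_D$ is called a $D$-norm, with generator $Z$. The additional index $D$ means dependence. $D$-norms were first mentioned in Falk et al. (2004, equation (4.25)) and elaborated in more detail in Falk et al. (2011, Section 4.4). Examples are:

• $\|x\|_\infty=\max_{1\le i\le d}|x_i|$, with generator $Z=(1,\dots,1)$,

• $\|x\|_1=\sum_{i=1}^d|x_i|$, with generator $Z$ being a random permutation of the vector $(d,0,\dots,0)$,

• each logistic norm $\|x\|_p=\left(\sum_{i=1}^d|x_i|^p\right)^{1/p}$, $1<p<\infty$, with generator $Z=(Y_1,\dots,Y_d)/\Gamma(1-1/p)$, where $Y_1,\dots,Y_d$ are iid Fréchet-distributed rv with parameter $p$ and $\Gamma$ denotes the usual gamma function,

• the Hüsler-Reiss $D$-norm: Let the rv $X=(X_1,\dots,X_d)$ follow a multivariate normal distribution with mean vector zero, i.e., $E(X_i)=0$, $1\le i\le d$, and covariance matrix $\Sigma=(\sigma_{ij})_{1\le i,j\le d}$. Then $\exp(X_i)$ follows a log-normal distribution with mean $\exp(\sigma_{ii}/2)$, $1\le i\le d$, and, thus,

$$Z=(Z_1,\dots,Z_d):=\left(\exp\left(X_1-\frac{\sigma_{11}}{2}\right),\dots,\exp\left(X_d-\frac{\sigma_{dd}}{2}\right)\right)$$

is the generator of a $D$-norm, called the Hüsler-Reiss $D$-norm. This norm only depends on the covariance matrix $\Sigma$ and, therefore, it is denoted by $\|\cdot\|_{HR_\Sigma}$.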

The generator of a $D$-norm is in general not uniquely determined; even its distribution is not. Take, for example, any rv $X\ge 0$ with $E(X)=1$. Then $Z=(X,\dots,X)$ generates the sup-norm $\|\cdot\|_\infty$. An account of the theory of $D$-norms is provided by Falk (2019).
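The defining expectation $\|x\|_D=E(\max_{1\le i\le d}|x_i|Z_i)$ can be approximated by Monte Carlo whenever the generator can be simulated. The following sketch does this for the logistic norm with its Fréchet-based generator and compares the estimate with the closed form $(\sum_i|x_i|^p)^{1/p}$. The function names, the sample size and the choice $p=4$ (which keeps the variance of the generator finite) are assumptions made for this illustration.

```python
import math
import random

def frechet_draw(p, rng):
    """Inverse-transform draw from the Frechet df exp(-y^(-p)), y > 0."""
    v = rng.random()
    while v == 0.0:  # rng.random() lies in [0, 1); avoid log(0)
        v = rng.random()
    return (-math.log(v)) ** (-1.0 / p)

def logistic_generator(d, p, rng):
    """One draw of Z = (Y_1, ..., Y_d) / Gamma(1 - 1/p), Y_i iid Frechet(p)."""
    scale = math.gamma(1.0 - 1.0 / p)
    return [frechet_draw(p, rng) / scale for _ in range(d)]

def d_norm_monte_carlo(x, draw_generator, n_draws, rng):
    """Estimate ||x||_D = E(max_i |x_i| Z_i) by averaging over generator draws."""
    total = 0.0
    for _ in range(n_draws):
        z = draw_generator(rng)
        total += max(abs(xi) * zi for xi, zi in zip(x, z))
    return total / n_draws

def logistic_norm(x, p):
    """Closed form ||x||_p = (sum_i |x_i|^p)^(1/p)."""
    return sum(abs(xi) ** p for xi in x) ** (1.0 / p)

rng = random.Random(0)  # fixed seed, for reproducibility only
x, p = [1.0, 2.0, 0.5], 4.0
estimate = d_norm_monte_carlo(x, lambda r: logistic_generator(len(x), p, r), 50_000, rng)
print(estimate, logistic_norm(x, p))
```

The same estimator applies to any generator that can be sampled; only `draw_generator` needs to be exchanged.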

## 3. Generalized Pareto Copulas

Corollary 2.2 stimulates the following idea. Choose an arbitrary $D$-norm $\|\cdot\|_D$ on $\mathbb{R}^d$ and put with $\mathbf{1}=(1,\dots,1)\in\mathbb{R}^d$

$$C(u):=\max\left(1-\|\mathbf{1}-u\|_D,0\right),\qquad u\in[0,1]^d.$$

Each univariate margin $C_i$ of $C$, defined this way, satisfies for $u\in[0,1]$

$$C_i(u)=C(1,\dots,1,\underbrace{u}_{i\text{-th component}},1,\dots,1)=1-\|(0,\dots,0,1-u,0,\dots,0)\|_D=1-(1-u)\underbrace{E(Z_i)}_{=1}=u,$$

i.e., each $C_i$ is the uniform df on $(0,1)$. But $C$ does in general not define a df, see, e.g., Falk et al. (2011, Proposition 5.1.3). We require, therefore, the expansion

$$C(u)=1-\|\mathbf{1}-u\|_D$$

only for $u$ close to $\mathbf{1}$, i.e., for $u\in[u_0,\mathbf{1}]\subset\mathbb{R}^d$ with some $0<u_0<\mathbf{1}\in\mathbb{R}^d$. A copula $C$ with this property will be called a generalized Pareto copula (GPC). These copulas were introduced in Aulbach et al. (2012); tests of whether data are generated by a copula in a $\delta$-neighborhood of a GPC were derived in Aulbach et al. (2018), see Section 6 for the precise definition of this neighborhood. The multivariate generalized Pareto distributions defined in Section 5 show that GPC actually exist for any $D$-norm $\|\cdot\|_D$. The corresponding construction of a generalized Pareto distributed rv also provides a way to simulate data from an arbitrary GPC.
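That $\max(1-\|\mathbf{1}-u\|_D,0)$ need not be a df on all of $[0,1]^d$ can be checked by hand for $\|\cdot\|_D=\|\cdot\|_1$ and $d=3$: the function would assign negative mass to the box $(1/2,1]^3$. The sketch below computes this box mass by inclusion-exclusion over the corners. The choice of norm, dimension and box, as well as the function names, are assumptions made for this illustration.

```python
from itertools import product

def gpc_candidate_l1(u):
    """C(u) = max(1 - ||1 - u||_1, 0), built from the L1-norm."""
    return max(1.0 - sum(1.0 - ui for ui in u), 0.0)

def box_mass(cdf, lower, upper):
    """Mass a df would assign to the box (lower, upper], via inclusion-exclusion."""
    d = len(lower)
    total = 0.0
    for pick in product((0, 1), repeat=d):
        corner = [upper[i] if pick[i] else lower[i] for i in range(d)]
        sign = (-1) ** (d - sum(pick))  # minus sign for each lower endpoint used
        total += sign * cdf(corner)
    return total

# In dimension 3 the candidate assigns negative "mass" to (1/2, 1]^3.
print(box_mass(gpc_candidate_l1, [0.5] * 3, [1.0] * 3))
# In dimension 2 the same box gets nonnegative mass.
print(box_mass(gpc_candidate_l1, [0.5] * 2, [1.0] * 2))
```

A negative box mass rules out a df, which is why the expansion is required only near $\mathbf{1}$.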

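As a consequence, an arbitrary copula $C$ satisfies the following equivalences

$$\begin{aligned}C\in\mathcal{D}(G)&\iff C(u)=1-\|\mathbf{1}-u\|_D+o(\|\mathbf{1}-u\|)\ \text{for some } D\text{-norm } \|\cdot\|_D\\&\iff C \text{ is in its upper tail close to that of a GPC.}\end{aligned}$$

In this case we have $G(x)=\exp(-\|x\|_D)$, $x\le 0\in\mathbb{R}^d$.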

###### Example 3.1.

Take an arbitrary Archimedean copula on $\mathbb{R}^d$

$$C_\varphi(u)=\varphi^{-1}\left(\varphi(u_1)+\dots+\varphi(u_d)\right),$$

where $\varphi$ is a continuous and strictly decreasing function from $(0,1]$ to $[0,\infty)$ such that $\varphi(1)=0$ (see, e.g., McNeil and Nešlehová (2009, Theorem 2.2)). Suppose that

$$p:=-\lim_{s\downarrow 0}\frac{s\varphi'(1-s)}{\varphi(1-s)}\ \text{exists in } [1,\infty). \tag{7}$$

It follows from Charpentier and Segers (2009, Theorem 4.1) that $C_\varphi$ is in its upper tail close to the GPC with corresponding logistic $D$-norm $\|\cdot\|_p$.

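Suppose that the generator function $\varphi$ satisfies with some $s_0\in(0,1)$

$$-\frac{s\varphi'(1-s)}{\varphi(1-s)}=p,\qquad s\in(0,s_0], \tag{8}$$

with $p\in[1,\infty)$. Then $C_\varphi$ is a GPC, precisely,

$$C_\varphi(u)=1-\|\mathbf{1}-u\|_p=1-\left(\sum_{i=1}^d|1-u_i|^p\right)^{1/p},\qquad u\in[1-s_0,1]^d.$$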

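This is readily seen as follows. Condition (8) is equivalent to the equation

$$\left(\log(\varphi(1-s))\right)'=\frac{p}{s},\qquad s\in(0,s_0].$$

Integrating both sides implies

$$\log(\varphi(1-s))-\log(\varphi(1-s_0))=p\log(s)-p\log(s_0)$$

or

$$\log\left(\frac{\varphi(1-s)}{\varphi(1-s_0)}\right)=\log\left(\left(\frac{s}{s_0}\right)^p\right),\qquad s\in(0,s_0],$$

which implies

$$\varphi(1-s)=\frac{\varphi(1-s_0)}{s_0^p}s^p,\qquad s\in[0,s_0],$$

i.e.,

$$\varphi(s)=c(1-s)^p,\qquad s\in[1-s_0,1],$$

with $c:=\varphi(1-s_0)/s_0^p$. But this yields

$$C_\varphi(u)=\varphi^{-1}\left(\varphi(u_1)+\dots+\varphi(u_d)\right)=1-\left(\sum_{i=1}^d(1-u_i)^p\right)^{1/p},\qquad u\in[1-s_0,1]^d.$$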
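The last step can be checked numerically: with $\varphi(s)=c(1-s)^p$ near $1$, the Archimedean form and the logistic GPC form agree. In the sketch below the inverse is $\varphi^{-1}(y)=1-(y/c)^{1/p}$, truncated at zero; the constant $c$, the exponent $p$, the evaluation point and the function names are assumptions made for this illustration.

```python
def phi(s, p, c):
    """Generator phi(s) = c * (1 - s)^p, used here only for s close to 1."""
    return c * (1.0 - s) ** p

def phi_inverse(y, p, c):
    """Inverse of phi on [0, c]; truncated at 0 for larger arguments."""
    return max(1.0 - (y / c) ** (1.0 / p), 0.0)

def archimedean_copula(u, p, c):
    """C_phi(u) = phi^{-1}(phi(u_1) + ... + phi(u_d))."""
    return phi_inverse(sum(phi(ui, p, c) for ui in u), p, c)

def logistic_gpc(u, p):
    """1 - ||1 - u||_p, the GPC form with the logistic D-norm."""
    return 1.0 - sum((1.0 - ui) ** p for ui in u) ** (1.0 / p)

u = (0.90, 0.95, 0.97)  # a point close to (1, 1, 1)
print(archimedean_copula(u, p=2.5, c=3.0), logistic_gpc(u, p=2.5))
```

The constant $c$ cancels, so the result does not depend on it as long as $\varphi(u_1)+\dots+\varphi(u_d)\le c$.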

## 4. Characterization of a GPC

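Next we derive the characteristic property of a GPC. Suppose the rv $U=(U_1,\dots,U_d)$ follows a GPC $C$ with $D$-norm $\|\cdot\|_D$. Then its survival function equals

$$P(U\ge u)=\wr\wr\mathbf{1}-u\wr\wr_D,\qquad u\in[u_0,\mathbf{1}]\subset\mathbb{R}^d,$$

where

$$\wr\wr x\wr\wr_D:=E\left(\min_{1\le i\le d}\left(|x_i|Z_i\right)\right),\qquad x\in\mathbb{R}^d,$$

is the dual $D$-norm function pertaining to $\|\cdot\|_D$ with generator $Z=(Z_1,\dots,Z_d)$, see the proof of Theorem 4.1. Using the equations (10) below it is straightforward to prove that $\wr\wr\cdot\wr\wr_D$ does not depend on the particular choice of the generator $Z$ of $\|\cdot\|_D$. We have, for example,

$$\wr\wr x\wr\wr_1=0,\qquad\wr\wr x\wr\wr_\infty=\min_{1\le i\le d}|x_i|,\qquad x=(x_1,\dots,x_d)\in\mathbb{R}^d.$$

Note that the mapping $\|\cdot\|_D\mapsto\wr\wr\cdot\wr\wr_D$ is not one-to-one, i.e., two different $D$-norms can have identical dual $D$-norm functions.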

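The function $\wr\wr\cdot\wr\wr_D$ is obviously homogeneous:

$$\wr\wr tx\wr\wr_D=t\wr\wr x\wr\wr_D,\qquad t\ge 0.$$

As a consequence, a GPC is excursion stable:

$$P(U\ge\mathbf{1}-tu\mid U\ge\mathbf{1}-u)=\frac{\wr\wr tu\wr\wr_D}{\wr\wr u\wr\wr_D}=t,\qquad t\in[0,1],$$

for $u\ge 0$ close to $0\in\mathbb{R}^d$, provided $\wr\wr u\wr\wr_D>0$.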

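Note that each marginal distribution of a GPC is a lower dimensional GPC as well: If the rv $U=(U_1,\dots,U_d)$ follows the GPC $C$ on $\mathbb{R}^d$, then the rv $U_T:=(U_{i_1},\dots,U_{i_m})$ follows a GPC on $\mathbb{R}^m$, for each nonempty subset $T=\{i_1,\dots,i_m\}\subset\{1,\dots,d\}$. We have

$$P\left((U_{i_1},\dots,U_{i_m})\le v\right)=1-\left\|\sum_{j=1}^m(1-v_j)e_{i_j}\right\|_D,$$

for $v\in[0,1]^m$ close to $\mathbf{1}\in\mathbb{R}^m$, where $e_i$ denotes the $i$-th unit vector in $\mathbb{R}^d$, $1\le i\le d$.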

The characteristic property of a GPC is its excursion stability, as formulated in the next result.

###### Theorem 4.1.

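Let the rv $U=(U_1,\dots,U_d)$ follow a copula $C$. Then $C$ is a GPC iff for each nonempty subset $T=\{i_1,\dots,i_m\}$ of $\{1,\dots,d\}$ the rv $U_T=(U_{i_1},\dots,U_{i_m})$ is exceedance stable, i.e.,

$$P(U_T\ge\mathbf{1}-tu)=tP(U_T\ge\mathbf{1}-u),\qquad t\in[0,1], \tag{9}$$

for $u\ge 0$ close to $0\in\mathbb{R}^m$.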

###### Proof.

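The implication "$\Leftarrow$" in the preceding result is just a reformulation of Falk and Guillou (2008, Proposition 6). The conclusion "$\Rightarrow$" can be seen as follows. We can assume without loss of generality that $T=\{1,\dots,d\}$.

Using induction, it is easy to see that arbitrary numbers $a_1,\dots,a_d\in\mathbb{R}$ satisfy the equations

$$\max(a_1,\dots,a_d)=\sum_{\emptyset\ne T\subset\{1,\dots,d\}}(-1)^{|T|-1}\min_{i\in T}a_i,\qquad\min(a_1,\dots,a_d)=\sum_{\emptyset\ne T\subset\{1,\dots,d\}}(-1)^{|T|-1}\max_{i\in T}a_i. \tag{10}$$

By choosing $a_1=\dots=a_d=1$, the preceding equations imply in particular

$$1=\sum_{\emptyset\ne T\subset\{1,\dots,d\}}(-1)^{|T|-1}. \tag{11}$$

The inclusion-exclusion principle implies for $v\ge 0$ close to $0\in\mathbb{R}^d$

$$\begin{aligned}P(U\ge\mathbf{1}-v)&=1-P\left(\bigcup_{i=1}^d\{U_i\le 1-v_i\}\right)\\&=1-\sum_{\emptyset\ne T\subset\{1,\dots,d\}}(-1)^{|T|-1}P(U_i\le 1-v_i,\ i\in T)\\&=1-\sum_{\emptyset\ne T\subset\{1,\dots,d\}}(-1)^{|T|-1}\left(1-\left\|\sum_{i\in T}v_ie_i\right\|_D\right)\\&=\sum_{\emptyset\ne T\subset\{1,\dots,d\}}(-1)^{|T|-1}\left\|\sum_{i\in T}v_ie_i\right\|_D,\end{aligned}$$

where the last equality follows from (11).

Choose a generator $Z=(Z_1,\dots,Z_d)$ of $\|\cdot\|_D$. From equation (10) we obtain

$$\sum_{\emptyset\ne T\subset\{1,\dots,d\}}(-1)^{|T|-1}\left\|\sum_{i\in T}v_ie_i\right\|_D=E\left(\sum_{\emptyset\ne T\subset\{1,\dots,d\}}(-1)^{|T|-1}\max_{i\in T}(v_iZ_i)\right)=E\left(\min_{1\le i\le d}(v_iZ_i)\right)=\wr\wr v\wr\wr_D.$$

Replacing $v$ by $tv$ and using the homogeneity of $\wr\wr\cdot\wr\wr_D$ yields the assertion. ∎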
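The proof rests on the inclusion-exclusion identities between maxima and minima, $\min_i a_i=\sum_{\emptyset\ne T}(-1)^{|T|-1}\max_{i\in T}a_i$ and its counterpart with the roles of $\max$ and $\min$ exchanged. The following sketch verifies both identities, and the special case with all $a_i=1$, by brute-force enumeration of subsets. The function names and the test vector are assumptions made for this illustration.

```python
from itertools import combinations
from math import comb

def min_via_maxima(a):
    """Sum over nonempty subsets T of (-1)^(|T|-1) * max_{i in T} a_i."""
    total = 0.0
    for size in range(1, len(a) + 1):
        sign = (-1) ** (size - 1)
        for subset in combinations(range(len(a)), size):
            total += sign * max(a[i] for i in subset)
    return total

def max_via_minima(a):
    """Sum over nonempty subsets T of (-1)^(|T|-1) * min_{i in T} a_i."""
    total = 0.0
    for size in range(1, len(a) + 1):
        sign = (-1) ** (size - 1)
        for subset in combinations(range(len(a)), size):
            total += sign * min(a[i] for i in subset)
    return total

def alternating_subset_count(d):
    """Sum over nonempty subsets T of {1, ..., d} of (-1)^(|T|-1)."""
    return sum((-1) ** (size - 1) * comb(d, size) for size in range(1, d + 1))

a = [0.3, 1.2, 0.7, 2.0]
print(min_via_maxima(a), min(a))
print(max_via_minima(a), max(a))
print(alternating_subset_count(len(a)))
```

Applied to $a_i=v_iZ_i$ and averaged over the generator, the first identity turns the alternating sum of $D$-norms into the dual $D$-norm function.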

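If $P(U_T\ge\mathbf{1}-u)>0$, then (9) clearly becomes

$$P(U_T\ge\mathbf{1}-tu\mid U_T\ge\mathbf{1}-u)=t,\qquad t\in[0,1].$$

But $P(U_T\ge\mathbf{1}-u)$ can be equal to zero for all $u$ close to $0\in\mathbb{R}^m$. This is for example the case when the underlying $D$-norm $\|\cdot\|_D$ is $\|\cdot\|_1$. Then $\wr\wr\cdot\wr\wr_D=0$ and, thus, $P(U_T\ge\mathbf{1}-u)=0$ for all $u$ close to $0\in\mathbb{R}^m$, unless $m=1$.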

While the characteristic property of a GPC is its excursion stability, the characteristic property of an extreme value copula $C_G(u)=G\left(\left(G_i^{-1}(u_i)\right)_{i=1}^d\right)$, $u\in(0,1)^d$, which corresponds to a max-stable df $G$, is its max-stability, defined below. By transforming the univariate margins to the standard negative exponential distribution, we can assume without loss of generality that $G$ is an SMS df. In this case we have $G_i^{-1}(u)=\log(u)$, $u\in(0,1]$, and, thus, we obtain the representation of the copula of an arbitrary max-stable df

$$C_G(u)=\exp\left(-\left\|(\log(u_1),\dots,\log(u_d))\right\|_D\right),\qquad u\in(0,1]^d, \tag{12}$$

with some $D$-norm $\|\cdot\|_D$. For a discussion of parametric families of extreme value copulas and their statistical analysis we refer to Genest and Nešlehová (2012).

Equation (12) obviously implies the max-stability of an extreme value copula $C_G$:

$$C_G^n\left(u^{1/n}\right)=C_G(u),\qquad u\in(0,1]^d,\ n\in\mathbb{N}. \tag{13}$$

If, on the other hand, an arbitrary copula $C$ satisfies equation (13), then it is clearly the copula of an SMS df $G$. As a consequence, we have two stabilities of copulas: max-stability and exceedance stability.
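Max-stability is easy to confirm numerically for a concrete extreme value copula. The sketch below uses the representation $C_G(u)=\exp(-\|(\log(u_1),\dots,\log(u_d))\|_D)$ with the logistic norm $\|\cdot\|_p$ and checks that $C_G^n(u^{1/n})$ and $C_G(u)$ agree up to rounding. The choice of the logistic norm, the evaluation point and the function names are assumptions made for this illustration.

```python
import math

def logistic_ev_copula(u, p):
    """C_G(u) = exp(-||(log u_1, ..., log u_d)||_p) for u in (0, 1]^d."""
    norm = sum(abs(math.log(ui)) ** p for ui in u) ** (1.0 / p)
    return math.exp(-norm)

def max_stability_gap(u, p, n):
    """Difference C_G(u^(1/n))^n - C_G(u); zero up to rounding if max-stable."""
    powered = [ui ** (1.0 / n) for ui in u]
    return logistic_ev_copula(powered, p) ** n - logistic_ev_copula(u, p)

u = (0.3, 0.6, 0.9)
for n in (2, 7, 1000):
    print(n, max_stability_gap(u, p=2.5, n=n))
```

The gap vanishes because the norm is homogeneous: $\|\log(u^{1/n})\|_p=\|\log(u)\|_p/n$.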

Let $C$ be an arbitrary copula on $\mathbb{R}^d$. The considerations in this section show that the copula $C_{M_n}$ of $M_n$ converges pointwise to a max-stable copula if, and only if, $C$ is in its upper tail close to that of an excursion stable copula, i.e., to that of a GPC.

The message of the considerations in this section is: If one wants to model the copula of multivariate exceedances above high thresholds, then a GPC is a first option.

## 5. Multivariate Generalized Pareto Distributions

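Let $\{G_\alpha:\alpha\in\mathbb{R}\}$ be the set of univariate max-stable df as defined in Section 1 by the equations preceding (2) and in (2). The family of univariate generalized Pareto distributions (GPD) is the family of univariate excursion stable distributions:

$$H_\alpha(x):=1+\log(G_\alpha(x))=\begin{cases}1-(-x)^\alpha,&-1\le x\le 0,\ \text{if }\alpha>0,\\1-x^\alpha,&x\ge 1,\ \text{if }\alpha<0,\\1-\exp(-x),&x\ge 0,\ \text{if }\alpha=0,\end{cases}\qquad\text{for } G_\alpha(x)>\exp(-1).$$

Suppose the rv $V$ follows the df $H_\alpha$. Then

$$\begin{aligned}P(V>tx\mid V>x)&=t^\alpha\quad\text{for }\begin{cases}t\in[0,1],\ -1\le x<0,&\text{if }\alpha>0,\\t\ge 1,\ x\ge 1,&\text{if }\alpha<0,\end{cases}\\P(V>x+t\mid V>x)&=\exp(-t)\quad\text{for }t\ge 0,\ x\ge 0,\ \text{if }\alpha=0.\end{aligned}$$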
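The excursion stability of the univariate GPD can be verified directly from the survival functions $1-H_\alpha$. The following sketch implements them on the supports stated above and evaluates the conditional exceedance probabilities in the three cases. The function names and the numerical values of $x$, $t$ and $\alpha$ are assumptions made for this illustration.

```python
import math

def gpd_survival(x, alpha):
    """Survival function 1 - H_alpha(x) on the supports used in the text."""
    if alpha > 0:
        if not -1.0 <= x <= 0.0:
            raise ValueError("need -1 <= x <= 0 for alpha > 0")
        return (-x) ** alpha
    if alpha < 0:
        if x < 1.0:
            raise ValueError("need x >= 1 for alpha < 0")
        return x ** alpha
    if x < 0.0:
        raise ValueError("need x >= 0 for alpha = 0")
    return math.exp(-x)

def conditional_exceedance(x, t, alpha):
    """P(V > t*x | V > x) if alpha != 0, and P(V > x + t | V > x) if alpha == 0."""
    target = x + t if alpha == 0 else t * x
    return gpd_survival(target, alpha) / gpd_survival(x, alpha)

print(conditional_exceedance(-0.4, 0.5, 2.0), 0.5 ** 2.0)       # alpha > 0
print(conditional_exceedance(2.0, 3.0, -1.5), 3.0 ** -1.5)      # alpha < 0
print(conditional_exceedance(1.3, 0.7, 0.0), math.exp(-0.7))    # alpha = 0
```

In each case the conditional probability does not depend on the threshold $x$, which is the exceedance stability stated above.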

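For a threshold $s$ and an $x>s$, the univariate GPD takes the form of the following scale and shape family of distributions

$$H_{1/\xi}\left(\frac{x-s}{\sigma}\right)=1-\left(1+\xi\frac{x-s}{\sigma}\right)^{-1/\xi}, \tag{14}$$

where $\sigma>0$ is the scale parameter and $\xi\in\mathbb{R}$ the shape parameter (e.g. Falk et al., 2011, page 35).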

The definition of a multivariate GPD is, however, not unique in the literature. There are different approaches (Rootzén and Tajvidi (2006), Falk et al. (2011)), each one trying to catch the excursion stability of a multivariate rv. The following suggestion might conclude this debate. Clearly, the excursion stability of a rv should be satisfied by its margins and its copula. This is reflected in the following definition.

###### Definition 5.1.

A rv $V=(V_1,\dots,V_d)$ follows a multivariate GPD, if each component $V_i$ follows a univariate GPD (at least in its upper tail), and if the copula $C$ corresponding to $V$ is a GPC, i.e., there exists a $D$-norm $\|\cdot\|_D$ on $\mathbb{R}^d$ and $u_0\in(0,1)^d$ such that

$$C(u)=1-\|\mathbf{1}-u\|_D,\qquad u\in[u_0,\mathbf{1}].$$

As a consequence, each such rv $V$, which follows a multivariate GPD, is exceedance stable and vice versa.

###### Example 5.2.

The following construction extends the bivariate approach proposed by Buishand et al. (2008) to arbitrary dimension. It provides a rv which follows an arbitrary multivariate GPD as in Definition 5.1. Let $Z=(Z_1,\dots,Z_d)$ be the generator of a $D$-norm $\|\cdot\|_D$, with the additional property that each $Z_i\le c$, for some $c\ge 1$. Note that such a generator exists for an arbitrary $D$-norm according to the normed generators theorem for $D$-norms (Falk (2019)). Let the rv $U$ be uniformly distributed on $(0,1)$ and independent of $Z$. Put

$$V=(V_1,\dots,V_d):=\frac{1}{U}(Z_1,\dots,Z_d)=:\frac{1}{U}Z. \tag{15}$$

Then, for each $i=1,\dots,d$,

$$P\left(\frac{1}{U}Z_i\le x\right)=1-\frac{1}{x},\qquad x\text{ large},$$

i.e., $V_i$ follows in its upper tail a univariate standard Pareto distribution, and, by elementary computation, we have

$$P(V\le x)=1-\left\|\frac{1}{x}\right\|_D,\qquad x\text{ large}.$$

The preceding equation implies that the copula of $V$ is a GPC with corresponding $D$-norm $\|\cdot\|_D$. The rv $V$ can be seen as a prototype of a rv which follows a multivariate GPD. This GPD is commonly called simple.

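Choose $V$ as in equation (15) and numbers $\alpha_1,\dots,\alpha_d\in\mathbb{R}$. Then

$$Y:=(Y_1,\dots,Y_d):=\left(H_{\alpha_1}^{-1}\left(1-\frac{1}{V_1}\right),\dots,H_{\alpha_d}^{-1}\left(1-\frac{1}{V_d}\right)\right)=\left(H_{\alpha_1}^{-1}\left(1-\frac{U}{Z_1}\right),\dots,H_{\alpha_d}^{-1}\left(1-\frac{U}{Z_d}\right)\right) \tag{16}$$

follows a general multivariate GPD with margins $H_{\alpha_1},\dots,H_{\alpha_d}$ in its univariate upper tails.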

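With the particular choice $\alpha_1=\dots=\alpha_d=1$ we obtain a standard multivariate GPD

$$Y=-U\left(\frac{1}{Z_1},\dots,\frac{1}{Z_d}\right).$$

Its df is

$$P(Y\le x)=1-\|x\|_D$$

for $x\le 0\in\mathbb{R}^d$, close enough to zero.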

With the particular choice $\alpha_1=\dots=\alpha_d=0$ we obtain a multivariate GPD with Gumbel margins in the upper tails

$$Y=\left(\log(Z_1)-\log(U),\dots,\log(Z_d)-\log(U)\right),$$

where $-\log(U)$ follows the standard exponential distribution on $(0,\infty)$.

Up to a possible location and scale shift, each rv $Y$ which follows a multivariate GPD as defined in Definition 5.1 can in its upper tail be modeled by the rv $Y$ in equation (16). This makes such rv natural candidates, in particular, for simulations of multivariate exceedances above high thresholds.
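The construction $V=Z/U$ with a bounded generator translates directly into a simulation routine. The sketch below uses, as an assumed example in dimension $d=2$, the generator that equals $(1,1)$ with probability $1/2$ and otherwise a random permutation of $(2,0)$; it is bounded by $c=2$, has $E(Z_i)=1$, and its $D$-norm is $\|x\|_D=\frac12\|x\|_\infty+\frac12\|x\|_1$. The empirical df of the simulated sample is then compared with $1-\|1/x\|_D$ at a point with $x_i\ge c$. All function names, the seed and the sample size are choices made for this illustration.

```python
import random

def mixed_generator(rng):
    """Bounded generator in d = 2: (1, 1) w.p. 1/2, else a random permutation of (2, 0)."""
    if rng.random() < 0.5:
        return (1.0, 1.0)
    return (2.0, 0.0) if rng.random() < 0.5 else (0.0, 2.0)

def mixed_norm(x):
    """D-norm of mixed_generator: 0.5 * ||x||_inf + 0.5 * ||x||_1."""
    a = [abs(xi) for xi in x]
    return 0.5 * max(a) + 0.5 * sum(a)

def sample_simple_gpd(draw_generator, n, rng):
    """Draw n copies of V = Z / U with U uniform and independent of Z."""
    sample = []
    for _ in range(n):
        u = 1.0 - rng.random()  # lies in (0, 1], so the division is safe
        z = draw_generator(rng)
        sample.append(tuple(zi / u for zi in z))
    return sample

def empirical_df(sample, x):
    """Fraction of sample points v with v <= x componentwise."""
    hits = sum(1 for v in sample if all(vi <= xi for vi, xi in zip(v, x)))
    return hits / len(sample)

rng = random.Random(2024)  # fixed seed, for reproducibility only
sample = sample_simple_gpd(mixed_generator, 100_000, rng)
x = (4.0, 5.0)
print(empirical_df(sample, x), 1.0 - mixed_norm([1.0 / xi for xi in x]))
```

Other GPD margins could be obtained by applying $H_{\alpha_i}^{-1}(1-1/V_i)$ componentwise, provided the corresponding component of $V$ is positive.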

## 6. δ-Neighborhoods of GPC

A major problem with the construction in (15) is the additional boundedness condition on the generator $Z$. This is, for example, not given in the case of the logistic $D$-norm $\|\cdot\|_p$ with $p\in(1,\infty)$ or the Hüsler-Reiss $D$-norm. From the normed generators theorem in Falk (2019) we know that bounded generators exist, but, to the best of our knowledge, they are unknown in both cases.

In this section we drop this boundedness condition and show that the construction (15) provides a copula which is in a particular neighborhood of a GPC, called a $\delta$-neighborhood. We are going to define this neighborhood next.

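Denote by $R:=\{t\in[0,1]^d:\|t\|_1=1\}$ the unit sphere in $[0,\infty)^d$ with respect to the norm $\|x\|_1=\sum_{i=1}^d|x_i|$, $x\in\mathbb{R}^d$. Choose an arbitrary copula $C$ on $\mathbb{R}^d$ and put for $t\in R$

$$C_t(s):=C(\mathbf{1}+st),\qquad s\le 0.$$

Then $C_t$ is a univariate df on $(-\infty,0]$, and the copula $C$ is obviously determined by the family

$$\mathcal{P}(C):=\{C_t:t\in R\}$$

of univariate spectral df $C_t$. The family $\mathcal{P}(C)$ is the spectral decomposition of $C$; cf. Falk et al. (2011, Section 5.4). A copula $C$ is, consequently, in $\mathcal{D}(G)$ with $G(x)=\exp(-\|x\|_D)$ iff its spectral decomposition satisfies

$$C_t(s)=1+s\|t\|_D+o(s),\qquad t\in R,$$

as $s\uparrow 0$. The copula $C$ is by definition in the $\delta$-neighborhood of the GPC $C_D$ with $D$-norm $\|\cdot\|_D$ if their upper tails are close to one another, precisely, if

$$1-C_t(s)=\left(1-C_{D,t}(s)\right)\left(1+O(|s|^\delta)\right)=|s|\,\|t\|_D\left(1+O(|s|^\delta)\right) \tag{17}$$

as $s\uparrow 0$, uniformly for $t\in R$. In this case we know from Falk et al. (2011, Theorem 5.5.5) that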

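$$\sup_{x\in(-\infty,0]^d}\left|C^n\left(\mathbf{1}+\frac{x}{n}\right)-G(x)\right|=O\left(n^{-\delta}\right), \tag{18}$$

where $G(x)=\exp(-\|x\|_D)$, $x\le 0\in\mathbb{R}^d$.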

Under additional differentiability conditions on $C_t(s)$ with respect to $s$, the reverse implication also holds; cf. Falk et al. (2011, Theorem 5.5.5). Therefore, the $\delta$-neighborhood of a GPC, roughly, collects those copulas with a polynomial rate of convergence for maxima.

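Condition (17) can also be formulated in the following way:

$$1-C(u)=\left(1-C_D(u)\right)\left(1+O(\|\mathbf{1}-u\|^\delta)\right)=\|\mathbf{1}-u\|_D\left(1+O(\|\mathbf{1}-u\|^\delta)\right)$$

as $u\to\mathbf{1}\in\mathbb{R}^d$, uniformly for $u\in[0,1]^d$, where $\|\cdot\|$ is an arbitrary norm on $\mathbb{R}^d$.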
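A simple case in which the neighborhood condition can be checked by hand is the independence copula $C(u)=\prod_i u_i$: expanding the product gives $1-C(\mathbf{1}+st)=|s|\,\|t\|_1(1+O(|s|))$ for $t$ in the unit sphere with respect to $\|\cdot\|_1$, which is the condition with $\delta=1$ and $D$-norm $\|\cdot\|_1$. The sketch below evaluates the relative error $(1-C_t(s))/(|s|\|t\|_D)-1$ numerically and divides it by $|s|$. The choice of copula and direction, as well as the function names, are assumptions made for this illustration.

```python
import math

def independence_copula(u):
    """C(u) = u_1 * ... * u_d."""
    return math.prod(u)

def l1_norm(t):
    """||t||_1 = sum_i |t_i|."""
    return sum(abs(ti) for ti in t)

def relative_tail_error(copula, d_norm, t, s):
    """(1 - C_t(s)) / (|s| * ||t||_D) - 1 for s < 0, where C_t(s) = C(1 + s*t)."""
    spectral_value = copula([1.0 + s * ti for ti in t])
    return (1.0 - spectral_value) / (abs(s) * d_norm(t)) - 1.0

t = (0.5, 0.3, 0.2)  # a direction with ||t||_1 = 1
for s in (-0.1, -0.01, -0.001):
    err = relative_tail_error(independence_copula, l1_norm, t, s)
    print(s, err, err / abs(s))
```

The ratio in the last column should stay bounded as $s\uparrow 0$, consistent with a relative error of order $|s|^\delta$ for $\delta=1$.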

###### Example 6.1.

Choose $u=(u_1,\dots,u_d)\in(0,1)^d$ and put for $t\in[0,1]$

$$FI(t,u):=E\left(\sum_{i=1}^d 1(U_i>1-tu_i)\,\Big|\,\sum_{i=1}^d 1(U_i>1-u_i)>0\right).$$

With $t=1$, this is the fragility index, introduced by Geluk et al. (2007) to measure the stability of the stochastic system $\{U_1,\dots,U_d\}$. The system is called stable if $FI(1,u)$ is close to one, otherwise it is called fragile. The asymptotic distribution of $N_u:=\sum_{i=1}^d 1(U_i>1-u_i)$, given $N_u>0$, was investigated in Falk and Tichy (2011); Falk and Tichy (2012).

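If $U$ follows a GPC with corresponding $D$-norm $\|\cdot\|_D$, we obtain for $u$ close enough to zero

$$FI(t,u)=\frac{\sum_{i=1}^dP(U_i>1-tu_i)}{P\left(\sum_{j=1}^d 1(U_j>1-u_j)>0\right)}=\frac{\sum_{i=1}^d tu_i}{1-P(U\le\mathbf{1}-u)}=t\,\frac{\|u\|_1}{\|u\|_D}.$$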

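Writing

$$\frac{\|u\|_1}{\|u\|_D}=\frac{1}{\left\|\frac{u}{\|u\|_1}\right\|_D}$$

implies that there is a least favorable direction $r_0\in R$ with

$$\|r_0\|_D=\min_{r\in R}\|r\|_D.$$

A vector $u$ with $u=sr_0$, $s>0$, maximizes the fragility index. For arbitrary $d\ge 2$ and $\|\cdot\|_D=\|\cdot\|_p$, $p\in[1,\infty]$, one obtains for example $r_0=(1/d,\dots,1/d)$ with constant entry $1/d$ and

$$FI(t,u)=t\,\frac{d}{d^{1/p}}.$$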
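For a GPC the fragility index near zero reduces to $FI(t,u)=t\|u\|_1/\|u\|_D$, so it can be computed from the two norms alone. The sketch below evaluates this ratio for the logistic norm and compares it, for a vector with constant entries, with the closed form $t\,d/d^{1/p}$. The function names and the numerical values are assumptions made for this illustration; finite $p\ge 1$ is assumed.

```python
def l1_norm(u):
    """||u||_1 = sum_i |u_i|."""
    return sum(abs(ui) for ui in u)

def logistic_norm(u, p):
    """||u||_p = (sum_i |u_i|^p)^(1/p) for finite p >= 1."""
    return sum(abs(ui) ** p for ui in u) ** (1.0 / p)

def fragility_index_gpc(t, u, d_norm):
    """FI(t, u) = t * ||u||_1 / ||u||_D, valid for a GPC and u close to zero."""
    return t * l1_norm(u) / d_norm(u)

d, p, t = 4, 2.0, 0.5
u_const = [0.01] * d           # constant entries: the least favorable direction
u_skewed = [0.01, 0.002, 0.001, 0.0005]

print(fragility_index_gpc(t, u_const, lambda v: logistic_norm(v, p)), t * d / d ** (1.0 / p))
print(fragility_index_gpc(t, u_skewed, lambda v: logistic_norm(v, p)))
```

For the skewed vector the index is smaller than for the constant one, in line with the constant direction maximizing the fragility index under a logistic norm.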

If $U$ follows a copula which is in a $\delta$-neighborhood of a GPC with $D$-norm $\|\cdot\|_D$, then we obtain the representation

$$FI(t,u)=t\,\frac{\|u\|_1}{\|u\|_D}\left(1+O(\|u\|^\delta)\right)\qquad\text{for } u\to 0\in\mathbb{R}^d.$$

If we replace for example by , where , , is the standard Pareto df, then we obtain for the fragility index

 FI(t,x)=E(d∑i=11(Xi>t