# Penultimate Analysis of the Conditional Multivariate Extremes Tail Model

Models for extreme values are generally derived from limit results, which are meant to be good enough approximations when applied to finite samples. Depending on the speed of convergence of the process underlying the data, these approximations may fail to represent subasymptotic features present in the data, and thus may introduce bias. The case of univariate maxima has been widely explored in the literature, a prominent example being the slow convergence to their Gumbel limit of Gaussian maxima, which are better approximated by a negative Weibull distribution at finite levels. In the context of subasymptotic multivariate extremes, research has only dealt with specific cases related to componentwise maxima and multivariate regular variation. This paper explores the conditional extremes model (Heffernan and Tawn, 2004) in order to shed light on its finite-sample behaviour and to reduce the bias of extrapolations beyond the range of the available data. We identify second-order features for different types of conditional copulas, and obtain results that echo those from the univariate context. These results suggest possible extensions of the conditional tail model, which will enable it to be fitted at less extreme thresholds.

There are no comments yet.

## Authors

• 1 publication
• 13 publications
• 12 publications
05/08/2018

### Extremal properties of the extended skew-normal distribution

The skew-normal and related families are flexible and asymmetric paramet...
04/27/2019

### Tail models and the statistical limit of accuracy in risk assessment

In risk management, tail risks are of crucial importance. The assessment...
12/05/2019

### Outlier detection and a tail-adjusted boxplot based on extreme value theory

Whether an extreme observation is an outlier or not, depends strongly on...
11/06/2021

### Parametric and nonparametric probability distribution estimators of sample maximum

This study concerns probability distribution estimation of sample maximu...
11/23/2018

### Selected Methods for non-Gaussian Data Analysis

The basic goal of computer engineering is the analysis of data. Such dat...
07/22/2019

### Multiple block sizes and overlapping blocks for multivariate time series extremes

Block maxima methods constitute a fundamental part of the statistical to...
10/14/2020

### An Extreme Value Bayesian Lasso for the Conditional Bulk and Tail

We introduce a novel regression model for the conditional bulk and condi...
##### This week in AI

Get the week's most popular data science and artificial intelligence research sent straight to your inbox every Saturday.

## 1 Introduction

Large-scale catastrophic events can have a major impact on physical infrastructure and society. Multivariate extreme value models help to capture the structure of such events and are used to extrapolate measures of combined risk beyond the range of the available data. Applications include river flooding (Katz et al., 2002; Keef et al., 2009a, b; Asadi et al., 2015), extreme rainfall (Coles and Tawn, 1996; Süveges and Davison, 2012; Huser and Davison, 2014), wave height and extreme sea surge (de Haan and de Ronde, 1998) and high concentrations of air pollutants (Heffernan and Tawn, 2004; Eastoe and Tawn, 2009). Many applications have also contributed to the improvement of risk assessment in finance (Poon et al., 2003; Hilal et al., 2011, 2014).

All these approaches can be characterised in terms of their ability to model probabilities in the joint tail. Specifically, for random variables

and with marginal distributions and , a basic summary of extremal dependence is (Coles et al., 1999)

 χ=limp→1Pr{F2(X2)>p∣F1(X1)>p}. (1.1)

In the asymptotic dependence case, and the largest values of and can occur together, whereas in the asymptotic independence case, and these largest values cannot occur simultaneously. Many extreme value models can only capture asymptotic dependence, but this excludes important cases of asymptotic independence; for example, for all Gaussian copulas with correlation . In the case when , the key to determining the form of the extremal dependence is the rate at which the conditional probability in (1.1) tends to zero, given by Coles et al. (1999) as a complementary extremal dependence measure , which for Gaussian copulas gives .

Modelling approaches proposed by Ledford and Tawn (1997), Heffernan and Tawn (2004), Wadsworth et al. (2017) and Huser and Wadsworth (2018) cover both asymptotic dependence and asymptotic independence. In all cases, an asymptotic form for the joint tail is specified above a high threshold and for inference and extrapolation of extreme probabilities the approach relies on the assumption that the model holds exactly. To illustrate the implications of this assumption, consider the case of asymptotic dependence, in which for all above the selected threshold once the model and threshold are selected, so there is no scope to allow for a penultimate value for these probabilities that converges to as

. Similar issues arise when estimating

.

The focus of this paper is to try to establish penultimate models that are appropriate in the most flexible existing model for multivariate extremes, that of conditional extremes introduced by Heffernan and Tawn (2004).

In our analysis of conditional extremes, we focus on the bivariate case to simplify the notation; extension to the general multivariate case is straightforward (Heffernan and Tawn, 2004)

. The conditional model was originally presented for marginally Gumbel distributed random vectors, but

Keef et al. (2013) showed that formulation on the Laplace scale is simpler when positive or negative dependence is possible. Thus we first transform our variables to random variables with Laplace margins via the probability integral transform,

 X=sign{1−2F1(X1)}log[1−{|1−2F1(X1)|}],

and similarly for , preserving the dependence structure through the copula, according to Sklar’s (1959) representation theorem.

The Heffernan and Tawn (2004) approach presupposes that there exist normalising functions and such that

 Pr{Y−a(X)b(X)≤z,X−u>x∣∣∣X>u}→H(z)exp(−x),u→∞, (1.2)

where

is a non-degenerate distribution function with no mass at infinity. Under mild assumptions on the joint distribution of

, Heffernan and Resnick (2007) show that the normalising functions are of the form

 a(x)=xLa(x),b(x)=xβLb(x), β<1, (1.3)

with and two slowly-varying functions, i.e., satisfying for any as . By considering a broad class of parametric copula models, Heffernan and Tawn derive parametric forms for and that yield a parsimonious model and cover a broad range of extremal dependence structures not described by models arising from the standard theory for multivariate extremes, namely

 a0(x) =αx,α∈[−1,1], (1.4) b0(x) =xβ,β∈(−∞,1).

This corresponds to approximating the slowly-varying functions by and . Setting is equivalent to setting for any positive constant

, with the change in norming absorbed into the variance of

. If and , then are asymptotically dependent with , and otherwise they are asymptotically independent. To form a statistical model, Heffernan and Tawn (2004) assume that limit (1.2) holds exactly above some finite with norming functions of the form (1.4).

Papastathopoulos and Tawn (2016) found inverted max-stable processes for which the norming functions of the form (1.4) are inadequate and more general functions and of the form (1.3) are required. It is natural therefore to question whether we can extend the functions and to give better finite approximations.

Our main contribution is to derive the subasymptotic behaviour of the conditional tail approach for three copulas that span diverse extremal dependence structures. Our objective is to identify a second-order behaviour across conditional copulae from which we can derive a general penultimate conditional model for extremes. The core value of the work is to suggest in Section 5 new penultimate forms for and that can be used to broaden the simple parametric family (1.4), and help reduce extrapolation bias. This is particularly important as small differences in the parameter estimates at finite levels can result in large differences in extreme risk measures.

Another aspect of the subasymptotic behaviour is the limiting conditional independence of with , given , as . At a subasymptotic level, the distribution of is exponential, but that of depends on . This subasymptotic distribution, , say, tends to as , so the independence property is lost in this penultimate model which might therefore enable it to provide a better fit at a lower threshold than limit models.

Such subasymptotic behaviour is also required when conducting simulation studies in order to assess the performance of methods to fit the conditional model, as the estimates of and for can misleadingly suggest a poor fit as it is and for that are being estimated.

Although the focus of this paper is the extremal dependence structure, we outline the established convergence of the margins to their limiting distributions in Section 2 to set a framework for our extremal dependence analysis. The rest of the paper is organised as follows. In Section 2, we review penultimate analyses in bivariate contexts. Section 3 introduces the framework used to study the penultimate behaviour of the conditional tail model. In Section 4, we consider three copulas, with proofs of results given in the Appendix. In Section 5

, we summarise our findings through penultimate parametric models which extend the Heffernan–Tawn class of norming functions.

## 2 Existing penultimate analyses

The founding theorem in the theory of univariate extreme values characterises the limiting distribution of the maximum of a series of independent and identically distributed univariate random variables , suitably normalised by sequences and , namely, as ,

 Pr(max{X1,…,Xn}−bnan≤x)=Fn(anx+bn)→Gξ(x)=exp{−(1+ξx)−1/ξ+},

with . In practice the limit distribution is assumed to be exact for large , by taking , with , and parameters to be estimated. This leads to the generalised extreme value distribution being proposed as a parametric model for inference and extrapolation based on block maxima (Coles, 2001).

Fisher and Tippett (1928) raised the question of the accuracy of this approximation in the Gaussian case, i.e., . Then is Gumbel, i.e., , but they showed that at finite levels is better approximated by a generalised extreme value distribution with shape parameter , a negative Weibull distribution.

The approximation of by in extreme value applications is of concern when the convergence of the former to the latter is particularly slow, as any inaccuracy in the parameter estimates amplifies and introduces bias in extrapolations.

The study of rates of convergence of towards zero has a long history. The first attempts to characterise this for any value of date to Gomes (1984, 1994) and unpublished work of Smith (1987). Assuming the existence of the density and its derivative , we define the reciprocal hazard function , and we know from the von Mises (1936) conditions that

 ξ=limx→xFh′(x), (2.1)

where is the upper support point of . From (2.1), it is natural to consider the sub-asymptotic shape parameter , with . With this sub-asymptotic , Gomes and Pestana (1987) and Gomes (1994) show that

 Fn(anx+bn)−Gξ(x)=O(ξn−ξ),n→∞, (2.2)

and give the structure of the remainder term on the right-hand side for a broad class of distribution functions , including (Anderson, 1971). For this class, Gomes (1994) also gives

 Fn(anx+bn)−Gξn(x)=O{(ξn−ξ)2},n→∞, (2.3)

so replacing by can greatly improve the rate of convergence.

To illustrate (2.2) and (2.3), consider the standard Gaussian case and . Mills’ ratio can be used to derive the approximation to the survival distribution function , from which we get the approximate reciprocal hazard function and its derivative . We also have (Leadbetter et al., 1983, p. 14), from which we can conclude that

 ξn=h′(bn)=−1/(2logn)+O{1/(logn)2}. (2.4)

The convergence rate (2.2) is using the limit shape parameter, improving to when is replaced by , as in (2.3). Here the penultimate shape parameter for finite , in agreement with the observation, due to Fisher and Tippett (1928), that a negative Weibull-type distribution yields better approximations at finite levels.

The first study that has discussed the penultimate properties of bivariate maxima was Bofinger and Bofinger (1965), who derive the penultimate correlation of componentwise maxima in bivariate Gaussian samples of sizes ; the limit correlation was shown to be zero by Sibuya (1960), corresponding to in (1.1). Bofinger (1970) extends this analysis to the bivariate gamma and Morgenstern (1956) distributions, and sheds light on the form of the penultimate correlation for small . This work has been further explained for identically distributed random variables and by Ledford and Tawn (1996), who derive a model for joint tails of these variables using penultimate properties of as tends to the marginal upper endpoint; this results in a model that smoothly connects perfect dependence and complete independence. In exponential margins, Wadsworth and Tawn (2013) and de Valk (2016) extended this model to allow for different decay rates along different rays emanating from the origin, with these rays corresponding to power relationships in Pareto margins. There is also very strong parallels with work on inference for the spectral measure of multivariate regular variation with second-order features influencing estimates (Resnick, 2007; Cai et al., 2011). Despite these developments, there are currently no penultimate results for conditional multivariate extremes to parallel the univariate penultimate theory.

## 3 Sub-asymptotic conditional extremes

The conditional limit (1.2) encapsulates the limit conditional independence of and the excesses for large . We first focus on the marginal limiting behaviour of . According to Heffernan and Resnick (2007), Resnick and Zeber (2014) and Wadsworth et al. (2017), the conditioning in (1.2) can be modified to

 limx→∞Pr{Y−a(x)b(x)≤z∣∣∣X=x}=H(z), (3.1)

where , and the distribution are the same as in (1.2).

In our penultimate analysis, our goal is to characterise the behaviour of the remainder terms, defined by

 a(x)−a0(x)∼ra(x),b(x)−b0(x)∼rb(x),x→∞,

using the notation of (1.4). Specifically, we consider the second-order normalisation for and , with

 a1(x) =a0(x)+ra(x), (3.2) b1(x) =b0(x)+rb(x).

With these penultimate forms, we are able to refine the normalisation of in (3.1), yielding the subasymptotic conditional distribution

 Pr{Y−a1(X)b1(X)≤z∣∣∣X=x}=Hx(z),x>u, (3.3)

with as .

Heffernan and Tawn (2004) give the rate of convergence of the conditional distribution for data arising from various copula models in terms of the order of convergence towards zero, as , of

 Pr{Y−a0(X)b0(X)≤z∣∣∣X=x}−H(z), (3.4)

with on the Gumbel scale. We consider how much we can improve this when using the penultimate norming, by studying the rate of convergence to zero of

 Pr{Y−a1(X)b1(X)≤z∣∣∣X=x}−H(z). (3.5)

We also want to quantify the subasymptotic remainder, using

 supx>u∣∣ ∣∣Pr{Y−a1(x)b1(x)≤z∣∣∣X=x}−Hx(z)∣∣ ∣∣, (3.6)

along the lines of (2.3) in the univariate context. In all cases we will present these rates on a scale that is invariant to the marginal choice, by converting to a return period , where .

## 4 Examples

We first consider the Gaussian copula, with correlation parameter , which has and , for , for which the convergence towards the limit (1.2) was reported by Heffernan and Tawn to be the slowest in the examples they considered, namely . Second, we consider another example with asymptotic independence; the inverted logistic dependence structure (a special case in the class of inverted max-stable distributions studied by Papastathopoulos and Tawn (2016)). This distribution has and where is the logistic parameter with corresponding to independence. This distribution has a faster convergence rate than the Gaussian copula, and we shall see that the subasymptotic can have finite support depending on the precise value of the dependence parameter of this copula. Heffernan and Tawn reported the rate to be in this case. Third, the max-stable copula with logistic dependence structure represents the case of asymptotic dependence, and , and is a situation where convergence is . Our penultimate forms for and reflect these different rates of convergence, with major, minor and no changes found relative to and respectively.

### 4.1 Gaussian distribution

Let

have a bivariate standard normal distribution with correlation

and let be its marginal transform to the Laplace scale,

 X={−log2{1−Φ(V)},V>0,log2Φ(V),V≤0,

and similarly for as a function of . The dependence structure of is an example where , provided , and , .

###### Theorem 4.1

For with a Gaussian dependence structure, we have the ultimate and penultimate normings (1.4) and (3.2) for , with large, are

 a0(x) =sign(ρ)ρ2x, b0(x) =x1/2, (4.1) a1(x) =sign(ρ)ρ2x+(1−ρ2)2logx, b1(x) =x1/2−1/(4x),

i.e., and .

The limit distribution in (3.1) is a centred Gaussian with variance , and the penultimate distribution (3.3) is

 Hx(z)∼N⎧⎨⎩0,2ρ2(1−ρ2)(1+logx2√2ρ2x)2⎫⎬⎭.

If we write , the rate of convergence to the limit distribution is using the ultimate norming in (3.4), which is not improved using the penultimate norming in (3.5). The subasymptotic remainder (3.6) is .

Note that if and , then the results of Theorem 4.1 also hold for all , i.e., including , with the key change being that the variances of and no longer have the term.

The penultimate norming , can be used to assess the goodness-of-fit at a finite level. By replacing by the threshold in the log-term of (4.1), we derive a second-order approximation for of the form

 α1=sign(ρ)ρ2+(1−ρ2)logu2u. (4.2)

Similarly, we derive a second-order approximation for , i.e.,

 β1=12−14u. (4.3)

Figure 1 illustrates convergence of the second-order approximations for and towards their limits when , and for values of corresponding to the up to the

Laplace quantile. Convergence is very slow, so it makes sense to consider second-order approximations when measuring the adequacy of finite-sample estimates. In order to give an idea of the amount of data needed to reach such quantiles, we change the scale of the abscissa to the return period scale, using

 11−FL(x)×1nY,

with the Laplace distribution function, any quantile on the Laplace scale and the number of observations per year. Figure 1 shows that even with the equivalent of more than years of daily data, the location and scale parameters differ significantly from their asymptotic values.

### 4.2 Inverted logistic distribution

In this section, we consider the bivariate random vector with inverted logistic distribution and Laplace margins (Ledford and Tawn, 1997; Papastathopoulos and Tawn, 2016). Its joint survival distribution function is

 Pr(X>x,Y>y)=exp⎡⎢ ⎢⎣−V⎧⎪ ⎪⎨⎪ ⎪⎩−1log(12e−x),−1log(12e−y)⎫⎪ ⎪⎬⎪ ⎪⎭⎤⎥ ⎥⎦,x,y>0,

where , , is the exponent measure function of the logistic distribution. Here, and for , with corresponding to complete dependence and corresponding to independence.

###### Theorem 4.2

Let have a bivariate inverted logistic distribution with dependence parameter and Laplace margins. Then the ultimate and penultimate normings (1.4) and (3.2) for , with large, are

 a0(x) ≡0, b0(x) =x1−γ, a1(x) ≡−log2, b1(x) =x1−γ,

so there is no difference in the penultimate form for from .

The limit distribution in (3.1) is Weibull, specifically , and the penultimate distribution in (3.3) is such that

 −log¯¯¯¯¯Hx(z)=γz1/γ+⎧⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪⎨⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪⎩(1−γ)(1−log2)xz1/γ−γ(1−γ)2xz2/γ,0<γ<2/3,1−log23xz3/2−19xz3−(log2)28x(4−13log23)z−3/2,γ=2/3,−(log2)26γ2(1−γ){6γ+(1−8γ)log2}x3γ−3z1/γ−3,2/3<γ<1. (4.4)

When , has finite support

 [0,zHx]=[0,{x1−γ+1−log2γ}γ]⟶R+,x→∞,

and as . When , has finite support, with first-order expansion

 [(12−13log2)1/3(log24)2/3x−1/3,91/3x2/3]→R+,x→∞,

and as . When , has finite support, with first-order expansion

 [(log2)2/3γ(1−γ6)1/3{6γ+(1−8γ)log2}1/3xγ−1,+∞)→R+,x→∞.

If we write , the rates of convergence to the limit distribution are using the ultimate norming in (3.4) and using the penultimate norming in (3.5). The subasymptotic remainder (3.6) behaves like

 O{(logn)α−2}, α∈(0,1/2], O{(logn)3α−3}, α∈(1/2,2/3), (4.5) O{(logn)−4/3}, α=2/3, O{(logn)−1}, α∈(2/3,1).

Figure 2 illustrates the convergence of to for , 2/3 and 3/4, and corresponding to 0.8, 0.9, 0.95 and 0.99 quantiles. It appears that the adequacy of the approximation depends very strongly on .

### 4.3 Logistic distribution

Let have a bivariate logistic distribution with Laplace margins,

 Pr(X≤x,Y≤y)=exp⎡⎢ ⎢⎣−V⎧⎪ ⎪⎨⎪ ⎪⎩−1log(1−12e−x),−1log(1−12e−y)⎫⎪ ⎪⎬⎪ ⎪⎭⎤⎥ ⎥⎦,x,y>0,

with

 V(z,w)=(z−1/γ+w−1/γ)γ,γ∈(0,1].

In the following, we do not consider the case corresponding to the trivial situation of complete independence. The degree of asymptotic dependence is .

###### Theorem 4.3

Let have a bivariate inverted logistic distribution with dependence parameter and Laplace margins. Then, the ultimate normings (1.4) for , with large, are

 a0(x)=x,b0(x)=1,

and penultimate normings (3.2) are identical to , .

## 5 Penultimate model form

The three examples of copula studied in Section 4 all have very different extremal dependence features, yet all are in the following, rather general, class of penultimate forms for the norming functions

 a1(x)=(α+La(x)xγa)x,logb1(x)=(β+Lb(x)xγb)logx

where and are from of the first order norming functions and respectively, , and , are functions that are slowly varying at . For statistical modelling purposes we need to be more precise about the slowly varying function, and as in past practice, e.g., Ledford and Tawn (1996), we fix these functions above some threshold to be constants, i.e.,

 La(x)=δa,Lb(x)=δb,x>u.

In practice this may still be rather over-parameterised, given how difficult second-order effects are to estimate, and so in practice it may be sufficient to fix . Choices like this have been used in penultimate modelling in univariate cases (Hall and Welsh, 1985).

Next consider the choice of the limit distribution . Although our examples have shown we can get improved penultimate forms for , using , these improvements do not always lead to faster convergence to the limit distribution for certain ranges of parameters of the underlying model. Furthermore, as the current approach is to use a non-parametric approach to estimate , it is suggested that this is continued and is assumed independent of .

In conclusion, based on a series of examples, we have suggested a class of penultimate models for the Heffernan and Tawn (2004) conditional extremes model. This new class of models improves convergence rates relative to the limit model, and thus is likely to lead to better statistical inference. The extension is parsimonious with, in its most simple form, the introduction of two additional parameters.

## Acknowledgements

This research was partially supported by the Swiss National Science Foundation.

## Appendix A Appendix

### a.1 Proof of Theorem 4.1

The details of the proof for the asymptotic quantities , and can be found in Heffernan and Tawn (2004). We follow a similar path to derive the penultimate approximations and use Mill’s ratio to get a tail approximation for and on the normal and Laplace scales, respectively. Since the Laplace distribution is symmetric, we focus on the right tail:

 x ∼−log{2φ(v)v}=−log2+logv+12log(2π)+12v2, (A.1) v ∼√2x,

for large and . In order to get a second-order approximation for , we define a small quantity and set ; plugging this back in (A.1), we get

 x ∼−log2+log{√2x(1+ε)}+12log(2π)+12{√2x(1+ε)}2 ∼−log2+12log(√2x)+12log(2π)+x+2xε, (A.2)

and this yields

 v=√2x+2log2−log(2x)−log(2π)2√2x+O{(logx)2x3/2}. (A.3)

We want the conditional distribution of to be well-behaved in its upper tail when . We have . On the Laplace scale, is transformed to

 Y=a1(x)+b1(x)Z, (A.4)

with and norming functions to be determined, and a random variable with a fixed distribution non-degenerate at . We derive these penultimate norming functions by writing in (A.4), and using (A.3). We get the second-order approximation by first expanding

 W∼ √2ρ2x(1+ε)+2b(x)Z−logπ+log{ρ2x(1+ε)+b(x)Z}2√2ρ2x(1+ε)+2b(x)Z (A.5) ∼ ρ√2x(1+ε){1+b(x)Z2ρ2x(1+ε)} −[logπ+log{ρ2x(1+ε)}+b(x)Zρ2x(1+ε)]12√2ρ2x(1+ε){1−b(x)Z2ρ2x(1+ε)}.

We can use this expression for in , in which we keep only the terms that are not functions of , as they are disconnected from , and we get

Expanding further, we arrive at

 ρ√2x(1+ε2)−log(ρ2x)+ε2√2ρ2x(1−ε2)−ρ√2x+ρlogx2√2x+O(x−1/2),

and cancellation of the leading term yields

 ε=(1−ρ2)logx2ρ2x, (A.6)

or equivalently .

The penultimate scale function stems from the -terms in (A.5), namely

 b(x)Zρ√2x(1+ε)+log{ρ2x(1+ε)}b(x)Z2{2ρ2x(1+ε)}3/2−b(x)Z{2ρ2x(1+ε)}3/2, (A.7)

which we expand as

 b(x) ∼ρ√2x(1+ε)[1+log{x(1+ε)}4ρ2x(1+ε)−12ρ2x(1+ε)]−1 ∼ρ√2x(1+ε2){1−logx4ρ2x(1−ε)} ∼ρ√2x−logx2ρ√2x+ρ√2xε2.

Substituting into (A.6) gives

 b(x)∼ρ√2x−logx2ρ√2x+(1−ρ2)logx2ρ√2x∼x1/2−1/(4x)=b1(x).

We now compute the penultimate distribution by substituting the expression for in (A.6), writing , with

 ρ√2x(1−ρ2)logx4ρ2x−log(ρ2x)2√2ρ2x−(1−ρ2)logx4ρ2x√2ρ2x+(1−ρ2)log(ρ2x)logx4√2ρ2x(2ρ2x)+ρlogx2√2x,

which equals

 −log(ρ)√2ρ2x−(1−ρ2)log(x)2(2ρ2x)3/2+(1−ρ2)(logx)24(2ρ2x)3/2+(1−ρ2)log(ρ)2(2ρ2x)3/2∼−log(ρ)√2ρ2x=A(x), (A.8)

and

 b(x)Z√2ρ2x{(1−ε/2)+log(ρ2x)+ε2√2ρ2x(1−ε)−1−3ε/22ρ2x},

which equals

 b(x)Z√2ρ2x{1−ε2+logx2√2ρ2x(1−ε)+ε2√2ρ2x+logρ√2ρ2x(1−ε)−12ρ2x+O(ε2)}=Z+Zlogx2√2ρ2x+Zlogρ√2ρ2x+Z×O(logxx). (A.9)

Taking the leading and penultimate terms in (A.8) and (A.9), which corresponds to setting , we find that is a centred normal distribution with variance

 2ρ2(1−ρ2)(1+logx2√2ρ2x)2,

i.e., it has larger variance than the asymptotic found in Heffernan and Tawn (2004). If we consider the antepenultimate terms in (A.8) and (A.9), we get as

 N⎧⎨⎩−log(ρ)√2ρ2x,2ρ2(1−ρ2)(1+logx+2logρ2√2ρ2x)2⎫⎬⎭.

We now give the order of convergence of (3.6) with the penultimate approximations and that we derived. After a marginal transform in order to get , we get the rate of convergence for , namely , which improves on the version of with first-order approximation for the normalising functions, for which the rate of convergence is of order . Taking improves even more on , as its rate of convergence in (3.6) is of order .

### a.2 Proof of Theorem 4.2

We start by computing the conditional survival distribution of for large and deriving a tail approximation to it. We have, for non-negative and ,

 Pr(Y>y ∣X=x) =2ex×∂Pr(X>x,Y>y)∂x =−2exp{x−V(1x+log2,1y+log2)}V1(1x+log2,1y+log2)(x+log2)−2,

where the partial derivative of the exponent measure is

 V1(x,y)=−(x−1/γ+y−1/γ)γ−1x−1/γ−1.

To ease the following developments, we examine the log-survival conditional probability

 logPr(Y>y∣ X=x) = +(γ−1)log[{x(1+x−1log2)}1/γ+{y(1+y−1log2)}1/γ] +1−γγlog{x(1+x−1log2)} ≈ log2+x−x[1+log2γx+1−γ2γ2(log2)2x2+(yx)1/γ{1+log2γy+1−γ2γ2(log2)2y2}]γ +