Large-scale catastrophic events can have a major impact on physical infrastructure and society. Multivariate extreme value models help to capture the structure of such events and are used to extrapolate measures of combined risk beyond the range of the available data. Applications include river flooding (Katz et al., 2002; Keef et al., 2009a, b; Asadi et al., 2015), extreme rainfall (Coles and Tawn, 1996; Süveges and Davison, 2012; Huser and Davison, 2014), wave height and extreme sea surge (de Haan and de Ronde, 1998) and high concentrations of air pollutants (Heffernan and Tawn, 2004; Eastoe and Tawn, 2009). Many applications have also contributed to the improvement of risk assessment in finance (Poon et al., 2003; Hilal et al., 2011, 2014).
In the asymptotic dependence case, and the largest values of and can occur together, whereas in the asymptotic independence case, and these largest values cannot occur simultaneously. Many extreme value models can only capture asymptotic dependence, but this excludes important cases of asymptotic independence; for example, for all Gaussian copulas with correlation . In the case when , the key to determining the form of the extremal dependence is the rate at which the conditional probability in (1.1) tends to zero, given by Coles et al. (1999) as a complementary extremal dependence measure , which for Gaussian copulas gives .
Modelling approaches proposed by Ledford and Tawn (1997), Heffernan and Tawn (2004), Wadsworth et al. (2017) and Huser and Wadsworth (2018) cover both asymptotic dependence and asymptotic independence. In all cases, an asymptotic form for the joint tail is specified above a high threshold and for inference and extrapolation of extreme probabilities the approach relies on the assumption that the model holds exactly. To illustrate the implications of this assumption, consider the case of asymptotic dependence, in which for all above the selected threshold once the model and threshold are selected, so there is no scope to allow for a penultimate value for these probabilities that converges to as
. Similar issues arise when estimating.
The focus of this paper is to try to establish penultimate models that are appropriate in the most flexible existing model for multivariate extremes, that of conditional extremes introduced by Heffernan and Tawn (2004).
In our analysis of conditional extremes, we focus on the bivariate case to simplify the notation; extension to the general multivariate case is straightforward (Heffernan and Tawn, 2004)
. The conditional model was originally presented for marginally Gumbel distributed random vectors, butKeef et al. (2013) showed that formulation on the Laplace scale is simpler when positive or negative dependence is possible. Thus we first transform our variables to random variables with Laplace margins via the probability integral transform,
and similarly for , preserving the dependence structure through the copula, according to Sklar’s (1959) representation theorem.
The Heffernan and Tawn (2004) approach presupposes that there exist normalising functions and such that
is a non-degenerate distribution function with no mass at infinity. Under mild assumptions on the joint distribution of, Heffernan and Resnick (2007) show that the normalising functions are of the form
with and two slowly-varying functions, i.e., satisfying for any as . By considering a broad class of parametric copula models, Heffernan and Tawn derive parametric forms for and that yield a parsimonious model and cover a broad range of extremal dependence structures not described by models arising from the standard theory for multivariate extremes, namely
This corresponds to approximating the slowly-varying functions by and . Setting is equivalent to setting for any positive constant
, with the change in norming absorbed into the variance of. If and , then are asymptotically dependent with , and otherwise they are asymptotically independent. To form a statistical model, Heffernan and Tawn (2004) assume that limit (1.2) holds exactly above some finite with norming functions of the form (1.4).
Papastathopoulos and Tawn (2016) found inverted max-stable processes for which the norming functions of the form (1.4) are inadequate and more general functions and of the form (1.3) are required. It is natural therefore to question whether we can extend the functions and to give better finite approximations.
Our main contribution is to derive the subasymptotic behaviour of the conditional tail approach for three copulas that span diverse extremal dependence structures. Our objective is to identify a second-order behaviour across conditional copulae from which we can derive a general penultimate conditional model for extremes. The core value of the work is to suggest in Section 5 new penultimate forms for and that can be used to broaden the simple parametric family (1.4), and help reduce extrapolation bias. This is particularly important as small differences in the parameter estimates at finite levels can result in large differences in extreme risk measures.
Another aspect of the subasymptotic behaviour is the limiting conditional independence of with , given , as . At a subasymptotic level, the distribution of is exponential, but that of depends on . This subasymptotic distribution, , say, tends to as , so the independence property is lost in this penultimate model which might therefore enable it to provide a better fit at a lower threshold than limit models.
Such subasymptotic behaviour is also required when conducting simulation studies in order to assess the performance of methods to fit the conditional model, as the estimates of and for can misleadingly suggest a poor fit as it is and for that are being estimated.
Although the focus of this paper is the extremal dependence structure, we outline the established convergence of the margins to their limiting distributions in Section 2 to set a framework for our extremal dependence analysis. The rest of the paper is organised as follows. In Section 2, we review penultimate analyses in bivariate contexts. Section 3 introduces the framework used to study the penultimate behaviour of the conditional tail model. In Section 4, we consider three copulas, with proofs of results given in the Appendix. In Section 5
, we summarise our findings through penultimate parametric models which extend the Heffernan–Tawn class of norming functions.
2 Existing penultimate analyses
The founding theorem in the theory of univariate extreme values characterises the limiting distribution of the maximum of a series of independent and identically distributed univariate random variables , suitably normalised by sequences and , namely, as ,
with . In practice the limit distribution is assumed to be exact for large , by taking , with , and parameters to be estimated. This leads to the generalised extreme value distribution being proposed as a parametric model for inference and extrapolation based on block maxima (Coles, 2001).
Fisher and Tippett (1928) raised the question of the accuracy of this approximation in the Gaussian case, i.e., . Then is Gumbel, i.e., , but they showed that at finite levels is better approximated by a generalised extreme value distribution with shape parameter , a negative Weibull distribution.
The approximation of by in extreme value applications is of concern when the convergence of the former to the latter is particularly slow, as any inaccuracy in the parameter estimates amplifies and introduces bias in extrapolations.
The study of rates of convergence of towards zero has a long history. The first attempts to characterise this for any value of date to Gomes (1984, 1994) and unpublished work of Smith (1987). Assuming the existence of the density and its derivative , we define the reciprocal hazard function , and we know from the von Mises (1936) conditions that
so replacing by can greatly improve the rate of convergence.
To illustrate (2.2) and (2.3), consider the standard Gaussian case and . Mills’ ratio can be used to derive the approximation to the survival distribution function , from which we get the approximate reciprocal hazard function and its derivative . We also have (Leadbetter et al., 1983, p. 14), from which we can conclude that
The convergence rate (2.2) is using the limit shape parameter, improving to when is replaced by , as in (2.3). Here the penultimate shape parameter for finite , in agreement with the observation, due to Fisher and Tippett (1928), that a negative Weibull-type distribution yields better approximations at finite levels.
The first study that has discussed the penultimate properties of bivariate maxima was Bofinger and Bofinger (1965), who derive the penultimate correlation of componentwise maxima in bivariate Gaussian samples of sizes ; the limit correlation was shown to be zero by Sibuya (1960), corresponding to in (1.1). Bofinger (1970) extends this analysis to the bivariate gamma and Morgenstern (1956) distributions, and sheds light on the form of the penultimate correlation for small . This work has been further explained for identically distributed random variables and by Ledford and Tawn (1996), who derive a model for joint tails of these variables using penultimate properties of as tends to the marginal upper endpoint; this results in a model that smoothly connects perfect dependence and complete independence. In exponential margins, Wadsworth and Tawn (2013) and de Valk (2016) extended this model to allow for different decay rates along different rays emanating from the origin, with these rays corresponding to power relationships in Pareto margins. There is also very strong parallels with work on inference for the spectral measure of multivariate regular variation with second-order features influencing estimates (Resnick, 2007; Cai et al., 2011). Despite these developments, there are currently no penultimate results for conditional multivariate extremes to parallel the univariate penultimate theory.
3 Sub-asymptotic conditional extremes
The conditional limit (1.2) encapsulates the limit conditional independence of and the excesses for large . We first focus on the marginal limiting behaviour of . According to Heffernan and Resnick (2007), Resnick and Zeber (2014) and Wadsworth et al. (2017), the conditioning in (1.2) can be modified to
where , and the distribution are the same as in (1.2).
In our penultimate analysis, our goal is to characterise the behaviour of the remainder terms, defined by
using the notation of (1.4). Specifically, we consider the second-order normalisation for and , with
With these penultimate forms, we are able to refine the normalisation of in (3.1), yielding the subasymptotic conditional distribution
with as .
Heffernan and Tawn (2004) give the rate of convergence of the conditional distribution for data arising from various copula models in terms of the order of convergence towards zero, as , of
with on the Gumbel scale. We consider how much we can improve this when using the penultimate norming, by studying the rate of convergence to zero of
We also want to quantify the subasymptotic remainder, using
along the lines of (2.3) in the univariate context. In all cases we will present these rates on a scale that is invariant to the marginal choice, by converting to a return period , where .
We first consider the Gaussian copula, with correlation parameter , which has and , for , for which the convergence towards the limit (1.2) was reported by Heffernan and Tawn to be the slowest in the examples they considered, namely . Second, we consider another example with asymptotic independence; the inverted logistic dependence structure (a special case in the class of inverted max-stable distributions studied by Papastathopoulos and Tawn (2016)). This distribution has and where is the logistic parameter with corresponding to independence. This distribution has a faster convergence rate than the Gaussian copula, and we shall see that the subasymptotic can have finite support depending on the precise value of the dependence parameter of this copula. Heffernan and Tawn reported the rate to be in this case. Third, the max-stable copula with logistic dependence structure represents the case of asymptotic dependence, and , and is a situation where convergence is . Our penultimate forms for and reflect these different rates of convergence, with major, minor and no changes found relative to and respectively.
4.1 Gaussian distribution
have a bivariate standard normal distribution with correlationand let be its marginal transform to the Laplace scale,
and similarly for as a function of . The dependence structure of is an example where , provided , and , .
i.e., and .
Note that if and , then the results of Theorem 4.1 also hold for all , i.e., including , with the key change being that the variances of and no longer have the term.
The penultimate norming , can be used to assess the goodness-of-fit at a finite level. By replacing by the threshold in the log-term of (4.1), we derive a second-order approximation for of the form
Similarly, we derive a second-order approximation for , i.e.,
Figure 1 illustrates convergence of the second-order approximations for and towards their limits when , and for values of corresponding to the up to the
Laplace quantile. Convergence is very slow, so it makes sense to consider second-order approximations when measuring the adequacy of finite-sample estimates. In order to give an idea of the amount of data needed to reach such quantiles, we change the scale of the abscissa to the return period scale, using
with the Laplace distribution function, any quantile on the Laplace scale and the number of observations per year. Figure 1 shows that even with the equivalent of more than years of daily data, the location and scale parameters differ significantly from their asymptotic values.
4.2 Inverted logistic distribution
In this section, we consider the bivariate random vector with inverted logistic distribution and Laplace margins (Ledford and Tawn, 1997; Papastathopoulos and Tawn, 2016). Its joint survival distribution function is
where , , is the exponent measure function of the logistic distribution. Here, and for , with corresponding to complete dependence and corresponding to independence.
so there is no difference in the penultimate form for from .
When , has finite support
and as . When , has finite support, with first-order expansion
and as . When , has finite support, with first-order expansion
Figure 2 illustrates the convergence of to for , 2/3 and 3/4, and corresponding to 0.8, 0.9, 0.95 and 0.99 quantiles. It appears that the adequacy of the approximation depends very strongly on .
4.3 Logistic distribution
Let have a bivariate logistic distribution with Laplace margins,
In the following, we do not consider the case corresponding to the trivial situation of complete independence. The degree of asymptotic dependence is .
5 Penultimate model form
The three examples of copula studied in Section 4 all have very different extremal dependence features, yet all are in the following, rather general, class of penultimate forms for the norming functions
where and are from of the first order norming functions and respectively, , and , are functions that are slowly varying at . For statistical modelling purposes we need to be more precise about the slowly varying function, and as in past practice, e.g., Ledford and Tawn (1996), we fix these functions above some threshold to be constants, i.e.,
In practice this may still be rather over-parameterised, given how difficult second-order effects are to estimate, and so in practice it may be sufficient to fix . Choices like this have been used in penultimate modelling in univariate cases (Hall and Welsh, 1985).
Next consider the choice of the limit distribution . Although our examples have shown we can get improved penultimate forms for , using , these improvements do not always lead to faster convergence to the limit distribution for certain ranges of parameters of the underlying model. Furthermore, as the current approach is to use a non-parametric approach to estimate , it is suggested that this is continued and is assumed independent of .
In conclusion, based on a series of examples, we have suggested a class of penultimate models for the Heffernan and Tawn (2004) conditional extremes model. This new class of models improves convergence rates relative to the limit model, and thus is likely to lead to better statistical inference. The extension is parsimonious with, in its most simple form, the introduction of two additional parameters.
This research was partially supported by the Swiss National Science Foundation.
Appendix A Appendix
a.1 Proof of Theorem 4.1
The details of the proof for the asymptotic quantities , and can be found in Heffernan and Tawn (2004). We follow a similar path to derive the penultimate approximations and use Mill’s ratio to get a tail approximation for and on the normal and Laplace scales, respectively. Since the Laplace distribution is symmetric, we focus on the right tail:
for large and . In order to get a second-order approximation for , we define a small quantity and set ; plugging this back in (A.1), we get
and this yields
We want the conditional distribution of to be well-behaved in its upper tail when . We have . On the Laplace scale, is transformed to
with and norming functions to be determined, and a random variable with a fixed distribution non-degenerate at . We derive these penultimate norming functions by writing in (A.4), and using (A.3). We get the second-order approximation by first expanding
We can use this expression for in , in which we keep only the terms that are not functions of , as they are disconnected from , and we get
Expanding further, we arrive at
and cancellation of the leading term yields
or equivalently .
The penultimate scale function stems from the -terms in (A.5), namely
which we expand as
Substituting into (A.6) gives
We now compute the penultimate distribution by substituting the expression for in (A.6), writing , with
We now give the order of convergence of (3.6) with the penultimate approximations and that we derived. After a marginal transform in order to get , we get the rate of convergence for , namely , which improves on the version of with first-order approximation for the normalising functions, for which the rate of convergence is of order . Taking improves even more on , as its rate of convergence in (3.6) is of order .
a.2 Proof of Theorem 4.2
We start by computing the conditional survival distribution of for large and deriving a tail approximation to it. We have, for non-negative and ,
where the partial derivative of the exponent measure is
To ease the following developments, we examine the log-survival conditional probability