Local likelihood estimation of complex tail dependence structures in high dimensions, applied to U.S. precipitation extremes

In order to model the complex non-stationary dependence structure of precipitation extremes over the entire contiguous U.S., we propose a flexible local approach based on factor copula models. Our sub-asymptotic spatial modeling framework yields non-trivial tail dependence structures, with a weakening dependence strength as events become more extreme, a feature commonly observed with precipitation data but not accounted for in classical asymptotic extreme-value models. To estimate the local extremal behavior, we fit the proposed model in small regional neighborhoods to high threshold exceedances, under the assumption of local stationarity. This allows us to gain in flexibility, while making inference for such a large and complex dataset feasible. Adopting a local censored likelihood approach, inference is made on a fine spatial grid, and local estimation is performed taking advantage of distributed computing resources and of the embarrassingly parallel nature of this estimation procedure. The local model is efficiently fitted at all grid points, and uncertainty is measured using a block bootstrap procedure. An extensive simulation study shows that our approach is able to adequately capture complex, non-stationary dependencies, while our study of U.S. winter precipitation data reveals interesting differences in local tail structures over space, which has important implications on regional risk assessment of extreme precipitation events. A comparison between past and current data suggests that extremes in certain areas might be slightly wider in extent nowadays than during the first half of the twentieth century.



There are no comments yet.


page 3

page 5

page 18

page 22

page 25

page 26

page 27


Spatial dependence and space-time trend in extreme events

The statistical theory of extremes is extended to observations that are ...

Modeling Spatial Data with Cauchy Convolution Processes

We study the class of models for spatial data obtained from Cauchy convo...

Advances in Statistical Modeling of Spatial Extremes

The classical modeling of spatial extremes relies on asymptotic models (...

Modeling short-ranged dependence in block extrema with application to polar temperature data

The block maxima approach is an important method in univariate extreme v...

New Exploratory Tools for Extremal Dependence: Chi Networks and Annual Extremal Networks

Understanding dependence structure among extreme values plays an importa...

Asymmetric tail dependence modeling, with application to cryptocurrency market data

Since the inception of Bitcoin in 2008, cryptocurrencies have played an ...

Modelling non-stationary extremes of storm severity: a tale of two approaches

Models for extreme values accommodating non-stationarity have been amply...
This week in AI

Get the week's most popular data science and artificial intelligence research sent straight to your inbox every Saturday.

1 Introduction

Water-related extremes such as floods and droughts can heavily impact human life, affecting our society, economic stability, and environmental sustainability. In recent years, we have witnessed an acceleration of the water cycle in some areas of the globe, including an increase in the frequency and intensity of heavy precipitation, sadly illustrated with a number of unprecedented hurricane events that recently hit the Caribbean islands and the Southeastern United States (U.S.) in August–September, 2017. These climate changes have motivated the development of stochastic models for risk assessment and uncertainty quantification of extreme weather events. In the univariate context, the generalized extreme-value distribution with time-varying parameters has been used to study the effect of climate change on global precipitation annual maxima (Westra et al., 2013). More recently, Fischer and Knutti (2016) have adopted an empirical approach to study heavy rainfall intensification, validating the theory predicted by early generations of general circulation models. Similarly, Hoerling et al. (2016) analyzed trends in U.S. heavy precipitation data. Beyond the univariate context, several studies have focused on the spatial or spatio-temporal modeling of precipitation extremes within and across different catchments (Cooley et al., 2007; Thibaud et al., 2013; Huser and Davison, 2014; Opitz et al., 2018), leading to extreme river discharges (Asadi et al., 2015) or on the risk of heavy snowfall in mountainous areas (Blanchet and Davison, 2011). In these small regions, however, the spatial dependence structure is often assumed to be stationary. By contrast, in this paper we focus on modeling the complex, non-stationary, spatial dependence structure of U.S. winter precipitation extremes on a continental scale.

Assessing the behavior of extreme events over space entails many challenges. First, data are measured at a finite number of locations, but extrapolation is typically required at any other location of the study region. In our case, the study region is the contiguous U.S. and the monitoring stations at which data are collected are displayed in Figure 1.

Figure 1: Topographic map of the contiguous U.S. with the 1218 weather stations (dots). Red dots represent the stations chosen below to illustrate the results.

Second, by contrast with classical geostatistics where the estimation target is usually the (conditional) mean, statistics of extremes focuses on high (or low) quantiles, and extrapolation is not only required over space, but also into the joint upper (or lower) tail of the distribution. Third, the scarcity of extremes results in uncertainty inflation, especially when very high quantiles have to be estimated, and thus it is important to develop efficient inference methods that can potentially be applied in high dimensions. Finally, a recurrent problem with large heterogeneous regions, such as the contiguous U.S., is the modeling of spatial non-stationarity. This may concern both marginal distributions and the dependence structure. As pointed out by

Huser and Genton (2016)

, misspecification of the joint distribution of extremes may result in poor estimation of spatial risk measures. There is no universal consensus on how to handle non-stationarity; however, it is often both useful and realistic to assume some weak form of local stationarity. In this work, we fit a very flexible non-stationary, but locally stationary, spatial model to U.S. winter precipitation extremes, which contrasts with the quite rigid fully parametric model fitted by

Huser and Genton (2016), and the non-parametric approach adopted by Castro-Camilo and de Carvalho (2016), whereby a family of bivariate extreme-value dependence structures at different sites is linked through a smooth function of covariates, while neglecting spatial dependence.

The literature on spatial extremes is mostly divided into two mainstream approaches: the first approach defines extreme events as block (e.g., annual) maxima, which are typically modeled using max-stable processes (Westra and Sisson, 2011; Davison and Gholamrezaee, 2012; Huser and Genton, 2016); the latter can be regarded as the functional generalization of multivariate extreme-value distributions, and they arise as limits of properly renormalized maxima of random fields. The second approach defines extreme events as high threshold exceedances, which are usually modeled using generalized Pareto processes, i.e., the threshold counterpart of max-stable processes (Ferreira et al., 2014; Thibaud and Opitz, 2015; de Fondeville and Davison, 2016). Both max-stable and generalized Pareto processes are “ultimate” models, in the sense that they have an asymptotic characterization for block maxima and threshold exceedances, respectively. However, their tail dependence structure is fairly rigid: max-stable copulas are invariant to the operation of taking componentwise maxima, while generalized Pareto copulas are invariant to thresholding at higher levels. This lack of tail flexibility has recently motivated the development of “penultimate” spatial models for threshold exceedances (Wadsworth and Tawn, 2012; Opitz, 2016; Huser et al., 2017; Huser and Wadsworth, 2018), which, unlike ultimate models, can capture weakening extremal dependence strength on the way to their limiting generalized Pareto process. See also Huser et al. (2018) and Bopp et al. (2018) for the case of maxima. To illustrate this, Figure 2 shows, for a range of quantiles

, the conditional probability

Figure 2: Empirical (white) and model-based (green) estimates for as a function of the threshold for six selected pairs of stations in different states, sorted according to their distance: New York at km apart (top left), Illinois at km apart (top middle), Maine at km apart (top right), Louisiana at km apart (bottom left), Idaho at km apart (bottom middle), and California at km apart (bottom right). Confidence envelopes for the empirical estimates (dark blue) are based on 300 block bootstrap samples with monthly blocks. Model-based confidence envelopes are also provided (light blue). Details on the fitted model are given in Sections 2 and 5.

estimated for six selected pairs of stations at distance , where , , denotes the 5 day-cumulative winter precipitation random field defined over the contiguous U.S. (denoted ), and denotes the marginal distribution of . While the function in (1.1) appears to display different dynamics for each pair of stations, a generalized Pareto process always assumes that is constant in (Rootzén et al., 2018) for each lag , hence potentially leading to an overestimation of the dependence strength at higher quantiles for the pairs that have a decreasing function . In Section 2, we develop a non-stationary model capturing non-trivial tail dependence, i.e., allowing for a positive limit , while at the same time having a decreasing function with known limiting extreme-value dependence structure. Our non-stationary model builds upon the stationary exponential factor copula model proposed by Krupskii et al. (2018), which has attractive modeling features, and is computationally more convenient than the models proposed by Huser et al. (2017) and Huser and Wadsworth (2018). It fundamentally differs from the Laplace model of Opitz (2016), which strongly focuses on capturing tail independence () and the decay rate towards the limit, rather than tail dependence (), which is the type of extremal behavior suggested by our precipitation data at small scales (or at least, we have for quite high thresholds ); recall Figure 2.

Inference for extreme-value models is known to be particularly cumbersome. When max-stable processes are directly or indirectly involved, the likelihood function becomes excessively prohibitive to compute in moderate to large dimensions (Castruccio et al., 2016). This has lead to the use of less efficient composite likelihood approaches (Padoan et al., 2010; Westra and Sisson, 2011; Huser and Davison, 2013; Thibaud et al., 2013; Huser and Davison, 2014). Alternatively, threshold approaches lead to simpler, significantly less demanding likelihoods, although their routine application to high-dimensional datasets is hampered by the need to censor observations involving non-extreme data (i.e., below a high threshold), which entails expensive multidimensional integrals; see, e.g., Wadsworth and Tawn (2014), Thibaud and Opitz (2015), Huser et al. (2017) and Huser and Wadsworth (2018). Other censoring schemes have been investigated by Engelke et al. (2015) and Opitz (2016), but they may cause some bias (Huser et al., 2016). More recently, Morris et al. (2017)impute the censored, non-extreme, observations using a simulation-based algorithm within a Bayesian framework. In this work, although our full dataset comprises 1218 monitoring stations over the contiguous U.S., the dimensionality is considerably reduced by adopting a local estimation approach, making it possible to perform traditional censored likelihood inference in a robust and efficient way.

Our main methodological contributions can be summarized as follows: we extend the stationary exponential factor copula model by allowing the parameter to change smoothly with locations, and provide further theoretical results for it. We develop a local likelihood approach adapted to the context of extremes to fit the stationary exponential factor copula model locally, censoring non-extreme observations. Finally, we derive simplified expression for the terms involved in the local censored likelihood function in order to compute maximum likelihood estimates in a reasonable time, while avoiding numerical instabilities.

In Section 2, the stationary exponential factor copula model is presented, and its tail properties are studied. Section 3 discusses censored local likelihood inference based on high threshold exceedances. In Section 4, a simulation study is conducted based on a non-stationary factor copula model, in order to assess the performance of our approach in various non-stationary contexts. In Section 5, we apply our proposed model to study the dependence structure of heavy precipitation over the whole contiguous U.S., and we use it to estimate return periods of spatial extreme events. Section 6 concludes with some discussion.

2 Modeling spatial extremes using factor copulas

2.1 Copula models

A copula is a multivariate probability distribution with

margins. Copulas are used to describe the dependence between random variables, and may be used to link univariate marginal distributions to construct a joint distribution. Specifically, let

be a random vector with continuous marginals

, . The copula of is defined through the random vector (with standard uniform margins) as . Sklar (1959) showed that each multivariate distribution with continuous margins has a unique copula , which may be expressed as


this implies that can be written in terms of univariate marginal distributions chosen independently from the dependence structure between the variables. This result can be associated with a two-step approach for inference, where margins are treated separately from the dependence structure. Based on (2.1), several copula families have been proposed and applied in practice for the modeling of environmental data, the most common one being the Gaussian copula obtained by choosing the joint distribution

to be the standard multivariate Gaussian distribution

with correlation matrix . Other more flexible elliptical copulas may be derived similarly, such as the Student- copula, which is tail-dependent as opposed to the Gaussian copula. An alternative general family of copulas generating interesting tail dependence structures are factor copula models (Krupskii and Joe, 2015; Krupskii et al., 2018), in which a random and unobserved factor affects all measurements simultaneously. In Section 2.2, we describe the construction of the stationary exponential factor copula model proposed by Krupskii et al. (2018), which has appealing modeling and inference properties, and we then embed it in a more general non-stationary model in Section 4.1.

2.2 The stationary exponential factor copula model

Let , , be a standard Gaussian process with stationary correlation function (see Gneiting et al. (2006) for a review of correlation functions), and let

be an exponentially distributed random variable with rate parameter

, independent of , and which does not depend on the spatial location . The exponential factor copula model may be expressed in continuous space through the random process


The process in (2.2) is a Gaussian location mixture, i.e., a standard Gaussian process with a random (exponentially distributed), constant mean. In this sense, it has similarities with the Student- process, which can be viewed as a Gaussian scale

mixture (with standard deviation following a specific inverse gamma distribution). However, both models are not nested in each other, and their dependence structures have significant dissimilarities.

Although Model (2.2) may seem quite artificial, it is only used to generate a flexible upper tail dependence structure, which we then fit to precipitation extremes. In other words, we disregard the margins and only consider the copula associated to (2.2), which we fit to high threshold exceedances using a censored likelihood approach, reducing the contribution of small precipitation values; more details are given in Sections 3 and 5.

From (2.2), each configuration of sites yields a -variate copula as follows: let and , . The random vector

has a multivariate normal distribution with correlation matrix

that depends on the correlation function and the sites’ configuration, i.e., . The components of the random vector are , , where is independent of ; by integrating out , the joint distribution of may be expressed as


whilst its density is


where is the multivariate standard normal density with correlation matrix . In Appendix A we provide simpler and computationally efficient expressions to compute (2.3), (2.4), and partial derivatives of (2.3), avoiding the integral in . Applying (2.1), the resulting copula and its density may be written as


where , , and and denote the marginal distribution and density of the process, respectively. In particular, we can show that


where is the standard normal distribution function; see (A.3) in Appendix A.

To illustrate the tail flexibility of the stationary exponential factor copula model, Figure 3 displays the function defined in (1.1) as a function of the quantile level and the Euclidean distance , for different rate and range parameters, assuming an exponential correlation function , . While the range controls the correlation decay with distance, the rate determines the overall tail dependence strength and strongly impacts the value of and its limit at large distances, i.e., as .

Figure 3: Conditional probability as defined in (1.1) for the stationary exponential factor model with rate parameter and exponential correlation function , plotted with respect to the level at distance (top) and with respect to the distance for fixed (bottom). Left: fixed and . Right: fixed and . The dots in the top panels correspond to the limit , and the green horizontal lines in the bottom panels correspond to perfect independence.

In particular, because the random factor in (2.2) is common to all sites , spatial dependence in the process does not vanish as . Therefore, this stationary model may be useful for (replicated) spatial data collected on a local or regional scale, but it is generally not realistic on larger scales, such as the whole contiguous U.S. In Section 3, we develop a local estimation approach assuming that the stationary model (2.2) is only valid locally, and in Section 4.1 we discuss a flexible extension to generate non-stationary tail dependencies structures. As shown by Krupskii et al. (2018), the distribution of the random factor characterizes the tail properties of the copula (2.5); in particular, the stationary exponential factor copula model is tail-dependent (i.e., ) for fixed , and its upper-tail dependence structure is governed the Hüsler and Reiss (1989) copula, which has been widely used for multivariate and spatial extremes (Davison et al., 2012). Moreover, as , Model (2.2) boils down to the Gaussian copula, which is tail-independent (i.e., ). Conversely, as , the

process is perfectly dependent over space. Thus, the exponential factor copula model interpolates between tail independence as

and perfect dependence as , while capturing a wide range of non-trivial tail dependence structures for .

We now provide more detailed information on the sub-asymptotic behavior of Model (2.2), refining the description of its tail structure. The rate at which converges to , as , characterizes the flexibility of the process to capture sub-asymptotic extremal dependence. In practice, this is important, as the model will always be fitted at a finite threshold. Proposition 1 shows that the parameter regulates this rate of convergence; the smaller , the faster the convergence, and vice versa. We deferred the proof to Appendix B.

Proposition 1.

Consider Model (2.2) with rate . We have the expansion

where , and, for , , with such that in the limit as ,

The measure is linked to the bivariate extremal coefficient function  (Davison et al., 2012) of the associated limiting max-stable random field, defined by


The extremal coefficient takes values between 1 and 2, with 1 corresponding to complete dependence among locations and 2 to complete independence. For model (2.2), we have , where ; see Krupskii et al. (2018) and the Supplementary Material. The extremal coefficient encompasses the effects of all parameters involved in the model, i.e., , and we use it in our simulation experiments (Section 4) and the data application (Section 5) to study the performance of our model and to assess its ability to flexibly capture different levels of extremal dependence.

Other types of tail dependence structures can be obtained using alternative distributions for the random factor (see Krupskii et al., 2018), but we here restrict ourselves to the exponential factor copula model, which yields flexible tail structures and fast inference.

3 Local likelihood inference with partial censoring

The assumption of stationarity underlying (2.2) is unrealistic over large heterogeneous regions, such as the whole contiguous U.S., but it may be the starting point for more sophisticated models; in particular, the true precipitation data generating process might be approximately stationary in small regions, providing support for non-stationary, but locally stationary, models. We here assume that Model (2.2) provides a good approximation to the local tail dependence structure while stemming from a more complex global data generating process.

Local likelihood estimation for univariate threshold exceedances was proposed by Davison and Ramesh (2000), while Anderes and Stein (2011) investigated how such an approach may be applied in the spatial context based on a single realization from a Gaussian process. Risser and Calder (2015) used local likelihood to estimate the spatially-varying parameters of a non-stationary Gaussian process. Their model is simplified by modeling the locally-varying anisotropies using an approach similar to the discrete mixture kernel convolution of Higdon (1998). We here detail how to perform local likelihood estimation based on high threshold exceedances by adapting the hard-thresholding methodology of Anderes and Stein (2011) to the joint upper tail of the stationary copula model (2.2), under the assumption that it is valid in small regional neighborhoods. Unlike Anderes and Stein (2011), we assume that multiple replicates of the process are observed, with possibly arbitrary marginal distributions.

Since our focus is on the data’s tail dependence structure, we advocate a two-step semi-parametric estimation procedure, whereby margins are first estimated at each site separately using the empirical distribution function, and the copula model is then estimated locally in a second step by maximum likelihood, censoring low (i.e., non-extreme) values to prevent them from influencing the fit. Such a two-step approach is standard in the copula literature, and has been studied in depth; see, e.g., Genest et al. (1995) and Joe (2014), Chapter 5.

More complex parametric approaches are also possible to estimate margins in the first step. Although marginal estimation is not our primary target, we study the effect of the rank transformation on the estimation of the copula, by comparing it to an approach based on the generalized Pareto distribution, and the optimal scenario where exact marginal distributions are used. We have found that the use of the rank-based empirical distribution function provides a fast, robust, and reliable method, which does not affect much the subsequent estimation of the copula (see Section 2.4 of the Supplementary Material).

The first step of our proposed estimation procedure consists in transforming the observed data non-parametrically to the uniform scale. Let denote independent and identically distributed observations at the th monitoring station, with essentially arbitrary margins. Pseudo-uniform scores may be obtained using ranks as follows:

In the second step, the scores , , are treated as a perfect random sample from the distribution. To estimate the spatial copula structure, we then discretize the space (in our case, the whole contiguous U.S.) into a fine grid . For each grid point , we assume that the local stationary copula model (2.2) with parameters is valid in a small neighborhood around . In what follows, these regional neighborhoods will be determined by a small number, , of nearest stations from the site , so we will write . Obviously, the choice of neighborhood is important: the stationary model (2.2) might be a poor approximation for large neighborhoods (i.e., for large ), whereas model fitting might be cumbersome for small neighborhoods characterized by a small number

of stations. This bias-variance trade-off is tricky to deal with, and Section 

5 describes one possible approach to mitigate this issue.

To estimate the local tail dependence structure, we suggest using a censored likelihood approach, which is standard in statistics for spatial extremes, though it has never been applied to Model (2.2); see, e.g., Ledford and Tawn (1996), Thibaud et al. (2013), Huser and Davison (2014), Wadsworth and Tawn (2014), Thibaud and Opitz (2015) and Huser et al. (2017). Essentially, vectors with non-extreme observations (i.e., lower than a high threshold) are partially or fully censored to prevent these low values from affecting the estimation of the extremal dependence structure. More precisely, for each grid point with associated neighborhood , let , , denote the pseudo-uniform scores for each of the nearby stations in , and let , , be high marginal thresholds; in our application in Section 5, we take for all . Using the notation introduced in (2.3)–(2.6), the censored local log-likelihood may be expressed as


where and , , is the index set of all non-censored observations (i.e., all vector components are extreme), is the index set of all fully censored observations (i.e., none of the vector components are extreme) with , is the index set of all partially censored observations (i.e., some, but not all, vector components are extreme), is the index set of threshold exceedances for the th vector of observations, and denotes differentiation with respect to the variables indexed by the set . Numerical maximization of (3) yields the maximum likelihood estimates for location . In Appendix A, we provide simple expressions for , and , which involve the -variate Gaussian density and the multivariate Gaussian distribution in dimension and , respectively. When is large, the computation of the multivariate Gaussian distribution can significantly slow down the estimation procedure. However, thanks to our local approach, is typically quite small, and this allows to estimate the model at a reasonable computational cost. Furthermore, because the likelihood maximizations can be done independently at each grid point , we can easily take advantage of distributed computing resources to perform each fit in parallel.

Following Anderes and Stein (2011), it is possible to generalize (3) to obtain smoother parameter estimates over space. Specifically, we can instead maximize the weighted log-likelihood function defined as


where is a non-negative weight function, and , denotes the nested sequence of subsets comprising the first nearest neighbors of , with the convention that and . Choosing hard-thresholding weights with for all boils down to (3). However, smoother parameter estimates may be obtained by selecting soft-thresholding weights that smoothly decay to zero near the neighborhood boundaries, e.g., using the biweight function , for some bandwidth . Although this estimation approach seems quite appealing, it significantly increases the computational burden, since the weighted log-likelihood function (3.2) requires computing (3) times. In the Supplementary Material, we compare hard- and soft-thesholding weights in a simulation study, and do not notice any major difference in the results. This is due to the fact that, unlike Anderes and Stein (2011), multiple replicates are available to fit the model. For these reasons, and because the censored likelihood procedure is already quite intensive, we do not pursue this approach further in this paper. Subsequent simulation and real data experiments are all based on (3).

Remark that, as mentioned above, we assume that Model (2.2) provides a good approximation to the local tail dependence structure while stemming from a more complex global data generating process. To make sure that the local stationary model (2.2) can truly come from a well-defined global stochastic process, we describe in Section 4.1 one possible way to embed the model (2.2) into a non-stationary process, which we use in our simulations.

4 Simulation study

4.1 A non-stationary exponential factor copula model

We here briefly describe how we can extend the stationary exponential factor copula model in (2.2) to a non-stationary process that we use in our simulation study. Further details and asymptotic results are provided in Section 1 of the Supplementary Material. Our proposed model extension, equivalent to (2.2) on infinitesimal regions, is


where is a zero mean Gaussian process with non-stationary correlation function and is a standard exponentially distributed common factor, independent of . The rate parameter , , is assumed to be a smooth surface, which dictates different degrees of tail dependence in distinct regions. Although (4.1) may not realistically capture long-distance dependencies owing to the latent factor being constant over space, its spatially-varying parameters flexibly describe the local dependence structure; Model (4.1) is therefore useful to “think” about the results globally, while generating various forms of extremal dependence locally (or regionally).

In the literature, different non-stationary correlation functions have been proposed; see, e.g., Fuentes (2001), Nychka et al. (2002), Stein (2005), Paciorek and Schervish (2006), and Reich et al. (2011). Here, we focus on a non-stationary, locally isotropic, Matérn correlation function with constant smoothness parameter , constructed through the kernel convolution approach advocated by Paciorek and Schervish (2006); it is defined as


where , , is a smoothly varying range parameter, is the Gamma function and is the modified Bessel function of second kind of order . The stationary Matérn correlation function (obtained by setting for all ) has become popular because of its appealing properties (Stein, 1999). In particular, a Gaussian process with Matérn correlation function is times mean-square differentiable if and only if . For , it boils down to the exponential correlation function, which yields continuous but non-differentiable sample paths. As , sample paths are infinitely differentiable. The non-stationary correlation function defined in (4.2) is locally Matérn, and therefore it inherits these attractive properties. An extension of (4.2) allowing for varying degrees of smoothness over space has been proposed in the unpublished manuscript of Stein (2005) (see also Anderes and Stein2011), but in practice, estimating is cumbersome and conservative approaches are usually adopted. In our application in Section 5, is treated as constant over space.

4.2 Data-generating scenarios and simulation results

In this section, we study the flexibility of our copula model (4.1) to locally describe complex extremal dynamics across space, and we assess the performance of our local estimation approach based on (3) at capturing such non-stationary dependence structures. We also analyze the sensitivity of the parameter estimates to the neighborhood size.

Data are simulated on a grid in from the copula model stemming from (4.1) based on the non-stationary Matérn correlation function (4.2); independent replicates are generated. We choose three different levels of smoothness by fixing , and we consider three scenarios for the range and the rate parameters, representing various levels of non-stationarity in the tail behavior. The true parameter values and the bivariate extremal coefficient function (recall (2.7)) with respect to three different reference points are shown in Figure 4 for the weakly, mildly and strongly non-stationary scenarios.


Weakly non-stat.
Mildly non-stat. Strongly non-stat. Estimates (mild case)


Extremal coefficient

Figure 4: Columns 1–3: True spatially-varying log-rate ( row), range ( row), and extremal coefficient ( rows) for a fixed location . Columns 1–3 correspond to different levels of non-stationarity. Column 4: Mean surface of estimated parameters for the mildly stationary scenario, based on simulation experiments, using the local censored likelihood approach with thresholds for all , and nearest neighbors, as detailed in Section 3. The smoothness parameter is fixed to .

In all simulations, the smoothness parameter is held fixed, while the rate and range parameters are estimated on a grid using the local estimation approach with censoring thresholds for all , as described in Section 3. To compute performance metrics and to measure the uncertainty of estimated parameters, we replicate all simulation experiments times.

We first detail the results for the mildly non-stationary scenario (second column of Figure 4) with and choosing nearest neighbors, while results for all simulation scenarios are summarized in Table 1. The rightmost column of Figure 4 displays mean surfaces, computed over the independent experiments, of the estimated log-rate and range parameters, as well as the fitted bivariate extremal coefficient functions for the three different reference locations. By comparing the true surfaces to the mean estimates, the bias appears to be quite small overall, except perhaps at the boundaries of the study region, . This is a well-known drawback of local estimation approaches: because the neighborhoods are asymmetric near the boundaries, the bias is generally more severe, and this can also be noticed in our case, e.g., in regions where

. This issue is similar to the boundary problem in kernel density estimation occurring with positively or compactly supported densities; some strategies have been advocated to deal with it, such as taking asymmetric kernels near the boundaries, or performing local linear regression (see, e.g., 

Castro-Camilo et al., 2018), but we do not pursue this here. Apart from this minor boundary problem, our local estimation approach succeeds in capturing the complex non-stationary dependence dynamics over space.

Weakly non-stationary 0.09 0.59 0.03 0.01 0.03 0.00
Mildly non-stationary 0.09 0.69 0.06 0.02 0.04 0.00
Strongly non-stationary 0.17 0.89 0.15 0.05 0.22 0.02
Table 1: RMISEs for the rate and range parameters for each smoothness parameter , computed over 1000 replicates from the data-generating configurations discussed in Section 4.2.

To assess the variability of the parameter estimates, Figure 5 displays a functional boxplot for the range parameter , projected onto the -axis representing the only direction of variation in the true values, and a surface boxplot for the bivariate extremal coefficient for fixed . Functional and surface boxplots are the natural extensions of the classical boxplot to the case of functional data, and we refer to Sun and Genton (2012) and Genton et al. (2014) for their precise interpretation.

Figure 5: Functional boxplot (left) for the range , , plotted with respect to -coordinate , and surface boxplot (right) for the bivariate extremal coefficient , plotted as a function of location for fixed . In the surface boxplot, the dark and light blue surfaces represent the and quartiles (i.e., the “box” of the boxplot), respectively, while the green surface is the median. The surface boxplot’s “whiskers” are not displayed for better visualization. Both panels show the results for the mildly non-stationary case with and nearest neighbors in the local estimation approach.

As the true range parameter , , varies only with respect to , it is easier to visualize its estimated uncertainty. As expected, appears to be well estimated overall with higher uncertainty for larger values. The estimated median curve follows the true curve very closely, even near the boundaries and around , where the true curve is the steepest, while the functional inter-quartile range is fairly narrow for all values of . Moreover, the estimates for single runs, superimposed on the functional boxplot, suggest that the hard-thresholding approach produces reasonably smooth estimates, even with only about 25 threshold exceedances at each location. In our application in Section 5, we have roughly 100 exceedances per location. As for the extremal coefficient function, the surface boxplot suggests that it can also be rather well estimated with relatively low uncertainty. Table 1 reports the root mean integrated squared error (RMISE) for all the data generating configurations described in Section 4.2. The results are coherent with our intuition: the estimation is more difficult for higher levels of non-stationarity and rougher random fields (i.e., with smaller ). Overall, simulations confirm that our local censored estimation approach provides reasonable estimates of the true range and rate parameters while capturing their dynamics over space.

To assess the performance of our local likelihood approach as a function of the neighborhood size, we fix the rate parameter to the strongly non-stationary scenario, while considering the weakly, mildly and strongly non-stationary cases for the range parameter (recall Figure 4). We then fit the copula model using the local likelihood approach described in Section 3 with neighborhoods defined in terms of nearest neighbors. Table 2 reports the RMISE for all cases. While the RMISE is quite small and fairly constant overall for the range , the rate seems more difficult to estimate, and it improves with lower degrees of non-stationarity for and bigger neighborhoods (i.e., with larger ), even in the strongly non-stationary scenario. This suggests that the size of neighborhoods will likely be dictated by available computational resources. Unless the non-stationarity is extremely severe, it is advisable to consider large neighborhoods, as this would improve the estimation efficiency at a fairly moderate cost in bias.

Configuration for the
range parameter
Weakly non-stationary 1.16 0.02 0.44 0.01 0.25 0.00 0.18 0.00 0.15 0.00
Mildly non-stationary 1.33 0.03 0.61 0.01 0.44 0.01 0.34 0.01 0.3 0.01
Strongly non-stationary 1.38 0.03 0.69 0.01 0.55 0.02 0.47 0.02 0.45 0.02
Table 2: RMISEs for the rate and range parameters for , as a function of the number of nearest neighbors used in the local estimation approach. The rate was kept fixed to the strongly non-stationary case, while different degrees of non-stationarity are considered for the range .

5 Case study: U.S. winter precipitation extremes

5.1 Data description

Daily precipitation data, freely available online, were gathered from the U.S. Historical Climatological Network (USHCN). They are measured in hundredths of an inch and were collected from to at the stations represented in Figure 1. To ensure data quality, we discard all observations marked by any reliability or accuracy flag. We focus on winter data (December 21 to March 20) to remove seasonal effects, and we consider five day-cumulative precipitation, in order to capture the intensity and duration of distinct storms and to reduce the effect of temporal dependence. A preprocessing stage was conducted to check for possible temporal non-stationarity, and no evidence of non-stationarity was found (see Section 2.1 of the Supplementary Material). This procedure yields up to observations per station with of missing data overall, which corresponds to about 2 million observations in total at all sites. The resulting dataset ranges from (no rain over five days) to hundredths of an inch.

Figure 6: Empirical (left) and (right) quantiles of five day-cumulative winter precipitation data, observed at each of the monitoring stations, plotted on the same logarithmic color scale.

The empirical and quantiles, plotted in Figure 6 using a logarithmic scale, reveal interesting spatial patterns that are due to marginal distributions varying smoothly over space. In order to disentangle the local marginal and dependence effects, we use the (two-step) local censored likelihood approach described in Section 3 and fit the exponential factor copula model; recall Sections 2 and 3.

5.2 Estimation grid and neighborhood selection

To describe the local dependence structure of precipitation extremes over the U.S., we generate a regular grid (using the WGS84/UTM zone 14N metric coordinate system) with grid points at an internodal distance of km. When plotted with respect to longitude and latitude, this results in a “distorted” grid, owing to the metric-to-degree system change.

An important step to fit our model for extremal dependence at each grid point using the local estimation approach described in Section 3 is to select a suitable number of nearest neighbors , which might vary over space. Although cross-validation techniques would usually be advisable, pragmatic approaches are often adopted in practice: in the time series context, Davison and Ramesh (2000) suggest selecting the local likelihood bandwidth by the naked eye, while in the spatial context, Anderes and Stein (2011)

advocate a heuristic method based on a measure of spatial variation in the estimated parameters. In principle, the choice of

should be such that the spatial dependence structure of threshold exceedances is approximately stationary within each selected neighborhood around . Small neighborhoods (with small ) yield good stationary approximations, but poor statistical efficiency, and vice versa, and our simulation study in Section 4 suggests that should be as large as our computational resources permit, provided the dependence structure is not overly non-stationary. In the hydrological literature, a variety of tests to assess homogeneity of marginal distributions have been proposed; see for instance Lu and Stedinger (1992), Fill and Stedinger (1995), Hosking and Wallis (1993), and Hosking and Wallis (2005). Testing for stationarity of the extremal dependence structure is, however, much more complicated. Here, for simplicity, we assume that the local dependence structure of extremes is stationary whenever this is the case for margins, and we follow the recommendations of Viglione et al. (2007) by testing the homogeneity of the margins through a compromise between the Hosking and Wallis (1993) test and a modified Anderson–Darling test (Scholz and Stephens, 1987)

. Our ad-hoc neighborhood selection procedure can, therefore, be summarized as follows: considering an increasing nested sequence of neighborhoods, we test for marginal homogeneity until the test rejects the null hypothesis or a predefined maximum neighborhood size has been reached. In our case, we fix the maximum neighborhood to have radius

km and, for computational reasons, . Although this procedure has no theoretical guarantee to be optimal, we have found that it yields reasonable estimates with our dataset.

5.3 Local likelihood inference for extreme precipitation

Following Section 3, we fit the stationary exponential factor copula model (2.2) within small local neighborhoods, by maximizing the censored local log-likelihood (3) at all 2200 grid points , choosing the empirical quantile as a threshold to define extreme events. The left-hand panel of Figure 6 illustrates the selected threshold at each monitoring station on the scale of the observations. As the smoothness parameter in (4.2) is difficult to estimate, we adopt a profile likelihood approach by considering a grid of values for . Among all grid points, the likelihood is maximized in , , , and cases for , respectively. To be able to easily compare rate and scale parameter estimates across space, we then choose to fix at all locations, which boils down to using an exponential correlation function for the underlying Gaussian process. It is important to notice that we are not trying to estimate the smoothness parameter per se. Rather, the aim of the profile likelihood approach and the selection via frequencies is to select (or validate) in a non-arbitrary data-driven way, a reasonable value for . In general, estimation of the smoothness parameter is difficult; with a single realization in a fixed domain, not all the Matérn correlation parameters can be estimated consistently (see Zhang2004). Moreover, when is estimated jointly with and , we have found that the estimated parameters become very difficult to identify.

Thanks to the embarrassingly parallel nature of our local likelihood estimation procedure, we can make an efficient use of distributed computing resources by fitting the local model at each grid point independently; overall, a single model fit (fixing , but estimating and ) at all locations took about 25 thousand core-hours on a cluster with 39 nodes of 20 cores each.

Figure 7 shows the estimated log-rate and log-range parameters, and their estimated standard deviations computed over 300 block bootstrap samples with monthly blocks. Diverse spatial tail dependence structures arise in different regions, although the correlation between the estimated range and rate parameters makes it difficult to interpret.

Figure 7: Estimated log-rate (top left) and log-range (bottom left) parameters, with their standard deviations (right) computed from 300 block bootstrap samples with monthly blocks, for the model (4.1) fitted to the data described in Section 5.1.

To better understand the dynamics of the extremal dependence strength over space, we compute the local extremal coefficient as in (2.7). Since our model is fitted locally, interpretation of the extremal coefficient is possible within the selected neighborhoods. As can be seen in Figure 2 in the Supplementary Material, the minimum neighborhood radius is 45 km; therefore we compute the extremal coefficient at distance km; see Figure 8. As expected, tail dependence weakens at larger distances, and the patterns are quite smoothly varying over space. Extremal dependence is stronger in the Pacific West, central region (except for Colorado), mid-South, mid-West and North East. By contrast, some states in the South East (e.g., North/South Carolina and Florida) and the Rocky Mountains show less extremal dependence, suggesting that extreme precipitation events are more localized in these regions.

Figure 8: Left: Estimates of the local bivariate extremal coefficient at distance km (top to bottom). Right: Associated standard deviations.

To further validate our fitted model, Figure 2 compares empirical and fitted conditional tail probabilities, as defined in (1.1). The scarcity of observations in the right tail makes the empirical estimates of highly variable, which in turn yields wide empirical confidence envelopes (in dark blue). Nevertheless, the pointwise model-based estimates (in green) and the 95% confidence envelope (in light blue) suggest that our model succeeds in capturing the extremal dependence between close-by sites, with relatively low uncertainty. Overall, our model has great tail flexibility, and our local estimation approach can uncover complex non-stationary patterns of extremal dependence over space, which has important implications in terms of regional risk assessment of extreme precipitation events.

It is well-known that small changes in the joint tail dependence structure might result in an important difference in the return period of spatial extremes. To assess this, we compute the return periods (in years) associated with the joint probability of observing extreme precipitation in a set of three locations within five different states. The probabilities were computed by simulating replicates from our copula model over the same 60km internodal distance grid detailed before in Section 5.2, and extremes were defined as those values that exceeded the 99.5% quantile at each grid point (i.e., ). Table 3

displays the selected thresholds on the scale of the data for Louisiana, Mississippi, Kentucky, Florida and Tennessee, notorious to have experienced several extreme precipitation events in the recent past. Pointwise estimates along with bootstrap-based 95% confidence intervals show that the selected locations in the state of Louisiana are at higher risk of simultaneous extreme events than those in the other four states, and should experience such spatial extreme events once every 11 months, on average. These results, of course, strongly depend on the relative positions of the selected locations. Note that the uncertainty, which accounts for both marginal and dependence estimation uncertainty, is moderately low.

State Thresholds (0.01 inch) Return period (year)
Louisiana {590, 730, 650} 0.9 (0.88, 0.93)
Mississippi {520, 470, 640} 12.1 (11.07 13.17)
Kentucky {750, 870, 480} 22.7 (20.11 25.25)
Florida {550, 1090, 550} 2.3 (2.26 2.44)
Tennessee {910, 690, 780} 8.6 (7.98 9.14)
Table 3: Pointwise estimates and 95% confidence intervals for the return periods (in years) associated with the joint probability of observing an extreme precipitation event exceeding the quantile at three locations simultaneously. The column reports the state of these locations, while the column represents the thresholds (in hundredths of inches) at each of the three selected locations.

6 Concluding remarks

We propose a flexible non-stationary factor copula model to model sub-asymptotic spatial extremes over large heterogeneous regions. Fitted locally, our model can capture weakening dependence strength as events become more extreme. This contrasts with the current spatial extremes literature, which often relies on asymptotic models with a rigid tail structure. Our model can be efficiently fitted to high threshold exceedances using a local censored likelihood estimation procedure at a reasonable computational burden. Our extensive simulation experiments demonstrate the flexibility of our model and the efficiency of our local estimation approach while providing some guidance on the selection of regional neighborhoods.

By fitting our model to precipitation data over the whole contiguous U.S., a diverse and complex tail dependence structure emerges, revealing rich and intuitive spatial patterns. Moreover, our model can be used in practice to compute return periods associated with simultaneous extreme events.

Although our proposed non-stationary model has great additional flexibility over its stationary counterpart, it is still unable to capture weak dependence at large distances, which is, however, a common feature of most models for spatial extremes. This implies that the fitted model needs to be interpreted locally, or perhaps regionally. Further research should, therefore, be devoted to developing models for spatial and spatio-temporal extremes that can more realistically capture long-range independence. Morris et al. (2017) discuss one way to achieve this based on random partitions of the study region.


We thank Luigi Lombardo (KAUST) for cartographic support and Eduardo González (KAUST) for computational support. We extend our thanks to Dan Cooley (Colorado State University) for helpful comments and suggestions. Support from the KAUST Supercomputing Laboratory and access to Shaheen is also gratefully acknowledged.

Appendix A Simplified expressions for the model likelihood

In this appendix, we obtain simplified expressions for the local censored likelihood function (3) based on Model (2.2), in order to compute the maximum likelihood estimates in a reasonable time, while avoiding numerical instabilities.

a.1 Joint density of the process in (2.2)

From expression (2.4), we have

where , , , , , . The marginal density with can be easily deduced.

a.2 Joint distribution of the process in (2.2)

Here we express the finite-dimensional distribution as a function of conditional multivariate normal distributions. This alternative formulation allows us to use efficient routines implemented in R, substantially improving accuracy and execution time. Using integration by parts in expression (2.3), we obtain

where and and are the conditional mean and covariance matrix of , respectively. Precisely,

where denotes the th column of the matrix with the th row removed, etc. Now,


where follows the standard multivariate normal distribution in dimension , and is independent of the univariate normal random variable with mean and unit variance, and where . To compute (A.1), notice that the -dimensional vector is jointly multivariate normal, and that has mean and covariance matrix . Hence, we have


Finally, we obtain


In particular, by setting , the marginal distribution may be written as


a.3 Partial derivatives of the joint distribution

We want to compute , the partial derivative of with respect to the variables indexed by the set . Without loss of generality, we here assume that , for some integer . Writing , we can express the covariance matrix in block notation as follows:

Then, writing , with and , we have