1 Introduction
1.1 Changing distribution of surface air temperature across time
The probability distribution of surface air temperature (SAT) possesses nonstationary traits, such as seasonality and longterm trends, that can be difficult to capture with offtheshelf models. This fact is particularly true for the tails of the distribution. In this paper, we study SAT using a model for the entire probability distribution that has versatile behavior in both tails. Some areas where such a model is useful include finance, insurance, and environmental science (Finkenstädt and Rootzén, 2003; Reiss and Thomas, 2007). The tails of the distribution may be more important than the bulk, as unlikely events in the stock market or environment can have more severe consequences than an ordinary event. With temperature data, both tails of the model distribution are important and of particular public interest due to growing impacts from climate change on human health, the environment, and the economy. Early in 2021, Texas experienced record low temperatures in February from winter storms, which led to massive power grid failures (Busby et al., 2021), while in the Pacific Northwest of America, record high temperatures in June led to increased hospitalizations and fatalities (Schramm et al., 2021; Philip et al., 2021). On longer time scales (months to years), both tails of precipitation distributions matter for the same reasons as temperature. Even when the focus is on the extremes, an understanding of the full distribution is often still of interest. Temporal changes may be different in the upper and lower tails, and nonstationary patterns such as seasonality and longterm trends will also be observed in the bulk of the distribution. To our knowledge, this study is the first attempt to model nonstationary temperature data on the daily scale with a concentration on behavior in both tails.
Temporal changes in temperature distribution are influenced by various phenomena. Seasonal variability is largely driven by the seasonal cycle in solar radiation, which depends strongly on latitude. However, largescale oceanic and atmospheric circulation patterns (McKinnon et al., 2013) and local geography (e.g., elevation, distance to the ocean) also have a large impact on the seasonal patterns at specific locations. For monthly means, temperature seasonality has been traditionally studied through circular harmonics, which have been shown to capture a large amount of the variability (up to ) (Legates and Willmott, 1990). However, at shorter time resolution, temperature exhibits more complex seasonal patterns.
Rising global mean temperature has drawn attention for many decades, and regional temperatures exhibit various rates and patterns of change, including in their extremes. A large literature has been generated on the topic; an extensive consolidation can be found in assessment reports of the Intergovernmental Panel on Climate Change (IPCC, 2021). In the following, we discuss literature focusing on statistical aspects of SAT extremes.
Most previous works have focused on statistics of a single tail of a temperature distribution but rarely on the entire distribution of temperature changing over time (Seneviratne et al., 2021). For instance, among the large quantity of work on global mean temperature, Rahmstorf and Coumou (2011) and Poppick et al. (2017) examined trends, and McKinnon et al. (2016)
studied longterm changes of temperature quantiles.
Meehl et al. (2009) showed that ratios of frequencies of daily hot maxima over frequencies of daily cold minima under nonstationary climate conditions in the United States exhibit strong asymmetry leading to more frequent hot extremes. Other studies have considered the evaluation and quantification of changing temperature in historical datasets (measurements and/or model output) (Legates and Willmott, 1990; Tarleton and Katz, 1995; Hansen et al., 2010; Rhines et al., 2017), while recent studies focus on projections under various greenhouse gas concentration pathways by leveraging the use of multimodel ensembles or large singlemodel ensembles (Huang et al., 2016; Haugen et al., 2018; Wehner, 2020). Several works have investigated the statistical aspects of extreme quantification under nonstationarity (Cheng et al., 2014; Gilleland et al., 2017; IPCC, 2021). Katz and Brown (1992)proposed a statistical hypothesis test to analyse the sensitivity of extreme events to changes in the location and shape of climate distributions, emphasizing the impact of the scale parameter on extreme occurrences.
Robin and Ribes (2020) proposed a nonstationary framework for extremes including model data and observations. Grotjahn et al. (2016) reviewed statistical methodologies, dynamics, modeling efforts, and trends dedicated to temperature extremes.Very few works have focused on jointly learning from the bulk and tails of SAT. However, pivotal insights about changing climate have been obtained by quantifying the link between changing extremes and other statistics of the bulk for the quantity of interest. For instance, Huybers et al. (2014) and Huang et al. (2016) respectively linked changing extremes to changing mean temperature using reanalysis data and climate model projections. Huybers et al. (2014) proposed a metric to link the mean to an extreme quantile. These works motivate the use of statistical models that can simultaneously assess changes in the tails and bulk of a distribution.
1.2 Statistical models for bulk and tails
A desirable property in a statistical model for a probability distribution is the ability to simultaneously make inferences about the bulk of the distribution and its upper and lower tails. A bulkandtails model has the appealing ability to produce simulations from an entire distribution with flexible behaviors in both tails, which could be useful in the context of stochastic weather generators for extremes (Semenov, 2008; Furrer and Katz, 2008).
Classical theory for extreme value statistics deals with a small fraction of the most extreme outcomes (large and/or small depending on context) of a sequence of independent, identically distributed (or stationary) random variables. Two common models, the generalized Pareto distribution (GPD) for threshold exceedances and the generalized extreme value distribution (GEV) for blockmaxima, are motivated by asymptotic theory and can be readily fit by practitioners with preexisting software. The restrictive assumption of stationarity can be addressed by treating parameters of the GPD or GEV distribution as functions of timedependent covariates
(Davison and Smith, 1990). See Coles (2001) for a more extensive introduction to extreme value theory.An essential limitation of GPD and GEV models is that they only fit a single tail of the distribution with only a small portion of the dataset. This practice of ignoring the majority of the available data is particularly concerning when trying to accurately infer any nonstationarities, as these patterns are strongly tied to the behavior in nonextreme observations. Nogaj et al. (2007), Eastoe and Tawn (2009), and Mentaschi et al. (2016)
acknowledged this issue and proposed preprocessing of the data with mean and variance functions to capture the nonstationarity. The processed data are assumed to be stationary, and standard GEV/GPD methods can then be used for inference. Analyses using this “transformedstationary” approach demonstrate the value of modeling nonstationary extremes without ignoring the bulk of the data, but this methodology is limited to a single tail at a time, and is accompanied by standard issues with GEV/GPD fitting, such as selection of a blocksize or threshold
(Coles, 2001; Scarrott and MacDonald, 2012).A recent model proposed by Stein (2020) fits the entire distribution and has flexible behavior in both tails. Other works have also attempted to bridge the tails and bulk of a distribution; see Scarrott and MacDonald (2012) for a review. These proposals largely rely on mixture models (Frigessi et al., 2002; Carreau and Bengio, 2009; Bopp and Shaby, 2017; Yadav et al., 2021)
or combining function composition with cumulative distribution functions
(Naveau et al., 2016; Tencaliec et al., 2020; Stein, 2021). Huang et al. (2019a) suggest a semiparametric approach incorporating loghistosplines. A number of the limitations of these approaches, such as flexible behavior in only one tail, restrictions to positive or heavytailed variables, or the need to numerically compute a normalizing constant, are obviated by the approach of Stein (2020), which provides a comprehensive approach to handle the bulk and both tails of a distribution.1.3 Proposed contributions
In this paper, we extend the recent work of Stein (2020) and demonstrate a flexible model for a random variable whose distribution changes over time. In particular, we extend the distribution from Stein (2020) to a seasonal model accounting for seasonality, longterm trends, and the interaction between these two characteristics. We illustrate the flexibility and ability of the model to capture changing distributions of ground measurements of daily surface temperature. In particular, the bulk, lower tail, and upper tail of temperature vary differently across seasons and over the longscale study period. A number of stations representing the diversity of seasonality and longterm changes in temperature are selected from across the United States. We highlight the capability of the proposed model to quantify these changing temperature patterns and extremes.
We outline the paper as follows: in Section 2, we describe the daily temperature data used in this study. In Section 3, we describe the model from Stein (2020) and its extension. Finally, in Section 4, we present results from applying our methodology to daily temperature data and compare it to alternatives in both the bulk and tails.
2 Surface air temperature data
Daily SAT used in this paper are provided by the National Climatic Data Center’s Global Surface Summary of the Day (GSOD) (NCEI, 2020). The GSOD database contains meteorological measurements from weather stations across the world. The R package GSODR (Sparks et al., 2017) offers a helpful interface to work with this dataset. We selected several U.S. cities from varying locations with different geographies, climates, and local conditions: Bethel, Alaska; Colorado Springs, Colorado; Minneapolis, Minnesota; Boston, Massachusetts; Hilo, Hawaii; San Diego, California; Blythe, California; and Homestead, Florida. Under the Köppen climate classification (Peel et al., 2007), Bethel is a subarctic climate (Dfc), Colorado Springs is a dry semi arid climate (Bsk) at high elevation, Minneapolis is a humid continental region (Dfa) with large seasonal variation, Boston is a mixture between humid continental climate (Dfa) and a humid subtropical climate (Cfa), Hilo is a tropical rainforest which receives significant rainfall (Af), San Diego is a mixture between a Mediterranean climate (Csa) and a semiarid climate (Bsh), Blythe is a hot desert climate (Bwh), and Homestead is a tropical savannah climate (Aw) that nears a tropical monsoon climate (Am). These eight locations are shown in Figure 1
. Time periods of the SAT observations range from the 1940’s to near present (2020). The records present missing data that are not imputed in this study. Data are recorded on an hourly basis and subsequently averaged to create daily summary values. Table
3 in Appendix A provides additional information about the SAT measurements at these eight stations.The selected stations across the U.S. exhibit different seasonal patterns of SAT, as shown in boxplots in Figure 2 (arranged very roughly in geographic order). Most stations experience a slow increase of temperature at the end of winter, followed by a faster temperature decrease at the end of the summer, indicating complex seasonality patterns not captured by a single annual harmonic. Bethel in particular sees long winters and short summers. The most northern and continental locations (i.e., all cities in the top row of Figure 2 and Blythe) exhibit the largest amplitude of seasonality. Coastal stations at Hilo, San Diego, and Homestead experience relatively warm temperatures year round; the climate at Hilo is especially uniform. San Diego has distinctive seasonality during the first part of the year (January to June), showing an almost linear increase in the temperature with an inflection around JuneJuly. Figure 2 also illustrates yeartoyear variability of SAT on each day of the year. Seasonal variability is evident in the bulk of the distributions, represented by the box heights, as well as in the tails. Variability tends to be larger in winter and is particularly strong in the most northern locations such as Bethel. Lower and upper tails often exhibit different seasonal variability, which is evident in the various whisker^{1}^{1}1See caption of Figure 2 for a description of boxplot features. lengths and the number of observations beyond the whiskers. In Bethel, many more observations lie outside the boxplot whiskers in the summer than in the winter. In Colorado Springs, values beyond the whiskers are more abundant and spread out in the lower tail than in the upper tail. Minneapolis and Homestead also present a lack of values beyond the whiskers in the upper tail during winter months, while Boston and Hilo show this behavior to a lesser extent in the lower tails during summer. The upper tail of San Diego sees the greatest number of observations beyond the whiskers, as well as the most significant spread in extremes. It is noteworthy that tail heaviness does not imply greater variability, as the SAT range in San Diego is the second smallest of the locations considered (larger only than that of Hilo). The observed asymmetries in tail behavior indicate a departure from Gaussianity and the need for nonstandard distribution models.
3 Seasonal model for bulk and tails of temperature distribution
In this section, we detail the statistical model used to fit the temperature data described in Section 2. The proposed model relies on a univariate model for bulk and tails (Section 3.1) and is extended to a seasonal model (Section 3.2) with parameters which are estimated by maximum likelihood (Section 3.3).
3.1 Model for bulk and tails
Stein (2020) proposed a flexible parametric model which we refer to as the bulkandtails (BATs) distribution. The “s” is included in the acronym to distinguish from “bulkandtail” models that tend to have limited flexibility in the lower tail. Consider the random variable whose cumulative distribution function (cdf) is given by , where is the cdf of a student random variable with degrees of freedom and is a monotoneincreasing function with six parameters that control the upper and lower tails. To be precise, define and
(1) 
where are the location, scale, and shape parameters of the upper and lower tails. Like in the GPD and GEV distributions, the parameters control the shape of the tails, with positive values producing a heavy tailed distribution and negative values producing a thin tailed distribution with bounded support in that tail. That is, if , the support has upper bound , while if , then . Similarly, if and if . The cases are defined by continuity. For example, when both shape parameters equal zero, reads
Taking the derivative of the BATs cdf with respect to , we obtain the probability distribution function (pdf) , where is the pdf of the student distribution with degrees of freedom.
Stein (2020)
showed that this distribution can behave like any threeparameter GPD in either tail. Derivatives of the BATs pdf with respect to model parameters can be calculated analytically, aiding maximum likelihood estimation. Moreover, there is no need to numerically compute any normalizing constant when writing the density, an obstacle often handled with Markov chain Monte Carlo methods
(Gelman and Meng, 1998; Møller et al., 2006). Although the BATs distribution does not directly follow from any firstorder limit theorems like in traditional extremes methodology, these properties produce a versatile and practical density for modeling purposes.In this work, we attempt to model nonstationary data by allowing the scale parameters and location parameters to change with time, both seasonally and with longterm trends. In some scenarios, it may be appropriate to allow and to also vary with time, but estimation for these parameters is already difficult when they are held constant. With a relatively simple parameterization for and , we fit our model to daily SAT records at the eight U.S. cities discussed in Section 2.
3.2 Seasonal extension with longterm trend
Here we describe the seasonal model used to fit the daily SAT data to seasonal variations with a longterm trend. To capture nonstationary behaviors in SAT, namely seasonality, climate change trends, and their interaction, we introduce covariates which allow parameters of the function (1) to depend upon time. Reasonable choices for seasonal covariates include harmonics or a periodic spline basis; we choose the latter for more flexibility in the main seasonal term, as Figure 2 shows evidence of complex seasonal patterns. For a climate change covariate, we use the logarithm of CO equivalent obtained from the PRIMAP emission time series (Gütschow et al., 2016) and freely available for download at https://www.pikpotsdam.de/parisrealitycheck/primaphist/. The PRIMAP time series is available on a yearly basis through 2018, and we regress PRIMAP values on the historic Mauna Loa CO dataset to predict the values at 2019 and 2020. In this study, the log CO equivalent serves as a proxy for climate change induced by greenhouse gases. Finally, as seasonal patterns can also be affected by the changing climate, an interaction term between seasonality and longterm trend is added, where the seasonality is modeled with annual harmonics.
Let represent the year and represent the Julian day since the first observation, modulo 365.25. Write to denote the ^{th} spline basis function at day and to denote the value of the log CO equivalent at year . Suppressing the subscripts for and , we use the following parameterizations:
(2) 
The longterm trend and its interaction with seasonality are only considered in the location parameters; we represent seasonality in the interaction term with circular harmonics for parsimony purposes. For simplicity, we use the same seasonal basis functions (periodic cubic splines from the R package pbs (Wang, 2013)) for the location and scale parameters. We fix the number of seasonal basis functions at eight. Using fewer basis functions at cities with more complicated seasonal temperature variations presented problems with the optimization convergence and also a worsened fit based on model diagnostics. We found that the choices made here provided a good overall compromise between capturing the nuances of the seasonal distributions at some sites and limiting the total number of parameters for reasons of computational stability and controlling model complexity.
3.3 Maximum likelihood estimation and its uncertainty
Suppose there are independent observations of temperature at a location. The corresponding likelihood is
(3) 
where is defined as in (1) and is the student pdf with degrees of freedom. Note that the likelihood (3) does not explicitly include the temporal dependence between daily observations, although the uncertainty quantification described in Section 3.3 does take this dependence into consideration. Ignoring the temporal dependence does not bias point estimates of marginal distributions of individual days (Varin et al., 2011). In total, there are 45 parameters which are estimated via maximum likelihood: 13 parameters each for the upper and lower locations, 8 parameters each for the upper and lower scales, two shape parameters, and the degrees of freedom . We enforced two constraints so that the likelihood is twicedifferentiable at its endpoints when is negative. Optimization was performed in Julia with the IPOPT solver (Biegler and Zavala, 2009) and automatic differentiation from ForwardDiff.jl (Revels et al., 2016) to efficiently obtain the gradient and Hessian. For our initial guess,
was set to 10, and all trendrelated parameters and shape parameters were initially set to zero. Initial seasonal coefficients for the location parameters were obtained from linear regression. Coefficients for the log scale parameters were initially set to zero, except for the intercept term, which was profiled over with a small grid of values.
Obtaining uncertainties of estimated parameters is not straightforward. We perform uncertainty quantification with a bootstrapping procedure accounting for temporal dependence. Using stratified block bootstrap resamples, it is possible to obtain confidence intervals for any desired function of the parameters with the percentile bootstrap method
(Efron and Tibshirani, 1993). Although classical bootstrapping operates under the assumption of independent and identically distributed data, the block bootstrap is commonly used in the setting of temporal dependence (Lahiri, 2003). Under the assumption that subannual temporal dependence is stronger than interannual dependence, selecting a block size of a year is natural choice, and we resample years based on a decadal stratification to preserve aspects of climate change seen over the observation period. That is, to create a bootstrapped dataset, we sample years (with replacement) from each decade, preserving the number of years in each decade from the original dataset. Maximum likelihood estimation is performed on each bootstrapped dataset, and the desired functions of parameters (e.g., quantiles, changes in quantiles over years, or parameters themselves) are computed for each fit. After ordering the bootstrapped quantities of interest from smallest to largest, pointwise 95% confidence intervals are obtained as the 2.5 and 97.5 percentiles of the ordered values. Bootstrap parameter estimates were obtained using the local optima from the fit on the full dataset as the initial guess. Example confidence intervals from 200 bootstrap samples are shown later in Figure 6 in Section 4.3 and in Table 4 in Appendix B. We emphasize that these uncertainty estimates are computed for the daily marginal distribution of SAT.4 Results on changing daily mean temperature distributions
In this section, we present results of the proposed model fit to daily mean SAT. Section 4.1 provides a visual evaluation of the quality of the fit models. Section 4.2
shows a quantitative comparison of the proposed model to several benchmark models. In particular, we compare with skewnormal and generalized Pareto distributions to respectively assess the bulk and each tail of the fit distribution. Finally, in Section
4.3, we highlight how the proposed model is able to capture the changing seasonal patterns due to longterm trends.To begin, we describe the benchmark distributions which are compared to the BATs distribution. First, with less of an emphasis on the tails of the distribution, we compare to a skewnormal distribution with timevarying location, scale, and skewness parameters. The skewnormal distribution was found to provide reasonable fits for temperatures in
Stein (2021). With and respectively denoting the pdf and cdf of a standard normal random variable, the skewnormal pdf is(4) 
where is a skewness parameter, is a location parameter, and is a scale parameter. We use the same splinebasis parameterization for the location and scale parameters as in (2) and also let the skewness vary in time using the same parameterization as the location parameter but without a climatechange covariate. Maximum likelihood estimation for the skewnormal model is performed in Julia.
To focus on the tails of the distribution, we compare with nonstationary GPD models. The GPD pdf is
(5) 
where is a userspecified threshold, is a shape parameter, and is a location parameter. If the shape parameter is negative, the tail is bounded; if the shape is nonnegative, the tail is unbounded. Specifically, the GPD support is where if and if We construct the GPD threshold as a quantile regression^{2}^{2}2See Appendix C for a description of quantile regression. at the quantile, using eight periodic spline basis functions as covariates. When considering the lower tail, we can multiply the data by and work in the typical peaksoverthreshold setting. For GPD parameters, we consider a constant shape parameter and a temporal scale parameter whose logarithm varies in time like the location parameter in (2). All GPD models were fit by maximum likelihood with the R package extRemes (Gilleland and Katz, 2016).
4.1 Seasonal quantile evaluation
Given the model parameter estimates, it is possible to express the distribution of daily mean SAT at any day and year, provided that the log CO equivalent is available for that year. First, in Figure 3, we examine estimated quantiles during the year 2020. Quantiles from the BATs model are shown for values 0.001, 0.01, 0.1 (blue), 0.25, 0.5, 0.75 (green), and 0.9, 0.99, 0.999 (red). The black lines in each panel are observed daily maximum, minimum, and medians taken across each day of the year over all years. These maximum and minimum values serve as a proxy for upper and lower tail descriptions which would require a large amount of data to quantify precisely. We also emphasize that these maximum and minimum values are taken from the entire time series, unlike the BATs curves that are shown for the year 2020. Overall, the fitted BATs model provides a very flexible representation across the year for the bulk and both tails of the distributions, capturing: a) larger spread in both tails and sometimes in the interquartile range during winter (e.g., Colorado Springs and Homestead); b) different spread in the lower and upper tails (e.g., San Diego); c) different seasonality patterns in each tail and in the interquartile range, such as in Bethel, where the upper and lower tails exhibit significantly different seasonal patterns; d) and asymmetric seasonal patterns across the year for the eight stations where the temperature increase at the end of the winter is slower than the decrease in early fall. Quantile regression can be used to study quantiles and tail behaviors in similar fashion to Figure 3, but this approach typically involves separate analyses of the desired quantiles and comes with the possibility of producing crossing quantile curves. To avoid crossing quantiles in quantile regression, practitioners will need to turn to specialized methods (He, 1997; Chernozhukov et al., 2010; Cannon, 2018). The use of the BATs model for the entire distribution eliminates the risk of crossing quantiles.
An interesting point in Figure 3 is the large spike in January in Hilo, which corresponds to heavy rainfall and flash flooding on January 12, 2020. In terms of the BATs model, this event corresponds to the quantile, which is extremely far in the estimated tail of the distribution given the approximately 50year duration of the Hilo data. After refitting with this value removed, the January 12 value unsurprisingly lies even further in the tail at the quantile, and the upper tail behavior (see Table 4) changes from to . While this example illustrates that a single outlying observation can have a drastic impact on tail behavior, it is noteworthy that the BATs lower tail behavior is unaffected by the January 12 2020 observation and remains around even after removal. Overall, the estimates appear to provide good fits to the seasonal evolution of SAT in the different locations considered.
To further explore the distributions across the year, we plot individual estimated monthly densities for the first day of each of January, April, July, and October (all in 2020) from parametric models (BATs and skewnormal) along with empirical kernel density estimates (KDE) in Figure
4. Seasonal BATs and skewnormal curves are shown for the year 2020, while the KDE curves come from a Gaussian kernel density estimate using observations from a fifteenday window over all available years.^{3}^{3}3See Appendix D for a more precise definition of the kernel density estimate. The three density curves at each location seem to largely agree with each other for all seasons and locations, but there are some notable differences. For example, in the winter and spring at Bethel, the KDE suggests a multimodal density which cannot be captured by the skewnormal estimate. In contrast, the BATs distribution has the ability to model bimodal distributions (as shown in Stein (2020)). The presence of pronounced shoulders to the distributions which the BATs distribution can model but the skewnormal cannot is also found in the winters of Minneapolis and Homestead. These results suggest that the seasonal BATs model can capture daily mean temperature distributions more reliably than a skewnormal distribution and provide flexible densities matching observed ones. A more quantitative comparison of the skewnormal and BATs models is continued in the next section, along with a closer examination of tail properties.4.2 Crossvalidation comparison
We further compare our methodology to two relevant models through a quantitative analysis. To compare entire distributions, we calculate the crossvalidated continuous ranked probability score (CRPS) (Gneiting and Raftery, 2007) of the BATs and skewnormal models over the entire year and for each of the four seasons individually. CRPS is typically used in the evaluation of probabilistic forecasts as a tool to assess calibration (statistical consistency between forecast and verification data) and sharpness (dispersion of the distribution to be evaluated). By definition, CRPS values are nonnegative, and smaller values indicate a better match. The CRPS between a single observation and cdf is defined as
(6) 
The CRPS is typically reported as the average of instantaneous CRPS as in (6) over a set of observations. Scores here are calculated in a crossvalidation procedure where folds are obtained from blocks of four or five (consecutive) years, leading to between 10 and 20 folds depending upon the station. A scaled version of the average difference (BATs minus skewnormal) for CRPS values within a given fold is shown as a percentage in Table 1. In all cities, the CRPS difference value is negative, indicating that the BATs model fits better over the whole year, but there are some combinations of seasons and cities where the skewnormal performs better. The most noticeable improvement provided by BATs is seen in Bethel, particularly during the wintertime, consistent with the visual evidence of skewnormal misfitting the winter months shown in Figure 4.
BET (AK)  BLY (CA)  BOS (MA)  COL (CO)  HIL (HI)  HOM (FL)  MIN (MN)  SAN (CA)  

Year  0.93  0.14  0.10  0.10  0.23  0.18  0.01  0.44 
DJF  3.05  0.27  0.05  0.07  0.13  0.12  0.18  0.22 
MAM  0.56  0.09  0.10  0.11  0.05  0.05  0.08  0.70 
JJA  0.02  0.48  0.15  0.21  0.24  0.12  0.08  0.42 
SON  0.17  0.09  0.10  0.01  0.49  0.39  0.12  0.40 
To compare the GPD and the nonthresholded models (BATs and skewnormal), we use a variant of the CRPS. A direct CRPS comparison is not sensible because the GPD only fits a single tail of a distribution. One option to deal with this is to change the cdf in the CRPS definition (6) to the cdf conditional on being larger than the threshold. Conditioning puts the cdf of the BATs model on the same range, , as the GPD model when looking above the threshold. However, this approach does not take into account the model’s ability to estimate the probability of being larger than the chosen threshold. Instead, we consider a new random variable
which is censored to equal the threshold when an observation falls below the threshold. For the BATs and skewnormal models, which fit the entire temperature distribution, this censored cdf equals
and for the GPD models, the censored cdf equals
As indicated by the common notation, the censoring value and the GPD threshold are identical, but in theory they could be different. Here, corresponds to the quantile regression level that creates the threshold .
To assess only tail behavior, we consider a weighted CRPS (Gneiting and Ranjan, 2011; Taillardat et al., 2019) where the weight function in the integral equals one if the observation lies in the tail and zero otherwise. To be precise, the indicatorweighted score expresses as
where is a censored cdf described above. The role of is to examine behavior of the distribution far in the tail by ignoring values smaller than . Both the GPD threshold and the wCRPS threshold are obtained from a crossvalidated quantile regression with eight periodic splines as covariates, with taken as the quantile and as the quantile, where . A similar procedure can be conducted to assess the lower tails of the distributions. Results from this comparison are shown in Table 2. For the most part, the BATs model and skewnormal appear to provide a slightly better fit than the GPD in this tail analysis. In Bethel, the skewnormal distribution fits the lower tail poorly, due primarily to its inability to reproduce behavior in DJF, as seen earlier in Figure 4 and Table 1. Besides the lower tail in Bethel, the BATs model and skewnormal are relatively comparable with a slight advantage to BATs. Boxplots for Colorado Springs in Figure 2 showed asymmetric behavior in the upper and lower tails, with the lack of observations beyond the upper boxplot whiskers indicating a nonGaussian upper tail. Accordingly, the BATs and skewnormal wCRPS values in the lower tail are quite similar, yet the BATs model provides a better fit in the upper tail. In cases where the performance of BATs is particularly poor—namely, the lower tails at Blythe and Homestead—the trouble largely comes from misfit in the first cross validation fold, where fitting was done on the holdout data which did not include initial observation years due to the configuration of the cross validation folds. The temperature records at these two locations had significant gaps after their earliest years, with Homestead missing over a decade, and Blythe missing over 30 years (see Appendix A).
BET (AK)  BLY (CA)  BOS (MA)  COL (CO)  HIL (HI)  HOM (FL)  MIN (MN)  SAN (CA)  

BATs  Skew  BATs  Skew  BATs  Skew  BATs  Skew  BATs  Skew  BATs  Skew  BATs  Skew  BATs  Skew  
0.95  0.53  0.42  0.51  0.87  0.13  0.03  0.46  0.30  4.08  2.89  2.41  2.38  0.28  0.25  1.20  1.12 
0.99  0.17  0.19  0.71  1.05  0.10  0.03  0.35  0.10  2.06  1.12  2.26  2.17  0.09  0.00  0.04  0.04 
0.995  0.13  0.17  0.84  1.19  0.06  0.06  0.36  0.18  1.28  0.63  2.06  1.89  0.09  0.01  0.18  0.00 
0.05  0.35  3.34  0.27  0.27  0.13  0.06  0.23  0.21  0.81  0.46  0.24  0.30  0.37  0.15  1.50  0.10 
0.01  0.06  13.38  0.91  0.26  0.05  0.02  0.09  0.09  0.50  0.03  0.18  0.04  0.30  0.90  0.82  0.15 
0.005  0.09  20.44  0.94  0.37  0.06  0.00  0.06  0.06  0.42  0.07  0.30  0.06  0.25  1.18  0.45  0.13 
4.3 Changing distributions over the years
We conclude our analysis by illustrating the ability of the proposed nonstationary BATs model to capture changes in SAT distributions and their seasonal patterns between years in a nonstationary climate. Our model explicitly accounts for a proxy of climate change through the log CO equivalent incorporated in the model parameters and through the interaction between seasonality and this longterm trend. First, in Figure 5, we show how estimated seasonal quantiles from the seasonal BATs model fits evolve over the period of the observational record. Specifically, the graphs use gradations of shading from grey to black to show time evolution of the differences of the estimated 0.001, 0.1, 0.5, 0.9, and 0.999 yearly quantiles from the 0.5 quantile at the starting year. Most combinations of quantiles and stations show a warming trend (the curves become darker with increasing time). However, the details of this trend vary substantially across locations, quantile levels, and time of the year. Bethel displays substantial warming, particularly in its lower quantiles at winter months. Spring and summer months in San Diego also experience a prominent increase in the 0.999 quantile. The longterm trend and its interaction with seasonal patterns can capture evolving seasonal patterns. Changes in seasonality over the years exhibit very different patterns depending upon station location and the quantile level considered. For instance, Hilo has the largest increases in median and hot temperatures over late summer and early fall, whereas cold quantiles at Bethel warm the most during winter. A notable departure from warming is seen at Hilo, Blythe, and Colorado Springs, where the 0.001 quantile shows a cooling trend in August through December. Many cold quantiles exhibit a less pronounced seasonality in the recent years compared with past years. In particular, the seasonality of the winter cold quantiles at warmer cities flattens over time; this phenomenon is less prominent in the warm quantiles.
Figure 6 provides uncertainty estimates for the changes in the 0.001 and 0.999 quantiles between the first and last years of the studied period. These results are obtained from the stratified block bootstrap procedure accounting for temporal dependence discussed in Section 3.3. Note that, in general, the uncertainty bounds are not symmetric around the estimated quantile change. In particular, San Diego exhibits a very large uncertainty associated with change in 0.999 quantile from April to June, consistent with the high variability in the upper tail observed in Figures 2 and 3 at that time of the year. Table 4 shows that the BATs model in San Diego favors a heavy upper tail, and the confidence interval for the shape parameter of the upper GPD tail also includes positive values, indicating that a heavy tail is a possibility. Typical shape parameter estimates for temperature distributions range from to 0 or at times are slightly positive (Nogaj et al., 2006; Gilleland and Katz, 2006; Bommier, 2014). Indeed, Figure 6 and confidence intervals from Table 4 suggest that the BATs estimate of the upper tail at San Diego may be too heavy. This large upper tail uncertainty in San Diego must be interpreted with caution, but overall, the block bootstrap helps us quantify uncertainty without neglecting temporal dependence or climate change.
Finally, we consider the yearly evolution of quantile differences across the study period (Figure 7), which enables us to quantify changes in the spread of bulk and tails of the distribution. Specifically, we examine the interquartile range and quantile differences in the tails to assess how the bulk spread and tail spreads change over time. For comparison with the BATs model fits, quantile estimates are also obtained from a quantile regression whose covariates are eight periodic splines to quantify seasonal patterns, a longterm trend to quantify climate change, and an interaction term for longterm change in seasonal patterns (i.e., same covariates as the location parameter in (2)). We observe a strong spatial variability in the behavior of quantile spreads. In general, the width of the bulk of the distribution decreases over time, with the IQR decreasing over time in most places (e.g., Bethel) but increasing elsewhere (e.g., Hilo). Quantile differences in both the near (orange and purple) and far (red and blue) tails experience different rates of changes for cold and warm temperature. Huang et al. (2016) and Rhines et al. (2017), respectively in climate projections and historical reanalysis, reported a decrease in variability of winter temperature in most North American regions. The far warm tails (red) often show a stretching trend, whereas the near cold tails (purple) tend to contract. For most locations considered, the far tail spreads (red and blue) are smaller than the near tail spreads (orange and purple), with some few exceptions. In Colorado Springs and Homestead, both lower tail spreads are larger than the upper tail spreads. In San Diego, both upper tail spreads are larger than the lower tail spreads, and the far upper tail spread has changed at a pace that rivals the interquartile range. At Hilo, all measures of spread appear to be increasing over time. For the most part, results from BATs and quantile regression agree with one another in capturing these aspects of SAT. The largest discrepancy is in Blythe, where the BATs model lacks the increase in far lower tail spread and decrease in far upper tail spread found in results for the quantile regression. This difference seems to result from the many years of missing data at Blythe, as a stationary quantile regression for each decade (shown with horizontal lines) does not exhibit this crossing of far upper and lower tail spreads. Other stations also display a general characteristic of “flatness” in the BATs results when compared to the corresponding estimates from quantile regression. Allowing other parameters in the BATs model to depend upon log CO may help capture more subtle yeartoyear changes in distribution.
5 Discussion and Conclusions
In this paper, we have applied the seasonal bulkandtails (BATs) model with a longterm trend to daily mean surface air temperature data from eight cities in the United States. The model has demonstrated its ability to capture different seasonality and climate change patterns as well as different behaviors in the upper and lower tails. A crossvalidation comparison shows superior fit of the BATs distribution to the skewnormal distribution on the entire distribution at all cities and to the generalized Pareto distribution in both tails at many cities.
For practitioners, this model offers a great flexibility with a closedform density that is relatively easy to work with. Only a few statistical models proposed in extremes literature possess a density describing the bulk and both tails (Naveau et al., 2016; Tencaliec et al., 2020; Stein, 2020, 2021). Extreme value analysis is still traditionally performed with blockmaxima or peakoverthreshold techniques, which limits analysis to a tail at a time and avoids an allencompassing model that describes the entire distribution (Coles, 2001). The proposed framework also allows parameters in the BATs distribution to change over time, both seasonally and longterm. A stratified block bootstrap procedure was conducted for uncertainty quantification.
Of the seven parameters in the standard BATs model, we have taken the shape parameters and and the smoothness to be constant over time. This choice was made due to the difficulty of estimating the shape and smoothness even when these parameters are constant. To capture longterm variation of the nonstationary parameters, we included an annual log CO equivalent covariate in the location parameters. Other parameters could depend upon this covariate, particularly the scale parameters, which would reflect a longterm change in the temperature variability, or the shape parameters, which would reflect longterm changes in tail behavior. Figure 7 indicates that it may be worth allowing other parameters to depend on the longterm trend and its seasonal interaction. Although the logarithm of CO equivalent serves as a proxy for climate change induced by greenhouse gases, it is important to note that not all longterm changes in this paper are due to globalscale anthropogenic climate change. Some sites may be affected by localized changes, particularly the larger cities and those with measurements taken at airports.
The proposed model provides a flexible statistical tool to examine the bulk and tails of distributions changing over time. We have quantified how different parts of the SAT distribution (i.e., different quantiles) exhibit different seasonal and longterm changes at cities representative of different geographies and local climates. Hot and cold quantiles tend to experience larger longterm changes than median quantiles and also present different seasonal patterns. In particular, studied locations with warmer climates experience the most warming for hot quantiles in summer, and locations with colder climate experience the most warming for cold quantiles in winter.
Finally, the proposed model is fit to observations while neglecting spatial and temporal dependence. Modeling multidimensional extremes (as would be needed to explicitly account for spatial and/or temporal dependence) is an ongoing challenge for the community (Huser and Wadsworth, 2020), where various techniques have been proposed, relying on hierarchical models (Gaetan and Grigoletto, 2007), copulas (Lee and Joe, 2018; Krupskii and Genton, 2021), and conditional modeling (Wadsworth and Tawn, 2019; Simpson and Wadsworth, 2021; Huang et al., 2021). Most of these methods ignore the bulk of the data or only focus on a single tail. Developing a bulkandtails model for multiple variables is a formidable task because of the wide range of behaviors that can occur in distributions for multivariate extremes (Huser and Genton, 2016; Huser et al., 2017; Huser and Wadsworth, 2019; Huang et al., 2019b; Davison et al., 2013).
Acknowledgements
Part of the effort of Mitchell Krock, Julie Bessac and Michael Stein is based in part on work supported by the U.S. Department of Energy, Office of Science, Office of Advanced Scientific Computing Research (ASCR) under Contract DEAC0206CH11347. Adam Monahan acknowledges the support of the Natural Sciences and Engineering Research Council of Canada (NSERC) (funding reference RGPIN2019204986).
References
 Largescale nonlinear programming using ipopt: an integrating framework for enterprisewide dynamic optimization. Computers & Chemical Engineering 33, pp. 575–582. External Links: ISBN 9780444531575, Document Cited by: §3.3.
 Peaksoverthreshold modelling of environmental data. Ph.D. Thesis, Uppsala University. Cited by: §4.3.
 An exponential–gamma mixture model for extreme Santa Ana winds. Environmetrics 28 (8), pp. e2476. Cited by: §1.2.
 Cascading risks: understanding the 2021 winter blackout in texas. Energy Research & Social Science 77, pp. 102106. External Links: ISSN 22146296, Document, Link Cited by: §1.1.

Noncrossing nonlinear regression quantiles by monotone composite quantile regression neural network, with application to rainfall extremes
. Stochastic Environmental Research and Risk Assessment 32, pp. . External Links: Document Cited by: §4.1.  A hybrid Pareto model for asymmetric fattailed data: the univariate case. Extremes 12 (1), pp. 53–76. Cited by: §1.2.
 Nonstationary extreme value analysis in a changing climate. Climatic change 127 (2), pp. 353–369. Cited by: §1.1.
 Quantile and probability curves without crossing. Econometrica 78 (3), pp. 1093–1125. External Links: ISSN 00129682, Document, Link, MathReview (Chengyong Tang) Cited by: §4.1.
 An introduction to statistical modeling of extreme values. Springer Series in Statistics, SpringerVerlag London, Ltd., London. External Links: ISBN 1852334592, Document, Link, MathReview Entry Cited by: §1.2, §1.2, §5.
 Geostatistics of dependent and asymptotically independent extremes. Math. Geosci. 45 (5), pp. 511–529. External Links: ISSN 18748961, Document, Link, MathReview Entry Cited by: §5.
 Models for exceedances over high thresholds. J. Roy. Statist. Soc. Ser. B 52 (3), pp. 393–442. Note: With discussion and a reply by the authors External Links: ISSN 00359246, MathReview Entry Cited by: §1.2.
 Modelling nonstationary extremes with application to surface level ozone. Journal of the Royal Statistical Society: Series C (Applied Statistics) 58 (1), pp. 25–45. External Links: Document, Link, https://rss.onlinelibrary.wiley.com/doi/pdf/10.1111/j.14679876.2008.00638.x Cited by: §1.2.
 An introduction to the bootstrap. Monographs on Statistics and Applied Probability, Vol. 57, Chapman and Hall, New York. External Links: ISBN 0412042312, Document, Link, MathReview (Michael Falk) Cited by: §3.3.
 Extreme values in finance, telecommunications, and the environment. Chapman & Hall/CRC. Cited by: §1.1.
 A dynamic mixture model for unsupervised tail estimation without threshold selection. Extremes 5 (3), pp. 219–235 (2003). External Links: ISSN 13861999, Document, Link, MathReview Entry Cited by: §1.2.
 Improving the simulation of extreme precipitation events by stochastic weather generators. Water Resources Research 44 (12), pp. . External Links: Document, Link, https://agupubs.onlinelibrary.wiley.com/doi/pdf/10.1029/2008WR007316 Cited by: §1.2.
 A hierarchical model for the analysis of spatial rainfall extremes. Journal of agricultural, biological, and environmental statistics 12 (4), pp. 434–449. Cited by: §5.
 Simulating normalizing constants: from importance sampling to bridge sampling to path sampling. Statist. Sci. 13 (2), pp. 163–185. External Links: ISSN 08834237, Document, Link, MathReview Entry Cited by: §3.1.
 Quantifying the risk of extreme events under climate change. Chance 30 (4), pp. 30–36. Cited by: §1.1.
 Analyzing seasonal to interannual extreme weather and climate variability with the extremes toolkit. 18th Conference on Climate Variability and Change, 86th American Meteorological Society (AMS) Annual Meeting 29, pp. . Cited by: §4.3.
 extRemes 2.0: An Extreme Value Analysis Package in R. Journal of Statistical Software 72 (1), pp. 1–39. External Links: Document Cited by: §4.
 Strictly proper scoring rules, prediction, and estimation. Journal of the American Statistical Association 102 (477), pp. 359–378. External Links: ISSN 01621459, Document, Link, MathReview Entry Cited by: §4.2.
 Comparing density forecasts using threshold and quantileweighted scoring rules. J. Bus. Econom. Statist. 29 (3), pp. 411–422. External Links: ISSN 07350015, Document, Link, MathReview (Yuzo Hosoya) Cited by: §4.2.
 North American extreme temperature events and related large scale meteorological patterns: a review of statistical methods, dynamics, modeling, and trends. Climate Dynamics 46 (3), pp. 1151–1184. Cited by: §1.1.
 The PRIMAPhist national historical emissions time series. Earth System Science Data 8 (2), pp. 571–603. External Links: Document Cited by: §3.2.
 Global surface temperature change. Reviews of Geophysics 48 (4). Cited by: §1.1.
 Estimating changes in temperature distributions in a large ensemble of climate simulations using quantile regression. Journal of Climate 31 (20), pp. 8573–8588. Cited by: §1.1.
 Quantile curves without crossing. The American Statistician 51 (2), pp. 186–192. External Links: Document, Link Cited by: §4.1.
 Estimating concurrent climate extremes: A conditional approach. Weather and Climate Extremes 33, pp. 100332. External Links: ISSN 22120947, Document Cited by: §5.
 Estimating precipitation extremes using the loghistospline. Environmetrics 30 (4), pp. e2543. Cited by: §1.2.
 New exploratory tools for extremal dependence: networks and annual extremal networks. Journal of Agricultural, Biological and Environmental Statistics 24 (3), pp. 484–501. Cited by: §5.
 Estimating changes in temperature extremes from millennialscale climate simulations using generalized extreme value (GEV) distributions. Advances in Statistical Climatology, Meteorology and Oceanography 2 (1), pp. 79–103. Cited by: §1.1, §1.1, §4.3.
 Nonstationary dependence structures for spatial extremes. J. Agric. Biol. Environ. Stat. 21 (3), pp. 470–491. External Links: ISSN 10857117, Document, Link, MathReview Entry Cited by: §5.
 Bridging asymptotic independence and dependence in spatial extremes using Gaussian scale mixtures. Spat. Stat. 21 (part A), pp. 166–186. External Links: Document, Link, MathReview Entry Cited by: §5.
 Modeling spatial processes with unknown extremal dependence class. J. Amer. Statist. Assoc. 114 (525), pp. 434–444. External Links: ISSN 01621459, Document, Link, MathReview Entry Cited by: §5.
 Advances in statistical modeling of spatial extremes. Wiley Interdisciplinary Reviews: Computational Statistics, pp. e1537. Cited by: §5.
 US daily temperatures: The meaning of extremes in the context of nonnormality. Journal of Climate 27 (19), pp. 7368–7384. Cited by: §1.1.
 Climate change 2021: the physical science basis. contribution of working group i to the sixth assessment report of the intergovernmental panel on climate change. Cambridge University Press. In Press.. Cited by: §1.1, §1.1.
 Extreme events in a changing climate: variability is more important than averages. Climatic change 21 (3), pp. 289–302. Cited by: §1.1.
 Quantreg: quantile regression. Note: R package version 5.85 External Links: Link Cited by: Appendix C.
 Conditional normal extremevalue copulas. Extremes, pp. 1–29. Cited by: §5.
 Resampling methods for dependent data. Springer Series in Statistics, SpringerVerlag, New York. External Links: ISBN 0387009280, Document, Link, MathReview (M. Hušková) Cited by: §3.3.
 Multivariate extreme value copulas with factor and tree dependence structures. Extremes 21 (1), pp. 147–176. Cited by: §5.
 Mean seasonal and spatial variability in global surface air temperature. Theoretical and applied climatology 41 (1), pp. 11–21. Cited by: §1.1, §1.1.
 The changing shape of northern hemisphere summer temperature distributions. Journal of Geophysical Research: Atmospheres 121 (15), pp. 8849–8868. Cited by: §1.1.
 The spatial structure of the annual cycle in surface temperature: amplitude, phase, and lagrangian history. Journal of Climate 26 (20), pp. 7852–7862. Cited by: §1.1.
 Relative increase of record high maximum temperatures compared to record low minimum temperatures in the US. Geophysical Research Letters 36 (23). Cited by: §1.1.
 The transformedstationary approach: a generic and simplified methodology for nonstationary extreme value analysis. Hydrology and Earth System Sciences 20 (9), pp. 3527–3547. External Links: Link, Document Cited by: §1.2.
 An efficient Markov chain Monte Carlo method for distributions with intractable normalising constants. Biometrika 93 (2), pp. 451–458. External Links: ISSN 00063444, Document, Link, MathReview Entry Cited by: §3.1.
 Modeling jointly low, moderate, and heavy rainfall intensities without a threshold selection. Water Resources Research 52 (4), pp. 2753–2769. Cited by: §1.2, §5.
 Global Surface Summary of Day (GSOD), National Centers for Environmental Information, National Oceanic and Atmospheric Administration, U.S. Department of Commerce. https://www7.ncdc.noaa.gov/CDO/GSOD_DESC.txt. Cited by: §2.
 Amplitude and frequency of temperature extremes over the North Atlantic region. Geophysical Research Letters 33 (10), pp. L10801. External Links: Link, Document Cited by: §4.3.
 Nonstationary extreme models and a climatic application. Nonlinear Processes in Geophysics 14 (3), pp. 305–316. External Links: Link, Document Cited by: §1.2.
 Updated world map of the KoppenGeiger climate classification. Hydrology and Earth System Sciences Discussions 4, pp. . External Links: Document Cited by: §2.
 Rapid attribution analysis of the extraordinary heatwave on the Pacific Coast of the US and Canada June 2021. Note: https://www.worldweatherattribution.org/westernnorthamericanextremeheatvirtuallyimpossiblewithouthumancausedclimatechange/ Cited by: §1.1.
 Estimating trends in the global mean temperature record. Advances in Statistical Climatology, Meteorology and Oceanography 3 (1), pp. 33–53. Cited by: §1.1.
 Increase of extreme events in a warming world. Proceedings of the National Academy of Sciences 108 (44), pp. 17905–17909. Cited by: §1.1.
 Statistical analysis of extreme values with applications to insurance, finance, hydrology and other fields. Third edition, Birkhäuser Verlag, Basel. Note: With 1 CDROM (Windows) External Links: ISBN 9783764372309, MathReview Entry Cited by: §1.1.
 Forwardmode automatic differentiation in julia. CoRR abs/1607.07892. External Links: Link, 1607.07892 Cited by: §3.3.
 Seasonally resolved distributional trends of North American temperatures show contraction of winter variability. Journal of Climate 30 (3), pp. 1139–1157. Cited by: §1.1, §4.3.
 Nonstationary extreme value analysis for event attribution combining climate models and observations. Advances in Statistical Climatology, Meteorology and Oceanography 6 (2), pp. 205–221. External Links: Link, Document Cited by: §1.1.
 A review of extreme value threshold estimation and uncertainty quantification. REVSTAT 10 (1), pp. 33–60. External Links: ISSN 16456726, MathReview Entry Cited by: §1.2, §1.2.
 HeatRelated Emergency Department Visits During the Northwestern Heat Wave. Cited by: §1.1.
 Simulation of extreme weather events by a stochastic weather generator. Climate Research  CLIMATE RES 35, pp. 203–212. External Links: Document Cited by: §1.2.
 Weather and climate extreme events in a changing climate. in: climate change 2021: the physical science basis. contribution of working group i to the sixth assessment report of the intergovernmental panel on climate change. Cambridge University Press. In Press.. Cited by: §1.1.
 A reliable databased bandwidth selection method for kernel density estimation. J. Roy. Statist. Soc. Ser. B 53 (3), pp. 683–690. External Links: ISSN 00359246, MathReview Entry Cited by: Appendix D.
 Conditional modelling of spatiotemporal extremes for red sea surface temperatures. Spatial Statistics 41, pp. 100482. Cited by: §5.

GSODR: Global Summary Daily Weather Data in R.
The Journal of Open Source Software
2, pp. . External Links: Document Cited by: §2.  A parametric model for distributions with flexible behavior in both tails. Environmetrics. Cited by: §1.2, §1.3, §1.3, §3.1, §3.1, §4.1, §5.
 Parametric models for distributions when interest is in extremes with an application to daily temperature. Extremes 24 (2), pp. 293–323. External Links: ISSN 13861999, Document, Link, MathReview Entry Cited by: §1.2, §4, §5.
 Extreme events evaluation using CRPS distributions. External Links: 1905.04022 Cited by: §4.2.
 Statistical explanation for trends in extreme summer temperatures at Phoenix, Arizona. Journal of Climate 8 (6), pp. 1704–1708. Cited by: §1.1.
 Flexible semiparametric Generalized Pareto modeling of the entire range of rainfall amount. Environmetrics 31 (2), pp. e2582. Cited by: §1.2, §5.
 An overview of composite likelihood methods. Statist. Sinica 21 (1), pp. 5–42. External Links: ISSN 10170405, MathReview Entry Cited by: §3.3.
 Higherdimensional spatial extremes via singlesite conditioning. arXiv preprint arXiv:1912.06560. Cited by: §5.
 Pbs: periodic b splines. Note: R package version 1.1 External Links: Link Cited by: §3.2.
 Characterization of long period return values of extreme daily temperature and precipitation in the CMIP6 models: Part 2, projections of future change. Weather and Climate Extremes 30, pp. 100284. Cited by: §1.1.
 Spatial hierarchical modeling of threshold exceedances using rate mixtures. Environmetrics 32 (3), pp. e2662. Cited by: §1.2.
Appendix A Complementary Data Information
%  Start  Missing  

BET (AK)  89  1945  
BLY (CA)  86  1942  19451972 
BOS (MA)  95  1943  
COL (CO)  100  1942  19651972 
HIL (HI)  99  1973  
HOM (FL)  94  1943  19461955, 19711972, 20002004 
MIN (MN)  100  1945  19651972 
SAN (CA)  98  1945 
Appendix B Parameter Estimates
BET (AK)  0.019  (0.031, 0.027)  0.048  (0.144, 0.008)  0.234  (0.328, 0.184)  0.146  (0.235, 0.130) 

BLY (CA)  0.034  (0.068, 0.013)  0.005  (0.028, 0.016)  0.222  (0.326, 0.164)  0.103  (0.188, 0.049) 
BOS (MA)  0.245  (0.327, 0.215)  0.190  (0.258, 0.149)  0.188  (0.285, 0.177)  0.221  (0.288, 0.177) 
COL (CO)  0.088  (0.134, 0.023)  0.156  (0.268, 0.071)  0.182  (0.251, 0.139)  0.165  (0.208, 0.096) 
HIL (HI)  0.051  (0.084, 0.008)  0.008  (0.036, 0.019)  0.242  (0.331, 0.176)  0.093  (0.169, 0.057) 
HOM (FL)  0.002  (0.052, 0.033)  0.009  (0.047, 0.017)  0.215  (0.285, 0.167)  0.104  (0.243, 0.060) 
MIN (MN)  0.220  (0.273, 0.157)  0.251  (0.300, 0.209)  0.187  (0.241, 0.143)  0.235  (0.303, 0.191) 
SAN (CA)  0.003  (0.141, 0.034)  0.119  (0.022, 0.239)  0.122  (0.217, 0.117)  0.006  (0.086, 0.013) 
Appendix C Quantile Regression
The ^{th} quantile ) of a random variable with cdf is defined as . Quantile regression is used to estimate quantiles as a linear function of covariates ; that is, , where is a
matrix of covariates. Estimates of the coefficient vector are given by
where is the ^{th} row of and . All quantile regressions were fit with the R package quantreg (Koenker, 2021).
Appendix D Kernel Density Estimate
Suppose we are interested in the KDE at day of year . Let be the window of days centered at for the first observation year, say 1943. With , the KDE formula reads
where is a Gaussian kernel function and is a bandwidth. Estimates were obtained from the density R command with bandwidth selected according to Sheather and Jones (1991).
Comments
There are no comments yet.