Log In Sign Up

A Model-Based General Alternative to the Standardised Precipitation Index

In this paper, we introduce two new model-based versions of the widely-used standardized precipitation index (SPI) for detecting and quantifying the magnitude of extreme hydro-climatic events. Our analytical approach is based on generalized additive models for location, scale and shape (GAMLSS), which helps as to overcome some limitations of the SPI. We compare our model-based standardised indices (MBSIs) with the SPI using precipitation data collected between January 2004 - December 2013 (522 weeks) in Caapiranga, a road-less municipality of Amazonas State. As a result, it is shown that the MBSI-1 is an index with similar properties to the SPI, but with improved methodology. In comparison to the SPI, our MBSI-1 index allows for the use of different zero-augmented distributions, it works with more flexible time-scales, can be applied to shorter records of data and also takes into account temporal dependencies in known seasonal behaviours. Our approach is implemented in an R package, mbsi, available from Github.


page 12

page 14

page 15

page 16


On some new neighbourhood degree based indices

In this paper, four novel topological indices named as neighbourhood ver...

A New Index for Clustering Evaluation Based on Density Estimation

A new index for internal evaluation of clustering is introduced. The ind...

Characterization and Development of Average Silhouette Width Clustering

The purpose of this paper is to introduced a new clustering methodology....

The minimum value of the Colless index

The Colless index is one of the oldest and most widely used balance indi...

Non-Stationary Large-Scale Statistics of Precipitation Extremes in Central Europe

Extreme precipitation shows non-stationary behavior over time, but also ...

A Bayesian Hierarchical Time Series Model for Reconstructing Hydroclimate from Multiple Proxies

We propose a Bayesian model which produces probabilistic reconstructions...

1 Introduction

Mitigating the effects of climate change on health and disease is one of the greatest challenges to public health and international development (McMichael, 2013; Watts et al., 2015). One of the main characteristics of the burden of climate change is the expected increase in the frequency, intensity and duration of extreme climate events (Houghton et al., 2001; Rosenzweig et al., 2001). These are events experiencing extreme values of meteorological variables, they often cause damage and are defined as either taking maximum values or exceeding established high thresholds (Stephenson, 2008). In this paper, our focus is on floods and droughts, which are considered extreme hydro-climatic events because they are related to the tails of streamflow distribution (Shelton, 2009).

The impact of extreme hydro-climatic events is not straightforward to understand because they comprise a complex web of direct and indirect impacts on environmental, economic and social areas (Blanka et al., 2017). Floods and droughts, depending on their severity, can produce not only crucial damage to the economy and ecology of a region, but also lives can be endangered (Lehner et al., 2006). Agriculture and associated sectors are highly dependent on surface and ground water; hence, it is common to see major impacts of droughts and floods on these areas (Blanka et al., 2017). The impact of extreme hydro-climatic events will also crucially depend on specific characteristics of the society affected like their vulnerability, adaptive capacity and resilience (Seiler et al., 2002; World Meteorological Organization (WMO) and Global Water Partnership (GWP), 2016). Hence, focus should be put on vulnerable societies that are prone to experience extreme hydro-climatic events; for example, on roadless urban centers of the Brazilian Amazonia, where the population is experiencing droughts and floods without precedent (see Zeng et al., 2008; Chen et al., 2010; Filizola et al., 2014; Lewis et al., 2011).

In this context, the importance of being able to identify extreme hydro-climatic events is due to two main reasons. First, it can help to improve the understanding of the effects of floods and droughts by allowing the analysis of extreme hydro-climatic events with respect to different variables or indicators of interest in health, economy or others. For example, Chacón-Montalván et al. (2018) evaluates the effects of these events on newborn health measured through birthweight. Second, the methodology for the identification of extreme events can help to improve monitoring and prediction tools and, potentially, enhancing prevention policies to reduce the impacts of floods and droughts.

There are a large number of indices and indicators for monitoring droughts. World Meteorological Organization (WMO) and Global Water Partnership (GWP) (2016)

presented 49 indicators and indices classified among the categories meteorology, soil moisture, hydrology, remote sensing, and composite or modelled. Between these indices, the most common are the standardized precipitation index (SPI), the Palmer drought severity index (PDSI), the crop moisture index, the surface water supply index and the vegetation condition index

(Mishra and Singh, 2010). Comparisons between these indices, often, agree that the standardized precipitation index is an appealing index for monitoring droughts because of its simplicity, spatial invariance, probabilistic nature and its flexibility to work with different time-scales (Guttman, 1999; Hayes et al., 1999; Morid et al., 2006; Mishra and Singh, 2010). In addition, the World Meteorological Organization has suggested to use the SPI as a primary meteorological drought index through Hayes et al. (2011) and a user guide for this index has been released in World Metereological Organization (2012).

In the case of flood monitoring, most of studies focusses on more than one indicator given that it is not only related with rainfall, but also with river levels, river discharge and geomorphology. In comparison with the case of droughts, there is not much consensus in which indices or information to use for monitoring floods. Koriche and Rientjes (2016), for example, used rainfall and topography to propose a satellite based index, while Ban et al. (2017) used satellite-based RGB composite imagery. Other approaches applied sensor networks or information from hydrological stations (Keoduangsine and Goodwin, 2012). Despite this variability of methodologies, several studies are recognizing the potential value of the SPI as a tool for flood monitoring. For instance, Wang et al. (2017) demonstrated that the 2-month SPI is an effective indicator for identifying major floods events in the Minjiang River basin. Similarly, Seiler et al. (2002); Guerreiro et al. (2008); Du et al. (2013); Koriche and Rientjes (2016) have used the SPI for flood predicting systems.

Motivated by the desire to evaluate the impacts of extreme hydro-climatic events on birthweight in the Brazilian Amazon, (see Chacón-Montalván et al., 2018), our research initially explored the use of the widely applied standardized precipitation index (SPI). However, although this index has been suggested as the primary meteorological drought index by the World Meteorological Organization and has been shown to be useful for identifying and monitoring droughts and floods, the current methodology for computing it has certain limitations that will be explained in Section 2.3. For instance, the SPI can not be computed reliably for series shorter than 30 years. For this reason, we propose two model-based approaches that maintain the desirable characteristics of the SPI but with improved computation and methodology.

Our model-based standardized indices (herein, MBSIs) overcomes some of the limitations of the SPI by using generalized additive models for location, scale and shape (GAMLSS). These models are flexible enough to capture the seasonal trend on the parameters of the distribution of rainfall or precipitation data. Our methodology differs form other attempts to improve the SPI by proposing a model-based approach instead of proposing a group of empirical steps to compute the SPI such as presented in Erhardt and Czado (2017). In addition, the MBSIs could be applied to other environmental variables of interest, other than precipitation, by choosing an appropriate family of distributions.

This paper is structured as follows. An introduction explaining the motivation for an alternative to the SPI is given in the present section. Then, the definition and limitations of the SPI are presented in Section 2. In Section 3, we provide a short introduction to generalized additive models for location, scale and shape (GAMLSS). In Section 4, two model-based approaches to compute the standardized precipitation index are proposed to tackle some of these limitations and make it possible the use of a theoretically similar index on our study for birthweight (see Chacón-Montalván et al., 2018). After presenting the MBSIs, in Section 5, we compare the SPI and MBSIs using precipitation data collected between January 2004 - December 2013 in Caapiranga, a road-less municipality in the Amazonas State. Finally, conclusions and a discussion of the performance of our method is given in Section 6.

2 Standardised Precipitation Index

The SPI is an index that was proposed by Mckee et al. (1993) to improve drought detection and monitoring capabilities using statistical concepts. The main characteristics of this index is simplicity, spatial invariance, probabilistic nature and flexibility to work with different time-scales (Guttman, 1999). This last characteristic allows monitoring of different types of droughts like agricultural (short time-scale) and hydrological (long time-scale) (Mckee et al., 1993).

Therefore, to compute the SPI, it is necessary to choose a time-scale over which to smooth the original precipitation data; this smoothing enables the method to detect extreme events that occur over a period of continuous time. The computation of the SPI continues by evaluating the cumulative distribution function for a particular value of the smoothed precipitation, taking in consideration the seasonality, and mapping this probability to the corresponding quantile of a standard normal distribution; which is the main idea behind the SPI to quantify how extreme are the observed values with respect to the usual seasonal behaviour. The resulting series of values are interpretable in the usual manner: as quantiles from a standard normal distribution. For example, an SPI value of 2 indicates that the probability of observing an event at least as extreme as this is 0.0228. In the next sections, we describe the computation of the SPI with further detail (Section

2.1), present the approach to monitor floods and droughts using the SPI (Section 2.2), and discuss some limitations of the SPI (Section 2.3).

2.1 Definition of the SPI

In this section, we outline the methodology of Mckee et al. (1993) for computing the SPI for a monthly time series of precipitation, represented as a discrete-time stochastic process, . Throughout this section we will refer to as the ‘monthly precipitation’, but the reader should bare in mind that we intend to be thought of in more general terms because the methodology can, in theory, be easily applied to other variables such as river levels, river discharge, etc.

We begin by defining as the -order moving average process of such as


i.e. is the average of the observed precipitation of the last months, inclusive of the present month . In the literature of drought indices, is referred to as the ‘time-scale’ under study. The ability to define prior to analysis is considered one of the appealing characteristics of the SPI (Guttman, 1998).

Rather than employing formal statistical methods for selecting , the choice of is determined by the time-scale under consideration by the researcher. For example, if one is interested in detecting droughts that occur over long periods of time (e.g. during a year), then might be chosen; similarly for analysing quarterly droughts might be more appropriate. The choice of time-scale can be related to the particular type of drought impact of interest. Different values of shift the focus of an analysis to different types of extreme events; this is important given that the lack of water in the short, medium or long-term affects different sections of human society and the surrounding ecosystem in different ways (e.g agricultural or hydrological effects) (Mckee et al., 1993). In the interest of disaster prevention, or planning a humanitarian response to a drought, the actions taken will be different for droughts at different time scales. For instance, events occurring on a short time-scale may be important to agricultural decisions whereas events on longer time-scales may be of more relevance for the management of water supplies (Guttman, 1998, 1999).

To continue with the definition of the SPI, it is beneficial to switch notation for the subscript , replacing and by respectively and , where is the year and is the month under study. We next introduce a statistical model for , i.e. a parametric density function, , where is an arbitrary value on the domain of . Notice that the notation implies that the characteristics of the density function change according to the month of the year, i.e. it has a seasonal behaviour. In the original article, Mckee et al. (1993) suggested a gamma density for , but current practice instead makes use of a mixture, a zero-augmented gamma density (ZAGA), which allows take zero values (Lloyd-Hughes and Saunders, 2002).

Define , the probability that the smoothed precipitation is zero on the month , and let the density function of for be , a gamma density with parameters evaluated at . Thus the density function of the moving average process is a zero-augmented gamma density defined as


where is an indicator function. Hence, the cumulative distribution function of is



denotes the distribution function for a gamma random variable with parameters


A key point we will revisit in the sequel is that the parameters and in Equations 2 and 3 vary from month to month, but not between years, so they are able to capture annual seasonal behaviours. The methodology of Mckee et al. (1993) thus partitions into twelve independent series of the form for

. Parameter estimation for each month,

and , is done independently by fitting a realisation of , i.e. , to the zero-augmented gamma density in Equation 2.

Values of the standardized precipitation index (SPI) are then obtained by computing the quantiles for a standard normal density with probabilities . As mentioned before, the interpretation of obtained SPI values are as the one of a standard normal distribution, e.g. values greater than or lower than can be considered extreme values, while values close to zero are likely to happen.

Provided fits the data well, for each , the probability integral transform implies we should expect the collection to follow a standard uniform density; the back-transform using the inverse cumulative distribution function of a standard Gaussian is therefore redundant, beyond relating to quantiles of a density commonly used in statistical practice.

Hence, the proposed method of Mckee et al. (1993) to compute the SPI can be summarized as:

  1. [1), nolistsep]

  2. Define the time-scale to work with (e.g. 1 month, 3 months, etc).

  3. Compute the -order moving average series using all the precipitation time series .

  4. Split the moving average series into months to obtain , , , .

  5. For each month , obtain the estimates and by fitting the realization of , i.e. , to the density function on Equation 2. Maximum likelihood estimation can be used for this step.

  6. Evaluate the cumulative density function to the observed values of the moving average process to obtain the collection .

  7. Obtain the values for the SPI by computing the quantiles of a standard normal distribution with probabilities .

2.2 Flood and Drought Monitoring

For drought monitoring, Mckee et al. (1993) defined an episode of drought as a period of time in which the SPI is continuously negative reaching at least one value lower or equal to . Then, it is said that the beginning of the drought is the first time that the SPI falls below zero and it finishes when a positive SPI is reached after observing a value lower or equal to (Mckee et al., 1993). Similarly, a flood can be defined as a period of time where the SPI is continuously positive reaching at least one value greater or equal to . Further characteristics of these events, such as magnitude and intensity, can be computed to improve drought monitoring. For example, the magnitude has been defined as the absolute value of the sum of the SPI during the period of the drought/flood, while the intensity can be classify as shown in table 1 (Mckee et al., 1993; Wang et al., 2017).

Category Value
extreme flood
severe flood
moderate flood
near normal
moderate drought
severe drought
extreme drought
Table 1: Intensity of droughts and floods based on the SPI

2.3 Limitations of the SPI

The standardised precipitation index has the following main limitations (Lim):

  1. [(Lim 1), ref = Lim 0, nolistsep, leftmargin=*]

  2. The zero-augmented gamma distribution might not be a good fit for the precipitation data:

    Although in most practical applications the zero-augmented gamma distribution has been observed to be a good choice for precipitation data, there have been cases where it has been found to be inadequate (Guttman, 1999; Mishra and Singh, 2010). While it might be straightforward in theory to extend the standard SPI model to include other distributional choices for , it would nevertheless be useful if the methodology itself was more flexible in this regard.

  3. The time-scale is based on months: Theoretically there is no impediment to work with a time-scale other than months, but most published studies do not do this. Additionally, the official SPI user guide recommends working with a time-scale of at least 4 weeks (1 month) stating that lower values will make the SPI behave more erratically (World Metereological Organization, 2012). It would be desirable to develop an index that is flexible enough to allow the use of shorter and more arbitrary time-scales.

  4. It requires a long record of precipitation: In order to compute the SPI, it is recommended that at least 30 years of precipitation records are available, and ideally between 50 and 60 years (Piratheeparajah N and Raveendran S, 2014). The reason for this is the splitting of the complete moving average series into 12 independent subsets corresponding to each month of the year. Each of these twelve subsets has length equal to the number of years under study, therefore small values of may not provide reliable estimates of and . This problem is related to the fact that subsets of data are handled independently.

  5. It ignores the temporal correlation and the cyclic nature of , and hence in , (i.e. we would expect to be correlated with ):

    It is natural to observe a correlated and cyclic behaviour on precipitation data and the parameters associated with the density function; however, the SPI does not take this into account. This could affect parameter estimation because an outlier presented in certain month could drastically affect the estimated value of the parameters for that month only. This way the parameters will not vary smoothly across neighbouring months, which is both an undesirable property, but also affects the reliability of SPI values. In neglecting the temporal correlation inherent in time series such as precipitation, the SPI does not take advantage of the fact that time is a continuous variable and the the continuous sharing of information across time should improve parameter estimation and allow us to work with shorter time series (which is related to


3 Generalized Additive Models for Location, Scale and Shape

In this paper we suggest the use of generalized additive models for location, scale and shape (GAMLSS) to tackle the limitations of the SPI presented in Section 2.3. We briefly introduce this type of model in the present section.

A generalized additive model (GAM) is an extension of an generalized linear model (GLM) that allows for the inclusion of smooth functions of covariates into the linear predictor (Rigby and Stasinopoulos, 2005; Umlauf et al., 2017)

and thus they allow complex relationships between predictors and outcomes to be captured. The smooth functions are defined as linear combinations of basis functions, the most common being cubic regression splines, P-splines, thin plate regression splines and tensor product splines

(Wood, 2006).

A GAMLSS is an extension of a GAM where, in addition to the location parameter, the scale and shape parameter are also modeled with respect to covariates. More formally, assuming a response variable

with probability density function

, each parameter for is associated to a linear predictor through a monotonic link function such as


where represents the fixed effects associated to the covariates for an individual , and represent functions able to capture a wide variety of effects with corresponding parameters and covariates . Note that can not only represent smooth functions of a covariate, but also smooth functions of multiple covariates or varying effects. Considering as a smooth function, can be used to represent: a smooth effect , varying coefficient , a smooth multiple effect , a random intercept , a random slope , a spatial effect , a temporal effect , a space-time effect , and others such as seasonal effects (Umlauf et al., 2017).

More generally, for a set of observations

, parameter vector

and linear predictor vector , we can rewrite Equation 4 in matrix form as


such as represents the design matrix with fixed effects and is the design matrix required to construct the effect with parameters . The structure of will depend on the type of effects that are desired to be captured by as well as the type of covariates involved. Although Umlauf et al. (2017) allows to take more complex structures, the most common type of effects and the ones included in this study take the form .

Estimation usually proceeds using a penalised likelihood approach (Rigby and Stasinopoulos, 2005; Wood, 2006), or a Bayesian approach (Umlauf et al., 2017).

4 A Model-Based Method for Evaluating Extreme Hydro-Climatic Events

Having discussed some of the shortcomings of the SPI, in this section we propose two alternatives to the SPI: these will be model-based approaches which we refer to as the model-based standardised indices (MBSIs: MBSI-1 and MBSI-2); we argue that our indices retain the desirable characteristics of the SPI, but improve the methodology. Our model-based standardised indices are more stable, flexible and satisfying (from a modelling perspective) than the SPI as explained in Section 6.

Although there have been attempts to improve the methodology of the SPI (e.g. Erhardt and Czado (2017); World Meteorological Organization (WMO) and Global Water Partnership (GWP) (2016)) our method differs because we use a model-based approach, which accounts for the characteristics required to compute a standardized index. In contrast, Erhardt and Czado (2017)

proposed a group of steps to compute the SPI including; elimination of seasonality (including variable transformation to reduce skewness, computation of monthly sample and variance mean) and elimination of temporal dependence and transformation to the standard normal distribution. Our model-based approach allows us to not only compute the standardized precipitation index appropriately, but also enables us to work with short time series, check assumptions, work at different scales (e.g. weeks), work with missing values and provide further relevant information about the underlying process under study, such as precipitation.

In Section 4.1 and 4.2 respectively, we introduce the model-based standardised index (MBSI-1) and (MBSI-2), which address the limitations of the SPI discussed above using generalized additive models for location, scale and shape (GAMLSS). In addition, we discuss some limitations of using GAMLSS in Section 4.3.

4.1 Model-based Standardized Index 1 (MBSI-1)

In Section 2.1, we saw that the SPI is defined for the moving average process of a discrete stochastic process , where denoted the year and the month. The MBSI-1 instead uses the initial notation of Equation 1, i.e. we work directly with and as the precipitation and moving average process respectively. Note that we are assuming that and are on the same scale, which can be an arbitrary one such as daily, weekly, monthly, etc.

For the MBSI-1, we again define the density function of each element of the stochastic process as a mixture such as


where is an arbitrary value on the domain of , while and are the parameters associated with the mixture density at time .

The density function can be any distribution defined on the positive real numbers that is adequate for characterizing the moving average precipitation. In this paper, we complete the definition of by using a gamma density for with parameters , defined as follows


However, note that our approach is not limited to this distribution, and a different choice of may be more suitable in other situations. One consequence of assuming a gamma density is that the mean and variance of are and respectively.

As mentioned earlier, the SPI tries to quantify the extremity of levels of precipitation by comparing it with the usual seasonal behaviour. For this reason, our approach captures the seasonal behaviour in all the parameters by introducing models for , and , as in Equation 4, using linear predictors , and such as


where , and are (optional) design matrices that include information for predicting the process with linear effects , and ; and , and are the parameters required to define the flexible non-linear functions , and that capture the seasonal effects on , and respectively. A common choice for these functions in the generalised additive modelling literature is to represent them using cyclic cubic splines (Wood, 2006). An alternative to cyclic cubic splines is to use harmonic terms to represent seasonal effects. However, our experience of harmonic models in this context tends to overfit the data and using stepwise selection to reduce the number of harmonic terms can be a computationally slow process.

Our model, defined with Equations 6, 7 and 8 is a generalized additive model for location, scale and shape (GAMLSS), as explained in Section 3, using a zero-augmented gamma likelihood (ZAGA). Inference can be achieved using standard methods: backfitting or MCMC (Rigby and Stasinopoulos, 2005; Umlauf et al., 2017).

Another option for modelling serial dependence in the parameter vector would be to assume a latent, possibly multivariate, Gaussian process or a moving average process for , and . We have not explored these options, but they fit into the class of latent Gaussian models, for which there are a range of model fitting options, including INLA, MCMC and particle filtering, if not off-the-shelf software solutions to implement them.

Once we have estimated the parameters in our models, we can predict , and for any time and proceed with the computation of the MBSI-1 using steps 5 and 6 of Section 2.1. Hence, the computation of standardised precipitation values using MBSI-1 can be summarized with the following steps:

  1. [1), nolistsep]

  2. Define the time-scale to work with (e.g. 1 week, 4 weeks, 8 weeks, etc).

  3. Compute the -order moving average series using all the precipitation time series .

  4. Obtain the parameters estimates , , , , and by fitting the moving average series to the GAMLSS model with zero-augmented gamma distribution (Equations 6 and 7) and linear predictors defined in Equation 8.

  5. With the parameters estimated in the previous step (, , , , and ), obtain the estimates and , using Equation 8, for .

  6. Evaluate the cumulative density function of the observed values of the moving average process to obtain the collection .

  7. Obtain the values for the SPI by computing the quantiles of a standard normal distribution with probabilities .

4.2 Model-based Standardized Index 2 (MBSI-2)

One disadvantage of the MBSI-1 is that it requires to fit a model to the moving average process for every scale-time of interest . As an alternative to the MBSI-1, we propose a second approach under which the model fitting in only done once, for . We will refer to this approach as the model-based standardised index 2 (MBSI-2).

For this second approach, instead of imposing a model on the elements of the moving average process , we propose a model for the original stochastic process that represents the precipitation. Specifically, we assume that has a zero-augmented gamma distribution, which is defined by Equations 6 and 7, and the parameters are modelled considering a seasonal behaviour as in Equation 8. As a consequence, we can obtain the distribution of moving average for each . Unfortunately, we can not obtain the analytical expression of the resulting distribution of . However, we can use Monte Carlo methods to obtain the cumulative distribution function evaluated on the observed values of the moving average process , obtaining . Once these probabilities are obtained, we can finally obtain the quantiles of a standard normal distribution associated to these probabilities .

Hence, the computation of the MBSI-2 can be summarized as follows:

  1. [1), nolistsep]

  2. Obtain the parameters estimates , , , , and by fitting the original precipitation series to the GAMLSS model with zero-augmented gamma distribution (Equations 6 and 7) and linear predictors defined in Equation 8.

  3. With the parameters estimated in the previous step (, , , , and ), obtain the estimates and , using Equation 8, for .

  4. Obtain realizations , where , of the precipitation stochastic process using and for a zero-augmented gamma distribution (Equations 6 and 7).

  5. Define the time-scale to work with (e.g. 1 week, 4 weeks, 8 weeks, etc).

  6. Compute the -order moving average series of the precipitation time series and the -order moving average series of the samples .

  7. Evaluate the cumulative density function of the observed values of the moving average process to obtain the collection considering that

  8. Obtain the values for the SPI by computing the quantiles of a standard normal distribution with probabilities .

4.3 Limitations of GAMLSS

Although generalized additive models are attractive, they have some limitations that are worth exploring. Firstly, there can be a tendancy to overfitting the data, for example, it is known that the generalized cross-validation criterion used to estimate the smoothing parameter can lead to overfitting; however, this is less likely with a large number of observations and when the values across covariates are very well distributed (Wood, 2006). This problem can worsen when modelling in addition the scale and shape parameters because the model is much more flexible and appropriate precaution should be exercised on small sample sizes.

Another limitation is that prediction outside the range of values observed on the covariates might not be reliable because usually few observations with extreme values in the covariates are observed. Given that the model is very flexible, it will try to adapt to these values. In this way, prediction at the tails of the covariates may vary significantly from one sample to another, indicating that the model has high variance in the tails of covariates. Nevertheless, extrapolation is also problematic in other types of models.

Finally, the interpretability of GAM models is not as easy for GLM models and it is required to visualize the effects in order to understand and interpret them. Despite this, we view the visualization process as actually provide useful information on the effects at different levels. Also, when using credible intervals, insight into the significance of each term is obtained.

In conclusion, GAM and GAMLSS are attractive models, but they should be used with precaution given that their inherent flexibility.

5 Comparison Between the SPI and MBSI

In order to illustrate differences between the SPI, MBSI-1 and MBSI-2, we compare parameter estimation, model checking and the resulting standardized precipitation values for different time-scales using data collected between January - December (522 weeks) in Caapiranga, a road-less municipality in Amazonas State.

We use our R package mbsi, created to analyse and visualise extreme events, available from Github, It contains the implementation of the SPI, MBSI-1 and MBSI-2 indices used in this section.

5.1 Parameter Estimation

In this section we compare the estimated mean and coverage interval of the moving average rainfall obtained with the estimated parameters using both the SPI and the MBSI methodologies (Fig. 1). Given the density function defined in Equation (6) with Gamma density , the estimated mean for time is and the coverage interval for a time is obtained by computing the and quantiles of the estimated density function .

Figure 1: Precipitation moving average and coverage interval obtained by the SPI, MBSI-1 and MBSI-2 methodologies for different time-scales (, , and weeks)

We can see in Figure 1 that at a time-scale of 1 week, the mean and coverage interval change quickly for the classical SPI, whereas they change smoothly for the MBSIs. This is an indication that the SPI overfitted the observed precipitation data. Another characteristic of the SPI at this shorter time-scale is that parameter estimation is strongly affected by extreme short-term values. The coverage interval is highly influenced by these extreme values leading sometimes to much wider coverage intervals (e.g. due to some observations around 2005). This can reduce the ability of the SPI to detect extreme events, e.g. it can be seen in Figure 1 that there are more values lying outside the coverage intervals for the MBSIs. Both characteristics happen because parameters in the SPI are independent among months, while the MBSIs explicitly model this dependence using smooth functions.

As the time-scale increases, the difference between the estimated mean and coverage intervals methods decrease, but the coverage intervals are still wider and looser for the SPI.

5.2 Model Checking

Provided the assumed density function fits the data well, the probability integral transform implies we should expect the collection of the empirical cumulative density values, , to follow a standard uniform density. If this does not hold, then the interpretation as a standard normal distribution of the standardized values is misleading since the back-transformed data will not be normally distributed. By inspecting Figure 2, we can see that the uniform assumption holds for the three approaches for weeks, while for weeks, it seems that the MBSIs are more adequate. In general, there is no indication of drastic inadequacies for any of the methodologies.

Figure 2: Distribution of the empirical cumulative density function for the SPI, MBSI-1 and MBSI-2 methodologies for different time-scales (, , and

weeks). It should have a uniform distribution when the underlying model is adequate.

Figure 3: Comparison between the empirical quantiles (standardized precipitation values) and theoretical quantiles of a standard normal distribution for the SPI, MBSI-1 and MBSI-2 methodologies for different time-scales (, , and weeks). The points should be close to the identity line (straight line) to hold the assumption of normality.

If the uniformity assumption holds, then under the probability integral transform theorem, the obtained standardized precipitation values should follow a standard normal distribution, which can be checked by comparing the empirical quantiles with the theoretical quantiles of a standard normal distribution as shown in Figure 3. Although, we can see in Figure 3 that there are some small deviations from the identity line for the MBSI-1 and MBSI-2 at small scales, the points lie close to the identity line for the three methodologies and the four time-scales. Something to notice is that the SPI methodology tends to limit the standardized values between and for this data of observations, while we obtain more extreme standardized values with the MBSIs, something highlighted even more for the MBSI-2.

5.3 Standardized Precipitation Values

The general trend of the standardized precipitation values obtained by three methodologies are similar; however, the actual standardized values corresponding to the identified events, using Section 2.2, differs (Figure 4). For example, at the time scale of 1 week, most of the identified droughts have, clearly, greater absolute standardized values when working with the MBSIs. We can also see that the number of identified events varies between the methods. For instance, more droughts are identified with the MBSIs when selecting a threshold of for a time-scale of weeks. Another difference among the methods is that the MBSI-2 tends to intensify more the extreme events. For example, it can be seen that, for time-scales of and weeks, the levels of the standardized precipitation for and are more extremes for the MBSI-2 than the SPI and MBSI-1.

Figure 4: Standardized precipitation values and identification of extreme hydroclimatic events at different time-scales using the SPI and MBSI: the threshold to be considered extreme event was

Amazonas State experienced a well-documented major flood in 2009 and large-scale severe drought in 2010 (Chen et al., 2010; Lewis et al., 2011). The two events are highlighted at 8 and 12 weeks time-scales, but they are more emphasized when using the MBSI-1. For this reason and because it holds properties quite similar to SPI improving the methodology, we preferred to use the MBSI-1 for further studies on cities of the Brazilian Amazonia. However, we encourage the development of indices like the MBSI-2 where the model is imposed on the original process under study and analyse another process of interest (such as the moving average process) that depends on the original one, using theoretical properties derived from the initial model. This avoids the need to re-fit the model at different time-scales of potential interest.

6 Discussion and Conclusions

We compared the SPI with two proposed approaches MBSI-1 and MBSI-2 to obtain standardized precipitation values. It has been seen that the three approaches are adequate in terms of model assumptions; however, we found some differences that leaded as to select the MBSI-1 to be used in our studies conducted in the Brazilian Amazonia. Our results clearly demonstrate that the methodology of the SPI can be adapted and placed in a modelling framework that can resolve some of the disadvantages of this index.

  • Because we use the GAMLSS framework, several distributions can be easily applied to compute standardized precipitation values and the diagnostic of the GAMLSS framework can be used to test model adequacy. Alternatively, it is suggested to evaluate the adequacy of the method by checking the property of the probability integral transform.

  • The definition of time-scale is more general in the MBSI-1 and so with this model, it is not necessary to work on the monthly scale. In addition, the observed series of precipitation data (or other quantity of interest e.g. river levels) could have missing values or it might be observed at irregular intervals.

  • By borrowing strength from temporal autocorrelation and seasonal patterns, the MBSI-1 can compute standardized precipitation values using a shorter length of records, i.e. less then 30 years, while the SPI usually requires a longer series or a wider time-scale to avoid overfitting.

  • The MBSI-1 is a temporally continuous model for precipitation and as such, parameters in the model change more naturally (i.e. smoothly) over time. In addition, the MBSI-1 could be extended to evaluate extreme events, assume trends over the time, or to incorporate spatial effects.


  • Ban et al. (2017) Ban, H. J., Kwon, Y. J., Shin, H., Ryu, H. S., and Hong, S. (2017). Flood monitoring using satellite-based RGB composite imagery and refractive index retrieval in visible and near-infrared bands. Remote Sensing, 9(4).
  • Blanka et al. (2017) Blanka, V., Ladányi, Z., Szilassi, P., Sipos, G., Rácz, A., and Szatmári, J. (2017). Public Perception on Hydro-Climatic Extremes and Water Management Related to Environmental Exposure, SE Hungary. Water Resources Management, 31(5):1619–1634.
  • Chacón-Montalván et al. (2018) Chacón-Montalván, E. A., Parry, L., Torres, P., Orellana, J., Davies, G., and Taylor, B. M. (2018). Evaluating the Effects of Extreme Hydro-climatic Events on Birth-weight in Amazonia.
  • Chen et al. (2010) Chen, J. L., Wilson, C. R., and Tapley, B. D. (2010). The 2009 exceptional Amazon flood and interannual terrestrial water storage change observed by GRACE. Water Resources Research, 46(12):1–10.
  • Du et al. (2013) Du, J., Fang, J., Xu, W., and Shi, P. (2013). Analysis of dry/wet conditions using the standardized precipitation index and its potential usefulness for drought/flood monitoring in Hunan Province, China. Stochastic Environmental Research and Risk Assessment, 27(2):377–387.
  • Erhardt and Czado (2017) Erhardt, T. M. and Czado, C. (2017). Standardized drought indices: a novel univariate and multivariate approach. Journal of the Royal Statistical Society: Series C (Applied Statistics).
  • Filizola et al. (2014) Filizola, N., Latrubesse, E. M., Fraizy, P., Souza, R., Guimarães, V., and Guyot, J. L. (2014). Was the 2009 flood the most hazardous or the largest ever recorded in the Amazon? Geomorphology, 215:99–105.
  • Guerreiro et al. (2008) Guerreiro, M. J., Lajinha, T., and Abreu, I. (2008). Flood Analysis with the Standardized Precipitation Index (SPI). Revista da Faculdade de Ciênca e Tecnologia. Porto, 4:8–14.
  • Guttman (1998) Guttman, N. B. (1998). Comparing the Palmer drought index and the standardized precipitation index. Journal Of The American Water Resources Association, 34(1):113–121.
  • Guttman (1999) Guttman, N. B. (1999). Accepting the Standardized Precipitation Index: a Calculation Algorithm1. JAWRA Journal of the American Water Resources Association, 35(2):311–322.
  • Hayes et al. (2011) Hayes, M., Svoboda, M., Wall, N., and Widhalm, M. (2011). The Lincoln Declaration on Drought Indices: Universal Meteorological Drought Index Recommended. Bulletin of the American Meteorological Society, 92(4):485–488.
  • Hayes et al. (1999) Hayes, M. J., Svoboda, M. D., Wilhite, D. A., and Vanyarkho, O. V. (1999). Monitoring the 1996 Drought Using the Standardized Precipitation Index. Bulletin of the American Meteorological Society, 80(3):429–438.
  • Houghton et al. (2001) Houghton, J. T., Y, D., DJ, G., M, N., PJ, v. d. L., X, D., K, M., and C, J. (2001). Climate Change 2001: The Scientific Basis. Climate Change 2001: The Scientific Basis, 57(8):881.
  • Keoduangsine and Goodwin (2012) Keoduangsine, S. and Goodwin, R. (2012). An Appropriate Flood Warning System in the Context of Developing Countries. International Journal of Innovation, Management and Technology, 3(3):213.
  • Koriche and Rientjes (2016) Koriche, S. A. and Rientjes, T. H. M. (2016). Application of satellite products and hydrological modelling for flood early warning. Physics and Chemistry of the Earth, 93:12–23.
  • Lehner et al. (2006) Lehner, B., Döll, P., Alcamo, J., Henrichs, T., and Kaspar, F. (2006). Estimating the impact of global change on flood and drought risks in Europe: A continental, integrated analysis. Climatic Change, 75(3):273–299.
  • Lewis et al. (2011) Lewis, S. L., Brando, P. M., Phillips, O. L., van der Heijden, G. M. F., and Nepstad, D. (2011). The 2010 Amazon Drought. Science, 331(6017):554–554.
  • Lloyd-Hughes and Saunders (2002) Lloyd-Hughes, B. and Saunders, M. A. (2002). A drought climatology for Europe. International Journal of Climatology, 22(13):1571–1592.
  • Mckee et al. (1993) Mckee, T. B., Doesken, N. J., and Kleist, J. (1993). The relationship of drought frequency and duration to time scales. AMS 8th Conference on Applied Climatology, (January):179–184.
  • McMichael (2013) McMichael, A. J. (2013). Globalization, Climate Change, and Human Health. New England Journal of Medicine, 368(14):1335–1343.
  • Mishra and Singh (2010) Mishra, A. K. and Singh, V. P. (2010). A review of drought concepts. Journal of Hydrology, 391(1-2):202–216.
  • Morid et al. (2006) Morid, S., Smakhtin, V., and Moghaddasi, M. (2006). Comparison of seven meteorological indices for drought monitoring in Iran. International Journal of Climatology, 26(7):971–985.
  • Piratheeparajah N and Raveendran S (2014) Piratheeparajah N and Raveendran S (2014). Spatial Variations of the Flood and Drought in the Northern Region of Sri Lanka. International Research Journal of Earth Sciences, 2(6):1–10.
  • Rigby and Stasinopoulos (2005) Rigby, R. A. and Stasinopoulos, D. M. (2005). Generalized additive models for location, scale and shape (with discussion). Journal of the Royal Statistical Society: Series C (Applied Statistics), 54(3):507–554.
  • Rosenzweig et al. (2001) Rosenzweig, C., Iglesias, A., Yang, X. B., Epstein, P. R., and Chivian, E. (2001). Climate change and extreme weather events. Global change & human health, 2(2):90–104.
  • Seiler et al. (2002) Seiler, R. A., Hayes, M., and Bressan, L. (2002). Using the standardized precipitation index for flood risk monitoring. International Journal of Climatology, 22(11):1365–1376.
  • Shelton (2009) Shelton, M. (2009). Hydroclimatology: perspectives and applications. Cambridge University Press.
  • Stephenson (2008) Stephenson, D. B. (2008). Definition, diagnosis and origin of extreme weather and climate events. Climate Extremes and Society, page 340.
  • Umlauf et al. (2017) Umlauf, N., Klein, N., and Zeileis, A. (2017). BAMLSS : Bayesian additive models for location , scale and shape ( and beyond ). University of Innsbruck Working Papers in Economics and Statistics.
  • Wang et al. (2017) Wang, Y., Chen, X., Chen, Y., Liu, M., and Gao, L. (2017). Flood/drought event identification using an effective indicator based on the correlations between multiple time scales of the Standardized Precipitation Index and river discharge. Theoretical and Applied Climatology, 128(1-2):159–168.
  • Watts et al. (2015) Watts, N., Adger, W. N., Agnolucci, P., Blackstock, J., Byass, P., Cai, W., Chaytor, S., Colbourn, T., Collins, M., Cooper, A., Cox, P. M., Depledge, J., Drummond, P., Ekins, P., Galaz, V., Grace, D., Graham, H., Grubb, M., Haines, A., Hamilton, I., Hunter, A., Jiang, X., Li, M., Kelman, I., Liang, L., Lott, M., Lowe, R., Luo, Y., Mace, G., Maslin, M., Nilsson, M., Oreszczyn, T., Pye, S., Quinn, T., Svensdotter, M., Venevsky, S., Warner, K., Xu, B., Yang, J., Yin, Y., Yu, C., Zhang, Q., Gong, P., Montgomery, H., and Costello, A. (2015). Health and climate change: Policy responses to protect public health. The Lancet, 386(10006):1861–1914.
  • Wood (2006) Wood, S. S. (2006). Generalized Additive Models: An Introduction with R. CRC Press.
  • World Meteorological Organization (WMO) and Global Water Partnership (GWP) (2016) World Meteorological Organization (WMO) and Global Water Partnership (GWP) (2016). Handbook of drought indicators and indices. Geneva., IDMP.
  • World Metereological Organization (2012) World Metereological Organization (2012). Standardized Precipitation Index User Guide.
  • Zeng et al. (2008) Zeng, N., Yoon, J.-h., Marengo, J. A., Subramaniam, A., Nobre, C. A., Mariotti, A., and Neelin, J. D. (2008). Causes and impacts of the 2005 Amazon drought. Environmental Research Letters, 3(1):014002.