0.1 Motivation and Data Example
Surveillance data is nowadays collected in vast amounts to support the analysis and control of infectious diseases. As a consequence, the data volume, velocity and variety exceeds the resources to look at each report individually and thus statistical summaries, insightful visualizations and automation are needed (Höhle, 2017). In response to this, semiautomatic systems for the screening and further investigation of outbreaks have been developed. This screening typically consists of identifying emerging spikes in the monitored data streams and flagging particular cases for further inspection. For foodborne diseases, for example, this inspection could consist of attempts to identify and remove the food source consumed by all cases.
Our aim is thus to develop data mining tools to support the sequential decision making problem (to react or not) for pathogens with quite heterogeneous features, differing in e.g., data collection, prevalence and public health importance. As an example, even a single Ebola case constitutes a serious public health threat, whereas only larger clusters of Campylobacter jejuni infections will trigger public health actions. It is also worth to point out that large accumulations of cases in a short time period are almost surely noticed at the local level. Automatic procedures nonetheless provide a safety net ensuring that nothing important is missed, but the added value lies predominantly in the detection of dispersed crossregional outbreaks. Another added value is that the quantitative nature of the algorithms allows for a more objective approach, which can be a helpful addition to epidemiological intuition. However, alarms are useless if they are too frequent to be investigated properly. On the other hand, missing an important outbreak is also fatal.
0.1.1 Invasive Meningococcal Surveillance in Germany
As an illustration we use 20022008 data from the routine monitoring of invasive meningococcal disease (IMD) by the National Reference Centre for Meningococci (NRZM) in Germany^{1}^{1}1http://www.meningococcus.de as motivating example. Figure 1 shows the number of new cases of two particular finetypes described in more detail in Meyer et al. (2012) and available as dataset imdepi in the R package surveillance (Salmon et al., 2016b; Meyer et al., 2017).
From the figure and its context we observe a number features which make the statistical modelling and monitoring of surveillance time series a challenge. From the time series in the figure we see that we deal with a low integer number of reported cases, which can contain secular trends, seasonality, as well as past outbreaks. In the IMD example a microbiological investigation provides a pretty clear definition, however, having a clear case definition can otherwise be a challenge. Furthermore, it is common to only report positive test outcomes with no information on how many tests were actually performed. Spikes in the incidence of new cases might very well be due to an increased testing behaviour. This brings us to the important issues of underascertainment and underreporting often reflected as stages in a surveillance pyramid of which only the top stage are the reported cases. Another issue, which we shall ignore in this chapter, is the problem of reporting delays; depending on the aims it can be necessary to adjust the detection for this phenomena, c.f. the chapter written by Angela Noufaily. Instead, we move on to a statistical description and solution of the detection problem.
0.2 Univariate Surveillance Methods
Assume data have been preprocessed such that a univariate time series of counts is available. Denote by the time series of cases with the time index representing, e.g., daily, weekly or monthly time periods and being the observation under present consideration. Different statistical approaches can now be used to detect an outbreak at the last instance of the series. Such approaches have a long tradition and comprehensive review articles exist (Farrington and Andrews, 2003; Unkel et al., 2012; Sonesson and Bock, 2003; Le Strat, 2005; Woodall, 2006; Höhle and Mazick, 2011). An implementational description of many of these algorithms can be found in the R package surveillance (Salmon et al., 2016b). We therefore keep this account short and just present two of the most commonly used algorithms, in particular because their approach fits well into that of the multivariate approaches covered in Section 0.3. Both algorithms presented compute a prediction for based on a set of historic values, under the assumption that there is no outbreak at , and then assess how extreme the actually observed value is under this assumption.
0.2.1 EARS Algorithm
The Early Aberration Detection System (EARS) method of the CDC as described in Fricker et al. (2008) is a simple algorithm convenient in situations when little historic information is available and, hence, trends and seasonality are not of concern. In its simplest form, the baseline is formed by the last timepoints before the assessed timepoint , which is particularly meaningful when monitoring a time series of daily data. The simplest version of the method is based on the statistic , where
are the usual unbiased estimators for the mean and variance of the historic values. For the null hypothesis of no outbreak, it is assumed that
. The threshold for an extreme observation, under the assumption of no outbreak, is now defined as . Consequently, an alarm is raised if the current observation exceeds this upper limit.The simplicity of the method makes it attractive, and variants of this Gaussian approach can be found in different contexts, e.g., as an approximation when the underlying distribution is binomial (Andersson et al., 2014). However, from a statistical point of view the method has a number of shortcomings. In particular the distributional
assumption is likely to be inaccurate in case of counts below 510, because the distribution is discrete, nonnegative and hence rightskewed. As easy as the three times standard deviation is to remember, the appropriate comparison of
should be with the upper limit of a prediction interval. Assuming that the’s are identical and independent variables from a Gaussian distribution with both mean and standard deviation unknown, such a onesided
interval has upper limit(1) 
where denotes the quantile of the tdistribution with degrees of freedom. Note that for
larger than 2030, this quantile is very close to that of the standard normal distribution, which is why some accounts, e.g.
Farrington and Andrews (2003), use instead. However, when the corresponding to a multiplication factor of 3 is not , as one might think, but , wheredenotes the cumulative distribution function of the tdistribution with
degrees of freedom. In other words, using a factor of 3 means that the probability of false discoveries is notably higher than one might naively expect. From an epidemiologist’s point of view, this might seem as yet another instance of statistical nitpicking, however, in the next section we provide an example illustrating how misaligned the
rule can be in the case of few and low count data.The more historic values are available, the larger one can choose in practice. However, seasonality and secular trends then become an issue. A simple approach to handle seasonality suggested by Stroup et al. (1989)
is to just pick time points similar to the currently monitored time point. For example: When working with weekly data and looking at week 45 in year 2017 then one could take the values of, say, weeks 43 to 47 of the previous years as historic values. Another approach is to handle seasonality and trends explicitly in a linear regression model framework, e.g.,
where is the period, e.g., 52 for weekly data. This has the advantage that all historic values are used to infer what is expected. As an alternative to the above superposition of harmonics one could instead use (penalized) cyclic splines (Wood, 2006) or factor levels for the individual months or days.
0.2.2 Farrington Algorithm
An extension of the above approaches is the so called Farrington algorithm (Farrington et al., 1996; Noufaily et al., 2013), which explicitly uses an underlying count data distribution and handles possible trends through the use of an (overdispersed) Poisson regression framework. The algorithm is particularly popular at European public health institutions as its easy to operate and handles a large spectrum of time series with different characteristics without the need of particular tuning.
Assuming, as before, that one wants to predict the number of cases at time under the assumption of no outbreak by using a set of historic values from a window of size up to periods back in time. With weekly data and assuming 52 weeks in every year the set of historic values would be . We then fit an overdispersed Poisson generalized linear model (GLM) with and loglinear predictor
to the historic values. In the above, denotes the time points of the historic values and is the overdispersion parameter. If the dispersion parameter in the quasiPoisson is estimated to be smaller than one a Poisson model (i.e. ) is used instead. Based on the estimated GLM model we compute the upper of limit of a onesided prediction interval for by
(2) 
where , because the current observation is not used as part of the estimation. We know that asymptotically , where denotes the inverse of the observed Fisher information. Therefore, is normally distributed with variance . Through the use of the delta method we find that .
Figure 2 displays the fitted GLM and the limits of a corresponding twosided prediction interval for the first observation in 2008 when , and . Because the upper limit 21.1 of the prediction interval is larger than the observed value of 13 no alarm is sounded.
Note that the above is a somewhat simplified description of the Farrington procedure. For example, a trend is only included, if the following three conditions are all fulfilled: it is significant at the 0.05 level, there are at least 3 years of historic values and including it does not lead to an overextrapolation (i.e. is smaller than the maximum of the historic values). Additional refinements of the algorithm include the computation of the prediction interval in (2) on a or power transformation scale and then backtransform the result. This would for example fix the obvious problem in (2
) that the prediction includes negative values: using the 2/3power transformation the resulting upper limit would be 24.5 instead. If the historic values contain outliers in form of previous outbreaks the Farrington algorithm suggests to instead base the prediction on a reweighted second fit with weights based on Anscombe residuals.
Noufaily et al. (2013) suggest a number of additional improvements to the algorithm. Instead of using the window based approach to select the historic values, all past data are utilized in the regression model by adding a cyclic 11knot zeroorder spline consisting of a 7week historic period and nine 5week periods. Furthermore, better results seem to be obtained when computing the prediction interval directly on a negative binomial assumption, i.e. by assuming where the first argument denotes the mean and the second argument the dispersion parameter of the distribution. By plugin of the estimates and one obtains that thequantile of the negative binomial distribution in the example of Fig.
2 is 24.For comparison: had we taken the 7 observations corresponding to Farrington’s and as historic values for the EARS approach we get , , and an upper limit of 15.0. When using (1) with for the same historic values the upper limit is 20.8. The corresponding upper limit of the Farrington procedure are 15.7 (untransformed), 17.2 (2/3power transformation) and 16 (Poisson quantile) for the QuasiPoisson approach. If we assume that a is the correct nulldistribution (the output of the Farrington algorithm in the example), the 3 times standard deviation rule thus produces too many false alarms—in this particular setting about 0.681% instead of the nominal 0.135%. Continuing this further: If the 7 observations would instead have been a quarter of their value (i.e. 2x1 and 5x2), the corresponding false alarm probability of the approach would raise as high as 9.53%, whereas the Farrington procedure with quantile threshold by construction keeps the nominal level. This illustrates the importance of using count response distributions when counts are small.
0.3 Multivariate Surveillance Methods
Surveillance of univariate data streams typically involves some type of spatial or temporal aggregation. For example, many of the detection algorithms discussed in the previous section assume the disease cases are counted on a daily or weekly basis, or monitor just the total sum of all cases that occur in a municipality, a county, or even a whole nation. If this aggregation is too coarse, it could result in the loss of information useful for the detection of emerging threats to public health. Similarly, the failure to monitor secondary data sources such as overthecounter medicine sales may equate to a forfeiture of the same. These problems motivate the use of multivariate methods, which offer means to treat data at a finer spatial or temporal scale, or to include more sources of information. We start this section by reviewing some of the available extensions of univariate methods to multivariate settings, and then proceed to cover methods for cluster detection in spatiotemporal data. Naturally, our review cannot be allencompassing, and we therefore direct the reader to the books by Lawson and Kleinman (2005), Wagner et al. (2006), and Rogerson and Yamada (2008) for other takes on surveillance methods for spatiotemporal data.
0.3.1 Extensions of Univariate Surveillance Methods
An intuitive approach to the surveillance of multiple data streams is to apply one of the univariate methods from Section 0.2 to each time series monitored. For example, Höhle et al. (2009) applied the Farrington method to the monthly incidence of rabies among foxes in the 26 districts of the German state of Hesse, and did likewise to the aggregated cases for the state as a whole and for its subdivision into three administrative units. Data from the years 1990–1997 were available as a baseline, and the period 1998–2005 was used for evaluation. The authors were able to to pinpoint the districts in which an outbreak occurred in March of 2000. Höhle et al. (2009) argue that using multiple univariate detectors in this hierarchical way is often a pragmatic choice, because many of the analogous multivariate changepoint detection methods used in the literature (see e.g. Rogerson and Yamada, 2004) assume continuous distributions for the data; an assumption hardly realistic for the low count time series often seen after partitioning the total number of cases by region, age, serotype, and so on.
The parallel application of univariate methods does have its downsides, however. Univariate methods such as the Farrington algorithm require either a false alarm probability (significance level) or a threshold to be set prior to an analysis. These are often set to achieve some maximum number of false alarms per month or year (see e.g. Frisén, 2003, for other optimality criteria). If the same conventional is used for detection methods run in parallel, in the absence of an outbreak, the probability of raising at least one false alarm will be much greater than . On the other hand, lowering will make outbreaks harder to detect. Multivariate methods, considered next, do not suffer from the same issues.
0.3.1.1 Scalar Reduction and Vector Accumulation
In the multivariate setting, we suppose that the process under surveillance can be represented as a
variate vector
, where are the time points under consideration. Each component could represent the disease incidence (as a count) of a given region at time , for example. One of the earliest control chart methods of multivariate surveillance is the use of Hotelling’s statistic (Hotelling, 1947). Under the null hypothesis for this method, is assumed to follow a multivariate normal distribution with mean vector and covariance matrix , both typically estimated from the data. Hotelling’s method then reduces the multivariate observation at each timepoint to a scalar statistic, given by(3) 
The Hotelling statistic is thus the squared Mahalanobis distance, which measures the distance between the observed data and the null hypothesis distribution while accounting for the different scales and correlations of the monitored variables. When properly scaled, the statistic has an distribution under the null hypothesis of no outbreak; hence a detection threshold is given by a suitable quantile from this distribution. When dealing with disease count data, however, we know beforehand that the s should increase in case of an outbreak. This prior knowledge is not reflected in the statistic, which penalizes deviations from the mean in either direction. With this motivation, O’Brien (1984) proposed several parametric and nonparametric tests that accomodate alternative hypotheses for consistent departures from the null hypothesis.
A problem with Hotelling statistic, and likewise the methods proposed by O’Brien (1984), is that they will fail to accumulate evidence for an outbreak over time. As noted by Sonesson and Frisén (2005) (in regard to the statistic), this will render the methods ineffective at detecting small to moderate changes in the monitored process. A solution suggested by Crosier (1988) is to first calculate the Hotelling statistic for each timepoint in the surveillance period, take its square root , and then apply a CUSUM (Page, 1954) scheme , where and are chosen together with a threshold to achieve a certain false positive rate, for example. Similarly, Rogerson (1997) devised a CUSUMized version of Tango’s (1995)
retrospective spatial statistic, which assumes a Poisson distribution for the data, and calculates a quadratic form statistic using a distance matrix.
The parallel surveillance and scalar reduction methods can be combined by first calculating the ordinary alarm statistics for each data stream, and then apply a scalar reduction method to the vector of such statistics. A few such approaches are reviewed in Sonesson and Frisén (2005). We also point the reader to the important review papers by Frisén (1992), Sonesson and Bock (2003), Frisén et al. (2010), and Unkel et al. (2012). Finally, for a more general introduction to statistical quality control and the CUSUM method in particular, we recommend the book written by Montgomery (2008).
0.3.1.2 Illustration of Hotelling’s statistic
We now calculate the Hotelling statistic for the (monthly) Meningococcal data introduced in Section 0.1. Because this method assumes that the data follows a multivariate normal distribution, it is at an immediate disadvantage when applied to case data with low counts. We therefore aggregate this data, first over time to form monthly counts for all 413 districts of Germany, and then over space to obtain the corresponding time series for each of the country’s 16 states, i.e. . For the purposes of illustration, rather than epidemiological correctness, we also combine the cases across the two different finetypes B (MenB) and C (MenC). In Figure 3, we show the calculated statistics and the critical values at significance level corresponding to an of 3 years. The years 2002–2003 were used as a baseline period for the estimation of the mean vector and covariance matrix using the standard sample formulas (see e.g. Rencher and Christensen, 2012), and these parameter estimates were updated based on all available data at each time step. Note that it may be advisable to use robust estimators of the mean vector and covariance matrix in practice; we chose the standard estimators here because parameter estimation is not the focus of this illustration. Reinhardt et al. (2008) analyzed the meningococcal disease data for the years 2004–2005 using the EpiScanGIS (National Reference Centre for Meningococci, 2006) software and found one cluster in the period. For comparison, we therefore run our analysis in the same time interval. Figure 3 shows the monthly time series of the statistic (solid line), along with a critical value (dashed line).
As can be seen in Figure 3, Hotelling’s method fails to detect any outbreak at the chosen significance level (here , corresponding to one expected false detection every 3 years). An apparent flaw is that the detection threshold, which is based on an distribution with one of the degrees of freedom parameters increasing with , will decrease monotonically with time, eventually reaching its asymptote. This happens regardless of significance level, and it may therefore be useful to explore other ways of obtaining detection thresholds—simulationbased, perhaps. Another drawback, which is not exclusive to Hotelling’s method, is that even if a detection is made, the method does not inform us of where (or which variables) caused the threshold to be exceeded. Methods that remedy this problem are the topic of the next section.
0.3.2 Spatiotemporal Cluster Detection
For some diseases, it may suffice to monitor the univariate time series of cases aggregated across a country, using the methods described in Section 0.2. A detection signal may then prompt further investigation into the time, location and cause of the potential outbreak. For many diseases however, a large increase in the case count of a small region can be drowned out by the noise in the counts of other regions as they are combined. In order to detect emerging outbreaks of such diseases before they grow large, the level of aggregation needs to be smaller, perhaps even at the level of individual cases—a tradeoff between variability and detecting a pattern. Despite the possibilities of longrange transmission enabled by modern means of transportation, transmission at a local level is still the dominant way in which most infectious diseases spread (Höhle, 2016). Thus, the statistical methods used to detect the clusters that arise should take into account the spatial proximity of cases in addition to their proximity in time. We begin this section by describing methods for areareferenced discretetime data, in which counts are aggregated by region and time period. This is followed by Section 0.3.2.2, discussing methods for continuous space, continuous time data.
0.3.2.1 Methods for AreaReferenced Data
Aggregation by region and time period may in some cases be required by public health authorities due to the way the data collection or reporting works, or for privacy reasons. Legislation and established data formats may thus determine at what granularity surveillance data is available. For example, the CASE surveillance system (Cakici et al., 2010) used by the Public Health Agency of Sweden is limited to data at county (län) level, with the least populous of the 21 counties having a population of about 58,000 people. The monitored data in the areareferenced setting is typically of the form , where denotes the number of (spatial) regions, such as counties, and the index denotes time intervals of equal length. Cluster detection in this setting involves identifying a set of regions in which a disease outbreak is believed to be emerging during the most recent time periods (weeks, for example). We will denote such a spacetime window by , and its complement by . A wellestablished methodology for cluster detection for this task is that of scan statistics, which dates back to the 1960’s with work done by Naus (1965), and that took on its modern form after the seminal papers by Kulldorff and Nagarwalla (1995) and Kulldorff (1997). These methods, which were extended from the spatial to the spatiotemporal context by Kulldorff (2001), are widely used amongst public health departments thanks to the free software SaTScan™ (Kulldorff, 2016).
0.3.2.1.1 Example: Kulldorff’s prospective scan statistic
To illustrate the typical procedure of using a scan statistic designed for spacetime cluster detection, we work through the steps of calculating and applying of the most popular such methods, often simply referred to as Kulldorff’s (prospective) scan statistic (Kulldorff, 2001). This method assumes that the count in region and time follows a Poisson distribution with mean . Here, is an ‘expected count’ or ‘baseline’, proportional to the population at risk in region at time . For example, these could be constrained such that the sum of the expected counts over all regions and time periods equals the total of the observed counts during the time period under study. That is, . The factor , often called the relative risk, is assumed to be the same for all and provided there is no outbreak. This constitutes the null hypothesis. In the case of an outbreak however, it is assumed that the relative risk is higher inside a spacetime window , consisting of a subset of regions and stretching over the most recent time periods. That is, for and we have , while for or it holds that with . Here, is the complement of . This multiplicative increase in the baseline parameters inside a spacetime window is typically how outbreaks are modelled for scan statistics. For scan statistics applied to prospective surveillance, it is important to note that all potential spacetime clusters have a temporal duration that stretches from the most recent time period backwards, without interuptions. This means that no inactive outbreaks are considered, and that any spacetime window included in an analysis can be thought of as a cylinder with a base formed by the perimiter of the geographical regions covered by the window, and a height equal to its length in time.
0.3.2.1.2 Definition and calculation of the scan statistic
Of course, there could be many spacetime windows for which the alternative hypothesis is true. If one would conduct hypothesis tests for each window separately—and the number of such windows could well be in the thousands or hundreds of thousands in typical applications—this would result in a very large number of false positives for standard significance levels such as 0.05. This problem could be counteracted by lowering the nominal significance level to a miniscule value, or by using some other repeated testing strategy, but this would in turn allow only very large outbreaks (in terms of relative to ) to be captured. The solution proposed by Kulldorff (2001) is to focus only on the window that stands out compared to the others, as measured by the size of likelihood ratio statistics calculated for all windows of interest. By calculating the maximum of all such statistics, and using the distribution of this maximum to calculate values, the ‘most anomalous’ spacetime cluster can be identified. To calculate a likelihood ratio for a given spacetime window , one must first calculate the maximum likelihood estimates of the relative risks and . For Kulldorff’s Poisson scan statistic, these are easily computed as
(4) 
where
(5) 
Thus, the likelihood ratio statistic conditional on the window is then given by
(6) 
up to a multiplicative constant not dependent on or . Here, is the indicator function. This statistic is then calculated for all spacetime windows of interest, and the scan statistic is defined as the maximum of all such statistics: . The corresponding window , often called the most likely cluster (MLC), is thus identified.
0.3.2.1.3 Hypothesis testing
Because the distribution of cannot be derived analytically, a Monte Carlo approach to hypothesis testing is often taken, whereby new data for each region and time is simulated under the null hypothesis using the expected counts . For Kulldorff’s scan statistic, the sampling is made conditional on the total observed count , leading to a multinomial distribution over regions and time intervals for the new counts. This sampling is repeated times, and for each sample , a new scan statistic is computed. A Monte Carlo value for the observed scan statistic can then be calculated in standard fashion using its rank amongst the simulated values:
Typically, a number such as or is used in order to get a fixed number of digits for the value. For prospective surveillance, past analyses—and potentially future ones—should also be accounted for when conducting hypothesis tests, in order to avoid a greater number of expected false positives than implied by the nominal significance value (i.e. a multiple testing problem). The solution suggested by Kulldorff (2001) is to expand the set of replicates above by including replicates calculated in past analyses. If too many past analyses are included however, the hypothesis tests could become too conservative. Kulldorff (2001) therefore recommends including only the most recent analyses, where could be chosen to achieve a certain false positive rate during a given monitoring period, for example. This practice has been the subject of a heated debate recently, with Correa et al. (2015a) asserting that any nominal significance level used for prospective surveillance with Kulldorff’s scan statistic is unrelated to the average run length (ARL) and recurrence interval (RI) which are commonly used in prospective surveillance. Similar points have been raised earlier by Woodall (2006), Joner et al. (2008), and Han et al. (2010). In a rebuttal, Kulldorff and Kleinman (2015) explain that in one of the three prospective cases considered by Correa et al. (2015a), the simulations performed actually show the expected result, and that in the other two cases the concerns raised are actually misunderstandings. Correa et al. (2015b) later clarify that failure to account for future analyses remain a concern, despite the comments by Kulldorff and Kleinman (2015). In the latest reply to the debate, Tango (2016) states that both of the previous parties are in fact wrong: Correa et al. (2015a) for being unrealistic in their consideration of an indefinite number of future analyses and for focusing on the ARL in a setting for which the spatial component of outbreaks is at least as important as the temporal one (because the ARL does not inform us of the spatial spreading of the disease), and Kulldorff (2001) and Kulldorff and Kleinman (2015) for performing a prospective analysis conditional on the total number of observed cases in any given study period. Rather, Tango (2016) reiterates the point made by Tango et al. (2011) that such an analysis should be unconditional on the total count. This last point will be expanded upon below, but the overall implication for those wishing to apply scan statistics in prospective settings is to carefully weigh the benefits and costs of setting the threshold for “sounding the alarm” at a particular level.
Given that the number of spacetime windows to be included in the analysis can range in the thousands or hundreds of thousands, the calculation of Monte Carlo values means an fold increase of an already high computational cost. One way to reduce this computational cost is to calculate a smaller number of Monte Carlo replicates of the scan statistic, fit an appropriate distribution to these replicates, and then compute the tail probability of the observed scan statistic from the fitted distribution. Abrams et al. (2010) tested such a procedure for a number of different distributions (Gumbel, gamma, lognormal, normal) on Kulldorff’s scan statistic and others. The authors found that the Gumbel distribution yielded approximate values that were highly accurate in the far tails of the scan statistic distribution, in some scenarios making it possible to achieve the same rejection power with one tenth as many Monte Carlo replicates. Another possibility is to circumvent simulation altogether by comparing the value of the scan statistic computed on the current data to values calculated in the past, provided no outbreaks are believed have been ongoing in the data used for these past calculations. Neill (2009a) compared this approach to Monte Carlo simulation with standard and Gumbel values, and found that the latter two methods for calculating values gave misleading results on three medical data sets, requiring a much lower significance level than originally posited to reach an acceptable level of false positives.
0.3.2.1.4 Cluster construction
A second way of limiting the computational cost of running a scan statistic analysis is to limit the search to clusters with a compact spatial component (zone) , in the sense that all regions inside the cluster are close to one another geographically. This makes sense for both computational and practical reasons, since many of the subsets of all regions are spatially disconnected and therefore not of interest for detection of diseases that emerge locally. For instance, one could limit the search to the nearest neighbors to each region , for , where is some userdefined upper bound (as in Tango et al., 2011), or automatically chosen such that the largest zone with region as center encompasses regions with a combined population that equals approximately 50% of the total population in the whole study area (as in Kulldorff, 2001). The set of zones obtained by these methods will be quite compact, which can be problematic if the true outbreak zone consists of regions that all lie along a riverbank, for example. To capture a richer set of zones, Tango and Takahashi (2005) propose a way to construct ‘flexibly shaped’ zones, by considering all connected subsets of the nearest neighbors of each region , that still contain region itself. This method can yield a vastly larger set of zones, but becomes impractical in terms of runtime for when the number of regions is about 200 or more.
More datadriven approaches to finding the set of interesting zones can also be taken. For example, Duczmal and Assunção (2004) consider the set of all connected subgraphs as the set of allowable zones, and search the most promising clusters using a simulated annealing approach. Here, the nodes in the graph searched are the spatial regions, and edges exists between regions that share a common border. Another more holistic approach is taken by Neill et al. (2013), who propose a framework in which regions are first sorted by priority based on observed counts and estimated parameters, and the set of zones scanned then taken to be only the top regions, for . This method is discussed further in Section 0.3.2.1.7.
0.3.2.1.5 Illustration of Kulldorff’s scan statistic
Once the set of clusters to be scanned have been determined, the analysis can take place. Here, we apply Kulldorff’s prospective scan statistic (Kulldorff, 2001) to the Meningococcal data considered earlier, now aggregated to monthly counts for each of Germany’s 413 districts (kreise). This scan statistic is implemented in the R package scanstatistics (Allévius, 2017) as the function scan_pb_poisson.
In Figure 4, we show the resulting scan statistics for each month of the study period (2004–2005). At each time step, the statistic was calculated using at most the latest 6 months of data, and the for each district and time point was estimated as
(7) 
Here, is the total observed count over all districts and time points, is the length of the study period, and and are the 2008 populations for district and all of Germany, respectively. Critical values (sample quantiles) for the significance level were obtained from Monte Carlo replication with replicates, and previously generated replicates were included in the calculation at each new time step.
The most likely cluster detected by Kulldorff’s prospective scan statistic, for most months scanned, corresponds well to the region of highest disease incidence in the years 2004–2005. The core cluster seems to be four districts in North RhineWestphalia, one of them the city (urban district) Aachen, and coincides with a confirmed cluster of Meningococcal disease discussed in Meyer et al. (2012). This cluster also matches that found by Reinhardt et al. (2008). In Figure 5, we show a map of the counties of North RhineWestphalia, with the detected cluster shaded in gray.
Figure 4 shows that the given significance level results in detection signals during much of the study period. This may not be entirely plausible, and one could see this as an inadequacy of the Monte Carlo method, as discussed in Section 0.3.2.1.3 (see in particular the issues of debate mentioned there). More likely however, is that the Poisson assumption of Kulldorff’s prospective scan statistic is illsuited for data with such an abundance of zeros as the meningococcal data, considering that it assumes a Poisson distribution for the counts. For this type of data, a scan statistic based on e.g. the zeroinflated Poisson distribution (see Cançado et al., 2014; Allévius and Höhle, 2017) may perform better.
0.3.2.1.6 Developments in spacetime scan statistics
Kulldorff’s 2001 prospective scan statistic was introduced above to present the general procedure of using a scan statistic to detect spacetime disease clusters. Of course, many more such methods have been devised since 2001, many to deal with the purported “flaws” of Kulldorff’s statistic. One such drawback of Kulldorff’s prospective scan statistic is that it requires data on the population at risk and its spatial (and possibly temporal) distribution over the study area. The notion of a population at risk may not be applicable in cases where the surveillance data consists of counts of emergency department visists and overthecounter medicine sales, for example. Noting this fact, Kulldorff et al. (2005) formulate a spacetime permutation scan statistic, which uses only the observed counts in the current study period and area to estimate the baselines. Under the assumption that the probability of a case occurring in region , given that it occurred at time , is the same for all times , the baseline (expected value) for the count in region at time is estimated by Kulldorff et al. (2005) as
(8) 
The analysis is thus conditional on the marginal counts over regions and time points, leading to a hypergeometric distribution for the total count inside a spacetime cluster
. This distribution can in turn be approximated by a Poisson distribution when the marginal counts inside the cluster are small compared to the overall total count . Thus, a likelihood ratio statistic can be calculated using Equation (6), leading to a scan statistic of the same form as Kulldorff (2001). For hypothesis testing however, the random sampling is done such that the marginal counts over regions and time points are preserved, rather than just the total count.The prospective (Kulldorff, 2001) and spacetime permutation (Kulldorff et al., 2005) scan statistics both conduct hypothesis testing using expected counts that are conditional on the total or marginal counts in the observed data. This has been met with some critique, particularly from Tango et al. (2011) who give a simple (albeit extreme) example showing that if counts are increased uniformly in all regions under surveillance, these types of scan statistics will fail to detect that something out of the ordinary has happened. Using less extreme circumstances, Neill (2009a) demonstrate that a ‘conditional’ scan statistic of this sort has low power to detect outbreaks that affect a large share of the regions under surveillance. When an outbreak actually is detected in such a scenario, that scan statistic takes longer to do so than the expectationbased scan statistics used for comparison. Introduced for Poissondistributed counts by Neill et al. (2005) and Neill (2006), these scan statistics do not condition on the observed total count in their analysis. Neither do they use the most recent data for baseline parameter estimation; rather they calculate the baselines and other nonoutbreak parameters based on what we can expect to see from past data. In that sense, the analysis is split into two independent parts: First, parameters of the distribution are estimated on historical data believed to contain no outbreaks. This can be done by any method regression preferable; typically by fitting a GLM (McCullagh and Nelder, 1989)
or a moving average to the data. Second, the estimated parameters are used for simulating new counts from the given probability distribution, and plugged into the calculation of the scan statistic on both the observed and simulated data in order to conduct Monte Carlo hypothesis testing. Expectationbased scan statistics have been formulated for the Poisson
(Neill et al., 2005; Neill, 2009b), Gaussian (Neill, 2006), negative binomial (Tango et al., 2011), and zeroinflated Poisson (Allévius and Höhle, 2017) distributions, among others.The drawback of these scan statistics, as well as those mentioned at the beginning of this section, is that they scan over a fixed set of spacetime windows . If the true outbreak cluster is not among the windows scanned, the outbreak cannot be exactly identified. As also mentioned, the number of windows to be scanned poses a computational burden, particularly if Monte Carlo hypothesis is to be conducted. To overcome these issues, Neill (2012) proposes a ‘Fast Subset Scan’ framework for making fast searches for the topscoring cluster, unconstrained by any predefined set of windows to be scanned. In particular, Neill introduces a ‘Linear Time Subset Scanning’ (LTSS) property, which is shown to hold for several members of the exponential family of distributions (or rather, scan statistics based thereon). For scan statistics with this property, a score function (such as the likelihood ratio in Equation (6)) is paired with a priority function, defined e.g. as the ratio of of count to baseline. The latter function is used to sort the data in order of priority, after which the former function only needs to be applied to increasing subsets of the ordered records. This allows an unconstrained search for the topscoring subset in linear time (plus the cost of sorting the data records); the method can also be modified to search only for clusters fulfilling spatial constraints, such as subsets of the nearest neighbors of each region. The Fast Subset Scan framework can also be extended to multivariate spacetime data, as covered next.
0.3.2.1.7 Multivariate scan statistics
Until now, the scan statistics described have been univariate in the sense that the counts typically consist of cases of a single disease or symptoms thereof, each count having both spatial and temporal attributes. In practice however, public health authorities monitor a multitude of such data streams simultaneously. If each analysis is done without regard for the others, this will undoubtedly yield a number of false positives higher than desirable. Alternatively, if the significance level of each analysis is adjusted for the others, the power to detect an outbreak in any of the data streams diminishes. These issues can become by analyzing all data streams jointly, an endeavour that can be particularly fruitful when the streams pertain to the same phenomenon to be detected; typically a disease outbreak. For example, some diseases have multiple symptoms and patients may seek help in different ways. A simultaneous increase in overthecounter flu medication sales at pharmacies and respiratory symptoms reported at a local clinic may in such a case be indicative of an influenza outbreak, but this signal of an outbreak may be missed if each of the two data streams are considered individually.
With this motivation and inspired by an idea of Burkom (2003), Kulldorff et al. (2007) formulate a multivariate scan statistic based on Kulldorff’s prospective scan statistic (Kulldorff, 2001). The data in this setting can be represented as a collection of vector counts , where are the counts for each of the data streams monitored. The scan statistic is calculated by first processing each data stream separately, calculating a likelihood ratio statistic using Equation (6) for each spacetime window , just as is done with the univariate scan statistic. For those windows whose aggregated count exceeds the aggregated baseline, the logarithm of these likelihood ratios are added to form a statistic for the window as a whole. The scan statistic is then defined as the maximum of all such statistics, so that the most likely cluster can be identified as the regions, time intervals and data streams making a positive contribution to the maximum statistic.
Building upon the Fast Subset Scan framework (Neill, 2012) cited earlier, Neill et al. (2013) present two computationally efficient ways to detect clusters in spacetime data, with and without spatial constraints on these clusters. The first method, Subset Aggregation, is an extension of the work done by Burkom (2003). It assumes that if an outbreak occurs, it has a multiplicative effect on the baselines that is equal across all regions, time points and data streams affected by the outbreak. That is, for those regions, times and streams affected by the outbreak, the expected value of the count is rather than . This allows the counts and baselines within each subset (cluster) to be aggregated, so that the Subset Aggregation scan statistic can be reduced to a univariate scan statistic for the cluster. The second method is the one previously proposed by Kulldorff et al. (2007), in which an outbreak affects each data stream separately through a streamspecific multiplicative factor . Neill et al. (2013) then demonstrates how these methods can be combined with the Fast Subset Scan framework for scan statistics satsifying the LTSS property (Neill, 2012), yielding fast, exact detection algorithms when either the number of data streams or the number of regions are small, and fast approximate (randomized) algorithms when there are too many regions and streams for all subsets of each to be scanned. If hypothesis testing is to take place, this can be done using Monte Carlo replication as described earlier. Again, such replication can come at a high computational cost, and some efforts have therefore been made to avoid it altogether.
0.3.2.1.8 Bayesian scan statistics
Neill et al. (2006) introduce the Bayesian Spatial scan statistic for cluster detection, based on Kulldorff’s 1997 original scan statistic. The method is easily extended to a spatiotemporal setting, which is that described below. In Kulldorff’s 2001 model, the data are assumed to be Poisson distributed with expected values , the relative risk varying depending on whether an outbreak is ongoing or not, and estimated by maximum likelihood. In the model of Neill et al. (2006), the parameters
are instead given conjugate gamma distribution priors, with prior probabilities tuned to match the occurrence of an outbreak in each possible outbreak cluster considered. With the conjugate prior for the relative risks, and baselines
estimated from historical data, simple analytical formulas can be derived for the marginal probabilities of the data (relative risks integrated out), in the end resulting in the posterior probability of an outbreak for each cluster considered, and for the nonoccurrence of an outbreak. Thus, no Monte Carlo replications need to be made.
To examplify the Bayesian scan statistic (Neill et al., 2006)
, suppose the null hypothesis of no outbreak states that each count is distributed as a Poisson random variable with mean
, where is fixed and . After marginalizing over the distribution of , the likelihood under the null hypothesis becomes the negative binomial (a.k.a. gammaPoisson mixture) probability mass function:(9) 
where represents the entire data set, is the sum of all counts, and the sum of all baselines . The alternative hypothesis states that an outbreak is occuring in a spacetime window , with a prior distribution placed over all potential windows . For a given , it is assumed that counts inside have a relative risk distributed as , while those counts outside have a corresponding distribution for with parameters and . After marginalizing over the relative risk distributions, the likelihood becomes
(10) 
where and is the sum of counts and baselines inside , respectively, , and . With a prior placed on the null hypothesis, and similarly for each outbreak scenario, one obtains the posterior probabilities
(11)  
(12) 
where . Neill et al. (2006) gives advice for eliciting the priors and
, and also for specification of the hyperparameters of each relative risk distribution.
The spacetime extension of the Bayesian spatial scan statistic (Neill et al., 2006) is available as the function scan_bayes_negbin in the scanstatistics R package (Allévius, 2017). To illustrate, we run this scan statistic on the same data as in the illustration of Kulldorff’s scan statistic above. As hyperparameters, we set to the estimate in Equation (8), and let all gamma distribution parameters and be equal to 1. The exception is , which we assume to be the same for all . We give this parameter a discrete uniform prior on equally spaced values between 1 and 15 in the first month, and let subsequent months use the posterior distribution from the previous month as a prior. We also set , which is on the order of magnitude of the incidence of meningococcal disease in Germany in 2002–2008. In Figure 6, we show the posterior outbreak probability at each month of the analysis, for the spacetime window which maximizes this probability.
Figure 6 shows clear similarities to the scores calculated for Kulldorff’s scan statistic, shown in Figure 4. Further, half of all most likely clusters reported by each method are the same. The difference, as discussed earlier, is that the output of the Bayesian scan statistic is a posterior probability rather than the maximum of a likelihood ratio
The Bayesian scan statistic was later extended to a multivariate setting by Neill and Cooper (2010). This Multivariate Bayesian Scan Statistic (MBSS) is also capable of detecting multiple event types, thus allowing it, for example, to assign probabilities to outbreaks of different diseases based on records of symptom data. A downside of this method is that the set of clusters to be searched must be specified, and the prior probability of an outbreak in each is (typically) uniform over all clusters, regardless of size and shape. To remedy this defect, Neill (2011) modify the MBSS by specifying a hierarchical spatial prior over all subsets of regions to be scanned. This method is shown to be superior to the MBSS in spatial accuracy and detection timeliness, yet remaining computationally efficient. More recently, for univariate data, a Baysian scan statistic based on the zeroinflated Poisson distribution has been proposed by Cançado et al. (2017).
0.3.2.1.9 Alternative methods for cluster detection in areareferenced data
Scan statistics are not the only option for cluster detection of the sort discussed above. To give an example, there is the What’s Strange About Recent Events? (WSARE) (Wong et al., 2005)
method, available as software, which can detect outbreaks in multivariate spacetime data sets by using a rulebased technique to compare the observed data against a baseline distribution. WSARE uses Bayesian networks, association rules and a number of other techniques to produce a robust detection algorithm. Other examples with more detail than space allows for here can be found e.g. in
Rogerson and Yamada (2008).0.3.2.2 Methods for Point Process Data
The methods discussed in the previous section were applicable to data which has been aggregated over space and time. While such an accumulation may be the inherent form of the data, the loss of granularity could impede both the timeliness and spatial accuracy with which outbreaks are detected. When data with exact coordinates and time stamps are available, it may thus be beneficial to analyze data in this format. In this section we therefore assume that the available data is of the form , where are the coordinates (longitude and latitude) of the event (disease case, typically), and the time of occurrence. We assume an ordering , and that the study region is defined by the study area and the surveillance timer interval .
One starting point to analyze such data is to adapt a purely temporal surveillance method to a spatiotemporal setting. Assunção and Correa (2009) does so by combining a nonhomogeneous Poisson point process partially observed on with the ShirayevRoberts (SR) statistic (Shirayev, 1963; Roberts, 1966; Kennet and Pollak, 1996), utilizing the martingale property of the latter to establish a protocol for achieving a desired ARL with minimal parameter input by the user. Aside from the ARL, the user needs to specify a radius defining the maximum spatial extent of the outbreak cluster, as well as a parameter which measures the relative change in the density within the outbreak cluster as compared to the nonoutbreak situation. The SR statistic in Assunção and Correa’s (2009) formulation is then defined as
(13)  
(14) 
Here, is the cylinder defined by the ball of radius centered on the location of the th event and the time interval between the th and the th event, is the number of events inside that cylinder, is the expected number of events inside the cylinder if there is no spacetime clustering, and is an indicator taking the value 1 if the th event is inside and 0 otherwise. The outbreak signal goes off if is larger than the specified ARL, and the outbreak cluster is identified as the cylinder for which is maximized. The quantity requires the specification of two densities related to the process’ intensity function, but since this is too much to ask from the user, is instead replaced by the estimate
(15) 
i.e. the product of the number of events within the disk , regardless of time of occurrence, and the number of events that occurred in the time interval anywhere within the whole study region, divided by the total number of events. Every new incoming event thus requires the recalculation of all terms in Equation (13), which Assunção and Correa (2009) demonstrate can be done in an efficient iterative procedure.
Assunção and Correa’s (2009) method was later extended by Veloso et al. (2013) to handle the detection of multiple spacetime clusters. This is accomplished by (randomly) deleting excess events inside the detected clusters and rerunning the method with a new ARL threshold corrected for the event deletion.
There have also been attempts to adapt retrospective methods for detection of spacetime point event clusters to a prospective setting. For example, Rogerson (2001) formulate a local version of the (retrospective) Knox statistic (Knox, 1964) and combine it with a CUSUM framework to make the method suitable for prospective surveillance. However, Marshall et al. (2007) later concluded that it is certainly not, owing to a number of factors that strongly influence the performance of the method, and which a user has little power to regulate properly. A remedy is offered by Piroutek et al. (2014), who redefine the local Knox statistic by Rogerson (2001) to be prospective rather than retrospective. For a given observation indexed by , the local Knox statistic is defined as the number of observations that are closer than units of time to , and whose coordinates are less than a distance away from . In Rogerson (2001), the closeness in the temporal dimension is measured in both directions, so that a counts nearby events both before and after the th event. This account of future events is included in the CUSUM chart, which is not appropriate. Piroutek et al. (2014) instead let only past events enter into the calculation of , which turns out to vastly improve the performance of the method in a prospective setting. As noted by Marshall et al. (2007), there are also problems with the normal distribution approximation made to the statistic calculated for the CUSUM chart. Namely, it can be very poor if the thresholds and are small, and this in turn makes it difficult to set the appropriate control limit to achieve a given ARL. Piroutek et al. (2014) instead propose to use a randomization procedure using past data to establish the correct control limit, obviating the need of an approximate distribution.
Paiva et al. (2015) later combines the modified local Knox statistic by Piroutek et al. (2014) with the Cumulative Surface method proposed by Simões and Assunção (2005), allowing for a visualization of clusters in three dimensions. In their method, the local Knox score of each event is smeared using a bivariate Gaussian kernel function, and a threshold for detection is defined through the distribution of the stochastic surfaces formed. In all, the method requires few parameters as input from users, and needs no information of the population at risk.
0.3.2.2.1 Illustration of Assunção and Correa’s SR statistic
We now illustrate the method by Assunção and Correa (2009) on the meningococcal data (finetype B only) considered earlier, this time using the coordinates and time stamps of each event, rather than an aggregated count. The method is implemented as the function stcd in the R package surveillance (Salmon et al., 2016). As a validation of the method, we run the analysis for approximately the same period as Reinhardt et al. (2008), hoping that the results will be similar. We guide our choice of the parameter required by this method by the analysis of the meningococcal data by Meyer et al. (2012); Figure 3 of this paper suggests that setting km is adequate. For the choice of the parameter , which is the relative change of eventintensity within the tobedetected cluster, we are guided by the simulation study by Assunção and Correa (2009) and set . Lastly, we set the desired ARL to 30 days.
In Figure 7, the detected cluster is shown, with all events up until the point of detection marked as triangles on the map.
The cluster detected by the stcd function is centered on the city of Aachen in the state North RhineWestphalia, which corresponds well to the cluster marked in Figure 3 of Reinhardt et al. (2008), and the dates also appear similar. The cluster was detected only one day after its estimated start date, showing the potential of Assunção and Correa’s method in terms of timeliness. All in all, the ability to use the spatial and temporal attributes of each individual event for the detection of clusters is an attractive feature when greater exactness in the origins and spatial extent of outbreaks is important.
0.4 Summary and Outlook
This chapter presented temporal and spatiotemporal statistical methods for the prospective detection of outbreaks in routine surveillance data. Such datadriven algorithms operate at the intersection of statistical methodology, data mining and computational (big) data crunching. Simple methods have their virtues and speed of implementation can be of importance; however, this should never be an excuse for ignoring statistical facts when counts get small and sparse. Facts which might surprise the nonstatisticians uncomfortable beyond the normal distribution.
Despite the many advances in the last 30 years, outbreak detection
algorithms will never replace traditional epidemiological
alertness. Nevertheless, they offer support for using the digital
epidemiologist’s time more effectively. From a systems perspective,
however, it is our impression that detection algorithms have had
limited impact on practical outbreak detection so far. One reason is
that, historically, many algorithmic suggestions were not supported by
accessible software implementations. Another reason is that their
their usefulness is questioned by the many false alarms and a
misalignment between the users’ needs and presentation of the alarms
found by the system. Statisticians—as part of interdisciplinary
teams—need to worry more about how to rollout the proposed
algorithms in the public health organizations. As a step in this
direction, all detection methods presented in this chapter are readily
available from opensource packages in R. In particular, the
surveillance and scanstatistics packages^{2}^{2}2The R code
producing the results and graphics of this manuscript is available from
https://github.com/BenjaK/ProspectiveDetectionChapter. Examples of
software supporting national surveillance systems by increased user
focus are the Swedish CASE system (Cakici et al., 2010) and the
German avoid@RKI (Salmon et al., 2016a).
Acknowledgements
Benjamin Allévius was supported by grant 2013:05204 by the Swedish Research Council. Michael Höhle was supported by grant 201505182_VR by the Swedish Research Council.
The authors thank Renato Assunção and Leonhard Held for their insightful feedback during the writing of the chapter.
References
 Abrams et al. (2010) Abrams, A. M., Kleinman, K., and Kulldorff, M. (2010). Gumbel based pvalue approximations for spatial scan statistics. International Journal of Health Geographics, 9(61).
 Allévius and Höhle (2017) Allévius, B. and Höhle, M. (2017). An expectationbased spacetime scan statistic for ZIPdistributed data. Technical report, Department of Mathematics, Stockholm University.

Allévius (2017)
Allévius, B. (2017).
scanstatistics: SpaceTime Anomaly Detection using Scan Statistics
. R package version 1.0.0.  Andersson et al. (2014) Andersson, T., Bjelkmar, P., Hulth, A., Lindh, J., Stenmark, S., and Widerstrom, M. (2014). Syndromic surveillance for local outbreak detection and awareness: evaluating outbreak signals of acute gastroenteritis in telephone triage, webbased queries and overthecounter pharmacy sales. Epidemiol. Infect., 142(2):303–313.
 Assunção and Correa (2009) Assunção, R. M. and Correa, T. (2009). Surveillance to detect emerging spacetime clusters. Computational Statistics and Data Analysis, 53(8):2817–2830.
 Burkom (2003) Burkom, H. S. (2003). Biosurveillance applying scan statistics with multiple, disparate data sources. Journal of Urban Health : Bulletin of the New York Academy of Medicine, 80(2, Supplement 1):i57–i65.
 Cakici et al. (2010) Cakici, B., Hebing, K., Grünewald, M., Saretok, P., and Hulth, A. (2010). Case: a framework for computer supported outbreak detection. BMC Medical Informatics and Decision Making, 10(14).
 Cançado et al. (2014) Cançado, A. L. F., DaSilva, C. Q., and da Silva, M. F. (2014). A spatial scan statistic for zeroinflated Poisson process. Environmental and Ecological Statistics, 21(4):627–650.
 Cançado et al. (2017) Cançado, A. L. F., Fernandes, L. B., and DaSilva, C. Q. (2017). A Bayesian spatial scan statistic for zeroinflated count data. Spatial Statistics, 20:57–75.
 Correa et al. (2015a) Correa, T., Assunção, R. M., and Costa, M. A. (2015a). A critical look at prospective surveillance using a scan statistic. Statistics in Medicine, 34(7):1081–1093.
 Correa et al. (2015b) Correa, T., Assunção, R. M., and Costa, M. A. (2015b). Response to commentary on ’A critical look at prospective surveillance using a scan statistic’. Statistics in Medicine, 34(7):1096.
 Crosier (1988) Crosier, R. B. (1988). American Society for Quality Multivariate Generalizations of Cumulative Sum QualityControl Schemes Multivariate Generalizations of Cumulative Sum QualityControl Schemes. Technometrics, 30(3):291–303.
 Duczmal and Assunção (2004) Duczmal, L. H. and Assunção, R. M. (2004). A simulated annealing strategy for the detection of arbitrarily shaped spatial clusters. Computational Statistics and Data Analysis, 45(2):269–286.
 Farrington and Andrews (2003) Farrington, C. and Andrews, N. (2003). Outbreak detection: Application to infectious disease surveillance. In Brookmeyer, R. and Stroup, D., editors, Monitoring the Health of Populations, chapter 8, pages 203–231. Oxford University Press.
 Farrington et al. (1996) Farrington, C., Andrews, N., Beale, A., and Catchpole, M. (1996). A statistical algorithm for the early detection of outbreaks of infectious disease. Journal of the Royal Statistical Society, Series A, 159:547–563.
 Fricker et al. (2008) Fricker, R. D., Hegler, B. L., and Dunfee, D. A. (2008). Comparing syndromic surveillance detection methods: Ears’ versus a cusumbased methodology. Statistics in Medicine, 27(17):3407–3429.
 Frisén (1992) Frisén, M. (1992). Evaluations of methods for statistical surveillance. Statistics in Medicine, 11(11):1489–1502.
 Frisén (2003) Frisén, M. (2003). Statistical Surveillance. Optimality and Methods. International Statistical Review, 71(2):403–434.
 Frisén et al. (2010) Frisén, M., Andersson, E., and Schiöler, L. (2010). Evaluation of multivariate surveillance. Journal of Applied Statistics, 37(12):2089–2100.
 Han et al. (2010) Han, S. W., Tsui, K. L., Ariyajuny, B., and Kim, S. B. (2010). A comparison of CUSUM, EWMA, and temporal scan statistics for detection of increases in poisson rates. Quality and Reliability Engineering International, 26(3):279–289.
 Höhle (2016) Höhle, M. (2016). Infectious Disease Modelling. In Lawson, A. B., Banerjee, S., Haining, R. P., and Ugarte, M. D., editors, Handbook of Spatial Epidemiology, chapter 26, pages 467–490. Chapman and Hall/CRC, 1 edition.
 Höhle and Mazick (2011) Höhle, M. and Mazick, A. (2011). Aberration detection in R illustrated by Danish mortality monitoring. In KassHout, T. and Zhang, X., editors, Biosurveillance: A Health Protection Priority, pages 215–238. CRC Press.
 Höhle et al. (2009) Höhle, M., Paul, M., and Held, L. (2009). Statistical approaches to the monitoring and surveillance of infectious diseases for veterinary public health. Preventive Veterinary Medicine, 91(1):2–10.
 Hotelling (1947) Hotelling, H. (1947). Multivariate quality control, illustrated by the air testing of sample bombsights. In Techniques of Statistical Analysis, chapter 3, pages 111–184. McGrawHill, 1 edition.
 Höhle (2017) Höhle, M. (2017). A statistician’s perspective on digital epidemiology. Life Sciences, Society and Policy. In revision.
 Joner et al. (2008) Joner, M. D., Woodall, W. H., and Reynolds, M. R. (2008). Detecting a rate increase using a Bernoulli scan statistic. Statistics in Medicine, 27(14):2555–2575.
 Kennet and Pollak (1996) Kennet, R. S. and Pollak, M. (1996). Dataanalytic aspects of the shiryayev–roberts control chart: Surveillance of a nonhomogeneos poisson process. Journal of Applied Statistics, 1(23):125–138.
 Knox (1964) Knox, E. G. (1964). The Detection of SpaceTime Interactions. Applied Statistics, 13(1):25.
 Kulldorff (1997) Kulldorff, M. (1997). A spatial scan statistic. Communications in Statistics  Theory and Methods, 26(6):1481–1496.
 Kulldorff (2001) Kulldorff, M. (2001). Prospective time periodic geographical disease surveillance using a scan statistic. Journal of the Royal Statistical Society Series aStatistics in Society, 164:61–72.
 Kulldorff (2016) Kulldorff, M. (2016). Satscan™. https://www.satscan.org.
 Kulldorff et al. (2005) Kulldorff, M., Heffernan, R., Hartman, J., Assunção, R. M., and Mostashari, F. (2005). A spacetime permutation scan statistic for disease outbreak detection. PLoS Medicine, 2(3):0216–0224.
 Kulldorff and Kleinman (2015) Kulldorff, M. and Kleinman, K. (2015). Comments on ’A critical look at prospective surveillance using a scan statistic’ by T. Correa, M. Costa, and R. Assunção. Statistics in Medicine, 34(7):1094–1095.
 Kulldorff et al. (2007) Kulldorff, M., Mostashari, F., Duczmal, L. H., Yih, K., Kleinman, K., and Platt, R. (2007). Multivariate scan statistics for disease surveillance. Statistics in Medicine, 26(8):1824–1833.
 Kulldorff and Nagarwalla (1995) Kulldorff, M. and Nagarwalla, N. (1995). Spatial disease clusters: Detection and inference. Statistics in Medicine, 14(8):799–810.
 Lawson and Kleinman (2005) Lawson, A. B. and Kleinman, K. (2005). Spatial and Syndromic Surveillance for Public Health. Wiley.
 Le Strat (2005) Le Strat, Y. (2005). Overview of temporal surveillance. In Lawson, A. and Kleinman, K., editors, Spatial and syndromic surveillance for public health, chapter 3, pages 31–52. Wiley.
 Marshall et al. (2007) Marshall, B. J., Spitzner, D. J., and Woodall, W. H. (2007). Use of the local Knox statistic for the prospective monitoring of disease occurrences in space and time. Statistics in Medicine, 26(7):1579–1593.
 McCullagh and Nelder (1989) McCullagh, P. and Nelder, J. A. (1989). Generalized Linear Models. Chapman and Hall/CRC, 2 edition.
 Meyer et al. (2012) Meyer, S., Elias, J., and Höhle, M. (2012). A SpaceTime Conditional Intensity Model for Invasive Meningococcal Disease Occurrence. Biometrics, 68(2):607–616.
 Meyer et al. (2017) Meyer, S., Held, L., and Höhle, M. (2017). Spatiotemporal analysis of epidemic phenomena using the R package surveillance. Also available as vignettes of the R package surveillance.
 Montgomery (2008) Montgomery, D. C. (2008). Introduction to Statistical Quality Control. Wiley, 6 edition.
 National Reference Centre for Meningococci (2006) National Reference Centre for Meningococci (2006). Episcangis.
 Naus (1965) Naus, J. I. (1965). The Distribution of the Size of the Maximum Cluster of Points on a Line. Journal of the American Statistical Association, 60(310):532–538.
 Neill (2006) Neill, D. B. (2006). Detection of spatial and spatiotemporal clusters. Phd thesis, Carnegie Mellon University.
 Neill (2009a) Neill, D. B. (2009a). An empirical comparison of spatial scan statistics for outbreak detection. International Journal of Health Geographics, 8(20).
 Neill (2009b) Neill, D. B. (2009b). Expectationbased scan statistics for monitoring spatial time series data. International Journal of Forecasting, 25(3):498–517.
 Neill (2011) Neill, D. B. (2011). Fast Bayesian scan statistics for multivariate event detection and visualization. Statistics in Medicine, 30(5):455–469.
 Neill (2012) Neill, D. B. (2012). Fast subset scan for spatial pattern detection. Journal of the Royal Statistical Society. Series B: Statistical Methodology, 74(2):337–360.
 Neill and Cooper (2010) Neill, D. B. and Cooper, G. F. (2010). A multivariate Bayesian scan statistic for early event detection and characterization. Machine Learning, 79(3):261–282.
 Neill et al. (2013) Neill, D. B., McFowland III, E., and Zheng, H. (2013). Fast subset scan for multivariate event detection. Statistics in Medicine, 32(13):2185–2208.
 Neill et al. (2006) Neill, D. B., Moore, A. W., and Cooper, G. F. (2006). A Bayesian Spatial Scan Statistic. Advances in Neural Information Processing Systems, 18:1003.
 Neill et al. (2005) Neill, D. B., Moore, A. W., Sabhnani, M., and Daniel, K. (2005). Detection of emerging spacetime clusters. In Proceeding of the eleventh ACM SIGKDD international conference on Knowledge discovery in data mining  KDD ’05, page 218.
 Noufaily et al. (2013) Noufaily, A., Enki, D. G., Farrington, P., Garthwait, P., Andrews, N., and Charlett, A. (2013). An improved algorithm for outbreak detection in multiple surveillance systems. Statistics in Medicine, 32(7):1206–1222.
 O’Brien (1984) O’Brien, P. C. (1984). Procedures for Comparing Samples with Multiple Endpoints. Biometrics, 40(2):1079–1087.
 Page (1954) Page, E. S. (1954). Procedures for Comparing Samples with Multiple Endpoints. Biometrica, 41(1/2):100–115.
 Paiva et al. (2015) Paiva, T., Assunção, R. M., and Simões, T. (2015). Prospective spacetime surveillance with cumulative surfaces for geographical identification of the emerging cluster. Computational Statistics, 30(2):419–440.
 Piroutek et al. (2014) Piroutek, A., Assunção, R. M., and Paiva, T. (2014). Spacetime prospective surveillance based on Knox local statistics. Statistics in Medicine, 33(16):2758–2773.
 Reinhardt et al. (2008) Reinhardt, M., Elias, J., Albert, J., Frosch, M., Harmsen, D., and Vogel, U. (2008). EpiScanGIS: an online geographic surveillance system for meningococcal disease. International journal of health geographics, 7:33.

Rencher and Christensen (2012)
Rencher, A. C. and Christensen, W. F. (2012).
Methods of Multivariate Analysis
. Wiley Series in Probability and Statistics. John Wiley & Sons, Inc., Hoboken, NJ, USA.  Roberts (1966) Roberts, S. W. (1966). On the detection of disorder in a manufacturing process. Technometrics, 3(8):411–430.
 Rogerson and Yamada (2008) Rogerson, P. and Yamada, I. (2008). Statistical Detection and Surveillance of Geographic Clusters. Chapman and Hall/CRC.
 Rogerson (1997) Rogerson, P. A. (1997). Surveillance systems for monitoring the development of spatial patterns. Statistics in Medicine, 16(18):2081–2093.
 Rogerson (2001) Rogerson, P. A. (2001). Monitoring point patterns for the development of spacetime clusters. Journal of the Royal Statistical Society: Series A (Statistics in Society), 164(1):87–96.
 Rogerson and Yamada (2004) Rogerson, P. A. and Yamada, I. (2004). Monitoring change in spatial patterns of disease: Comparing univariate and multivariate cumulative sum approaches. Statistics in Medicine, 23(14):2195–2214.
 Salmon et al. (2016a) Salmon, M., Schumacher, D., Burman, H., Frank, C., Claus, H., and Höhle, M. (2016a). A system for automated outbreak detection of communicable diseases in Germany. Eurosurveillance, 21(13):pii=3018.
 Salmon et al. (2016b) Salmon, M., Schumacher, D., and Höhle, M. (2016b). Monitoring count time series in R: Aberration detection in public health surveillance. Journal of Statistical Software, 70(10). Also available as vignette of the R package surveillance.
 Salmon et al. (2016) Salmon, M., Schumacher, D., and Höhle, M. (2016). Monitoring count time series in R: Aberration detection in public health surveillance. Journal of Statistical Software, 70(1):1–35.
 Shirayev (1963) Shirayev, A. N. (1963). On the detection of disorder in a manufacturing process. Theory of Probability and its Applications, 3(8):247–265.
 Simões and Assunção (2005) Simões, T. and Assunção, R. M. (2005). Geoinfo 2005  vii simpósio brasileiro de geoinformática. In Sistema de vigilância para detecção de interações espaçotempo de eventos pontuais, pages 281–291.
 Sonesson and Bock (2003) Sonesson, C. and Bock, D. (2003). A review and discussion of prospective statistical surveillance in public health. Journal of the Royal Statistical Society. Series A: Statistics in Society, 166(1):5–21.
 Sonesson and Frisén (2005) Sonesson, C. and Frisén, M. (2005). Multivariate Surveillance. In Spatial and Syndromic Surveillance for Public Health, volume 3, pages 153–166. John Wiley & Sons, Ltd.
 Stroup et al. (1989) Stroup, D., Williamson, G., Herndon, J., and Karon, J. (1989). Detection of aberrations in the occurrence of notifiable diseases surveillance data. Statistics in Medicine, 8:323–329.
 Tango (1995) Tango, T. (1995). A class of tests for detecting ‘general’ and ‘focused’ clustering of rare diseases. Statistics in Medicine, 14(2122):2323–2334.
 Tango (2016) Tango, T. (2016). On the recent debate on the spacetime scan statistic for prospective surveillance. Statistics in Medicine, 35(11):1927–1928.
 Tango and Takahashi (2005) Tango, T. and Takahashi, K. (2005). A flexibly shaped spatial scan statistic for detecting clusters. International Journal of Health Geographics, 4(11):15.
 Tango et al. (2011) Tango, T., Takahashi, K., and Kohriyama, K. (2011). A SpaceTime Scan Statistic for Detecting Emerging Outbreaks. Biometrics, 67(1):106–115.
 Unkel et al. (2012) Unkel, S., Farrington, C. P., Garthwaite, P. H., Robertson, C., and Andrews, N. (2012). Statistical methods for the prospective detection of infectious disease outbreaks: a review. Journal of the Royal Statistical Society: Series A (Statistics in Society), 175(1):49–82.
 Veloso et al. (2013) Veloso, B., Iabrudi, A., and Correa, T. (2013). Towards efficient prospective detection of multiple spatiotemporal clusters. In Proceedings of the Brazilian Symposium on GeoInformatics, pages 61–72.
 Wagner et al. (2006) Wagner, M. M., Moore, A. W., and Aryel, R. M. (2006). Handbook of Biosurveillance. Academic Press.
 Wong et al. (2005) Wong, W.K., Moore, A. W., Cooper, G. F., and Wagner, M. (2005). What’s Strange About Recent Events (WSARE): An Algorithm for the Early Detection of Disease Outbreaks. Journal of Machine Learning Research, 6:1961–1998.
 Wood (2006) Wood, S. (2006). Generalized Additive Models: An Introduction with R. Chapman & Hall/CRC.
 Woodall (2006) Woodall, W. H. (2006). The Use of Control Charts in HealthCare and PublicHealth Surveillance. Journal of Quality Technology, 38(2):89–104.
Comments
There are no comments yet.