The analysis of infectious disease data is essential for strengthening surveillance systems and supporting the response to public health threats. Statistical modelling of infectious disease data needs to account for a variety of aspects including seasonalities, trends, the effect of concomitant infectious diseases, human activities that may favour the disease diffusion and external factors such as weather conditions (Lessler et al., 2016). Methods for the analysis of infectious diseases have continuously enriched the literature since the early twentieth century (Kermack and McKendrick, 1927), see Held and Paul (2013) for a detailed account.
Although the statistical literature emphasizes the need of proper accounting for serial correlation in time series of disease counts (Zeger et al., 2006; Unkel et al., 2012), regression models that assume independent observations are commonly used by national epidemiological surveillance systems (e.g., Noufaily et al., 2013). Failing to account for the presence of serial correlation may lead to incorrect inferential results. A popular illustration is the series of poliomyelitis infections in the United States of America during the period 1970–1983 (Zeger, 1988). Standard analysis of these data with a periodic regression model that includes annual and semi-annual seasonality indicates the presence of a strong statistically significant decreasing linear trend during the observation period. However, various models that extend the periodic regression to consider serial correlation agree that the trend in poliomyelitis is not statistically significant (Davis et al., 1999).
Among various approaches suggested in the literature to incorporate serial correlation in regression models for counts, this paper will focus on latent autoregressive models also known as parameter-driven models (Cox, 1981) or state space models for counts (Durbin and Koopman, 2012)
. These models are attractive because of their flexible hierarchical structure and the simple interpretation of the model parameters. In latent autoregressive models, the serial dependence is described through an unobserved stationary Gaussian autoregressive process. Inference in latent autoregressive models is cumbersome because the likelihood function is an intractable high-dimensional integral. Various simulation strategies for approximate likelihood and Bayesian inference have been suggested in the literature. Examples include the Monte Carlo expectation-maximization, Markov chain Monte Carlo simulation, penalized quasi-likelihood, importance sampling, Laplace and Gaussian approximations, see for instanceDavis and Dunsmuir (2016) and the references therein.
In this paper, we consider pairwise likelihood inference (Varin et al., 2011) based on a combination of likelihoods for pairs of observations. The pairwise likelihood offers a significant reduction of the computational cost related to the ordinary likelihood by replacing the high-dimensional likelihood integral with a limited set of double integrals. Pairwise likelihood methods have been already considered for fitting latent autoregressive models in Varin and Vidoni (2009), Davis and Yau (2011) and Ng et al. (2011).
This paper aims at highlighting methodological and implementation aspects that play an important role to obtain nearly efficient and numerically stable fitting of latent autoregressive models for counts using the method of maximum pairwise likelihood. The first issue considered in this paper is which pairs should be included in the pairwise likelihood for efficient estimation. We consider the pairwise likelihood of order (Varin and Vidoni, 2009) formed by pairs of observations far apart not more than
units and discuss selection of the order. To this end, we extend previous results on the asymptotic variance of the first-order pairwise likelihood estimator described inDavis and Yau (2011). We also consider the use of weights that downweight the contribution of pairs formed from distant observations.
A second issue treated in this paper regards the role of the numerical methods used for approximating the double integrals appearing in the pairwise likelihood function. Previous papers used standard Gauss-Hermite cubature (Davis and Yau, 2011) or adaptive Gauss-Hermite cubature (Ng et al., 2011). We compare Gauss-Hermite and adaptive cubature and discuss how the numerical integration method affects also the computation of standard errors and model selection.
The rest of the paper is organized as follows. Section 2 briefly reviews latent autoregressive models for counts and sets up the notation used through the paper. Section 3 details the pairwise likelihood approach to the estimation of latent autoregressive models. Finite sample properties of the pairwise likelihood estimators are assessed with simulations in Section 4. Section 5 illustrates the methodology with the analysis of the monthly counts of invasive meningococcal disease infections in the authors’ countries, Greece and Italy. These time series were obtained from the Surveillance Atlas of the European Center of Disease Control (https://ecdc.europa.eu/en/surveillance-atlas-infectious-diseases). The methods discussed in this paper have been implemented in a R (R Core Team, 2018) package called lacm (‘latent autoregressive count models’). Supplementary materials include additional simulation results and the code for replicating the analyses in Section 5 using our package lacm.
2 Latent autoregressive models for counts
Let be an observed time series of length . Latent autoregressive models are specified around an unobserved autoregressive Gaussian model
with and . Conditionally on the unobserved , the observed counts
are assumed to be independent Poisson random variables with conditional expectation
is a vector of covariates andthe corresponding vector of regression coefficients. The latent model (2) assumes that the linear predictor is imperfect and the role of the ‘error term’ is to capture the missing information. The inclusion of the latent variable in the linear predictor has the double effect of inducing serial correlation and overdispersion, a feature frequently observed in time series of disease counts (e.g., Imai et al., 2015). In fact, the marginal mean and variance of are
where is the stationary variance of the latent process Differently from linear models, the autocorrelation function
depends on the marginal moments and therefore on the regressors.
Likelihood inference for latent autoregressive models requires to approximate the -fold integral
where is the parameter vector.
Alternatively, the likelihood can be expressed as a series of nested one-dimensional integrals using the so-called filtered algorithm described, for example, in Cagnone and Bartolucci (2017). The likelihood is expressed as the product of the predictive densities, that is
The conditional density of given is computed as
The filtering algorithm proceeds through recursive evaluation of integrals (6) and (7) using the updating formula (8). The drawback of the filtering algorithm is the propagation of the numerical error through the nested integrals. This effect is implicitly observed in Cagnone and Bartolucci (2017). They report that approximation of the integrals (6) and (7) by Gauss-Hermite integration is inexpensive but inaccurate and fails to converge when is moderate to large, that is when the propagation effect is more evident. In contrary, the more expensive but also more accurate adaptive Gauss-Hermite integration works well for all values of
3 Pairwise fitting of latent autoregressive models
3.1 The pairwise likelihood of order d
The pairwise log-likelihood of order is defined as the weighted sum of the log bivariate densities of pairs of observations that are distant apart up to lag ,
The pairwise likelihood offers a significant reduction of the computational cost related to the ordinary likelihood by replacing the high-dimensional integral (5) with double integrals. Previous papers about pairwise likelihood inference for time series models (e.g., Varin and Vidoni, 2009; Ng et al., 2011) adopted rectangular weights of type
In order to evaluate the sensitivity of the pairwise likelihood to different choices of the weights, we consider non-rectangular weights constructed from kernel functions that gradually downweight the contributions from pairs of observations that are far apart in time,
where is a kernel function with bounded support, that is for . Some examples of kernels weights, later considered in the simulation studies of Section 4, are
where is the indicator function of the event A, that is is equal to one if and zero otherwise. See Figure 1 for an illustration of the above kernel weights.
– Figure 1 about here –
3.2 Inference and model selection
The maximum pairwise likelihood estimator of order is denoted as and it is the solution of the pairwise score equations
where are the individual pairwise scores
As diverges, the pairwise likelihood involves an increasing number of pairs of independent observations that do not contain any information about the dependence parameter that therefore cannot be consistently estimated. Thereafter, we consider fixed and study the limit behaviour of the pairwise likelihood estimator as diverges.
The limit distribution of is derived through a direct extension of the results contained in Davis and Yau (2011) for the pairwise likelihood of order one. The Taylor series expansion of the pairwise score equations (11) around the true value gives
Since is an erogidic sequence, then
Since the sequence of the individual pairwise scores is a stationary and strongly mixing sequence, then is asymptotically normal with zero mean and covariance matrix
Matrices and are referred as the sensitivity and variability matrices, respectively (Varin et al., 2011). The maximum pairwise likelihood estimator of order
has therefore a limit normal distribution with meanand asymptotic variance equal to the inverse of the Godambe information
As noted by Davis and Dunsmuir (2016), the consistency and asymptotic normality of maximum pairwise likelihood estimation gives a potential advantage over standard maximum likelihood estimation for the latent autoregressive model whose limit properties have not yet been fully argued.
The sensitivity matrix can be consistently estimated by the observed pairwise likelihood information
However, numerical approximation of the second-order derivatives is computationally demanding and potentially unstable. For this reason, we prefer an alternative estimator of that uses only first-order derivatives. The alternative estimator is derived from the second-order Bartlett identity that holds for each specific pair of observations and allows to rewrite the sensitivity matrix as
The above expression invites to consider the outer-product estimator
The variability matrix is consistently estimated by the weighted average
The weights correspond to the popular Bartlett kernel. Other types of kernels like quadratic spectral, Parzen or truncated might also be used similarly to the heteroskedasticity and autocorrelation consistent (HAC) estimators developed in the econometric literature (e.g., Newey and West, 1994). The finite-sample performance of can be improved by removing the central bias through replacement of the pairwise scores with centered versions where is the sample average of the individual pairwise scores for a given .
The choice of the window semi-length is crucial for consistency of . Parameter must grow with both the sample size and with the pairwise likelihood order because the larger the order the higher the persistence in the sequence of the individual pairwise scores . By analogy with the econometric literature on heteroskedasticity and autocorrelation consistent standard errors (Newey and West, 1994), we consider values of proportional to the cubic root of the number of terms in the pairwise likelihood,
where is a constant that depends on the model parameter and denotes the integer-part of . A precise identification of the constant seems difficult. Since sandwich standard errors tend to underestimate the estimation uncertainty, we consider a limited set of values for and select the one giving the most conservative standard errors for the components of The optimal is obtained by maximizing some relative measure of the variability because the variances of the components of are not in general comparable. For example, we consider the mean relative variance (MRV) defined as the mean of the ratios of the asymptotic variances computed for a given with respect to the asymptotic variances computed with ,
where is the number of model parameters, denotes the -th element in the diagonal of the inverse of the Godambe information . The quantity corresponds to the asymptotic variance of the -th component of . Notation is used here to underline that the matrix in the Godambe information is computed with a specific constant . In our experience, the selection of is little sensitive to the pairwise likelihood order used in the computation of the mean relative variance and therefore we suggest to compute MRV() at .
Another form of the mean relative variance statistic is employed for selection of the pairwise likelihood order . In this case, we seek the value of that gives the higher efficiency through minimization of the mean relative variance computed relatively to the asymptotic variance of the maximum pairwise likelihood estimators of order one,
We use the same estimator of at both the denominator and the numerator of MRV() in order to reduce the impact of the sampling variability and focus on identification of the optimal order . The ratio terms in MRV() are computed with the same value of formerly selected with the maximization of MRV(). We refer to the applications in Section 5 for illustration of the selection process for the quantities and .
Model selection is performed with the composite likelihood information criterion (Varin et al., 2011) that assumes the form
Like the usual Akaike Information Criterion, models with lower values of CLIC are preferred.
3.3 Numerical evaluation of the pairwise likelihood
The simpler option for numerical evaluation of the bivariate integrals (10) is the double Gauss-Hermite quadrature that has been used for computing pairwise likelihoods of latent autoregressive time series models in Davis and Yau (2011). However, the latent variable literature reports various difficulties with Gauss-Hermite quadrature rules, see, for example, Bianconcini (2014). Rabe-Hesketh et al. (2005) point out that in random effects models with large cluster sizes or intraclass correlations, the poor performance of the Gauss-Hermite quadrature may be attributed to sharp peaks of the integrands located between adjacent quadrature points.
A popular alternative to the standard Gauss-Hermite quadrature is its adaptive version that adjusts the quadrature locations using the mode and the curvature of the posterior density of the latent variables given the observations to provide a more accurate approximation of the integral (Liu and Pierce, 1994; Bianconcini, 2014). Such an approach has been considered by Ng et al. (2011) for composite likelihood estimation of time series models with a latent autoregressive structure. The application of adaptive Gauss-Hermite quadrature to pairwise likelihood requires to (i) maximize all the joint complete pairwise densities with respect to the latent variables and (ii) compute the hessian matrix at the maximum for the various values of explored by the optimization algorithm. Although adaptive Gauss-Hermite quadrature has proved to give more precise and stable results using a significantly smaller number of nodes than non-adaptive Gauss-Hermite quadrature, evaluation of its benefits for pairwise likelihood inference should also consider the cost of the possible large number of two-dimensional optimizations required to apply it for pairwise likelihood inference.
Another form of adaptive cubature known as h-adaptive is described in Genz and Malik (1980) and Berntsen et al. (1991) and implemented in the R package cubature (Narasimhan et al., 2017). As for any adaptive integration method, the h-adaptive method recursively partitions the integration domain into smaller subdomains and the same integration rule is applied to each subdomain until convergence is achieved. Genz and Kass (1997) noted two essential features of subregion-adaptive integration methods. Firstly, the allocation of points at which the integrand is to be evaluated is performed dynamically so that they are concentrated into those parts of the integration region where the integrand is most irregular. Secondly, the integration rules used in each subregion require a substantially smaller number of function evaluations as compared to Gaussian production rules. Moreover, the function evaluations over each subregion may be efficiently re-used to determine whether the region should be further subdivided.
Subregion-adaptive integration methods are usually based on polynomial integrating basic rules that can provide rapid convergence once the subdivision is fine enough so that a low degree polynomial approximation can provide an accurate approximation to the integrand. Moreover, preconditioning the integrand by bringing the integration domain of the double integrals (10) to the square unit makes the transformed integral easier to compute.
One important drawback of adaptive integration rules is that the resulting integral is discontinuous with respect to the model parameter because the cubature nodes depend on itself. The lack of continuity makes unstable the evaluation of the numerical derivatives in the sensitivity and variability matrices and needed for quantification of the estimation uncertainty and for model selection with the CLIC statistic.
4 Simulation studies
We conducted a series of simulation experiments for assessing the finite sample properties of the pairwise likelihood estimators. Since estimation of the regression parameters does not pose particular problems, we simulate from the latent autoregressive model without covariates. We set the intercept and consider six combinations of values for and to illustrate the performance of the methodology under different levels of overdispersion and serial correlation. The values considered for and are and For each of the six settings, we generated 500 time series of size . Pairwise likelihoods are approximated with both Gauss-Hermite quadrature with 20 nodes and h-adaptive cubature with a maximal tolerance of per each integral. Maximization of the pairwise likelihood is performed with the downhill simplex method (Nelder and Mead, 1965) with a relative convergence tolerance of
Initial parameter estimates for the maximization of the pairwise likelihood of order one are obtained using the method-of-moments estimator similarly to Zeger (1988); see also Davis et al. (1999) and Ng et al. (2011). For higher orders (), we used as initial guess the final estimates of the preceding order. The order of the pairwise likelihood was assumed to range between one and ten and for each order we consider both rectangular weights and the various kernel weights listed in Section 3.1. As a benchmark, we consider the integrated nested Laplace approximation (INLA) approach (Rue et al., 2009) as implemented in the R (R Core Team, 2018) package INLA (www.r-inla.org).
Figures 2 and 3 display the ratios between the simulation standard errors of INLA and maximum pairwise likelihood estimators for various orders computed with both h-adaptive cubature and Gauss-Hermite quadrature. The plots indicate no appreciable difference between h-adaptive cubature and Gauss-Hermite when (Figure 2), while h-adaptive cubature slightly outperforms Gauss-Hermite quadrature when the latent autoregressive process is characterized by a small or moderate autocorrelation ( or ) and (Figure 3). Nevertheless, the improvement obtained with h-adaptive cubature remains modest and seems not to compensate the higher computational cost and the numerical instabilities in the computation of the numerical derivatives needed for estimation of the sensitivity and variability matrices.
For the rest of our comparisons we resort to Gauss-Hermite quadrature for the comparison of maximum pairwise likelihood with INLA. Tables SM1 and SM2 of the Supplementary Materials summarize bias, standard deviation and the root mean square error of the maximum pairwise likelihood and INLA estimates. Results indicate that maximum pairwise likelihood estimators have efficiency very close to Bayesian estimators based on INLA for all parameters, provided a reasonable choice for the order of the pairwise likelihood. As expected and illustrated in Figures2 and 3, the efficiency of the maximum pairwise likelihood estimators of the univariate parameters and is essentially not affected by the pairwise likelihood order. Instead, the efficiency of the estimator of the autoregressive parameter depends on the order . Small orders, most often order one, are optimal when the autocorrelation parameter takes low to moderate values ( or ). In presence of strong autocorrelation () and simulations indicate that the optimal order is about .
– Figure 2 about here –
– Figure 3 about here –
Figures 4 and 5 compare the efficiency of maximum pairwise likelihood estimators with rectangular and Epanechnikov kernel-weights. Figures SM1 and SM2 of the Supplementary Materials extend the comparison to the other kernel weights listed in Section 3.1. The plots indicate that the quality of the estimators is not sensitive to the choice of the kernel weight function. The result is perhaps expected because in the present context the optimal order remains relatively small even in presence of strong serial correlation of the latent process and the use of unequal weights has therefore no sensible impact.
– Figure 4 about here –
– Figure 5 about here –
5.1 Invasive meningococcal disease in Greece
Invasive meningococcus infection is a rare but severe disease with relatively high case fatality and up to one fifth of all survivors suffering from long-term sequelae (Rosenstein et al., 2001). The surveillance of invasive meningococcal disease in Europe is coordinated by the European Center of Disease Control (ECDC) with input of case-based data reports from national surveillance systems. Thereafter, we illustrate latent autoregressive models in surveillance using data on the monthly number of meningococcal disease cases in Greece for the years 1999-2016. The data source is the ECDC Surveillance Atlas. We employ the time series until 2015 for model fitting and then compute the predictions of invasive meningitis cases in 2016.
The plot of the monthly number of cases of invasive meningococcal disease in Greece in the upper panel of Figure 6 reveals annual seasonality and a decreasing time trend.
– Figure 6 about here –
We model the Greek invasive meningococcal disease time series with a latent autoregressive model with expected value , where the linear predictor is specified in a way to account for the annual seasonality and trend,
The trend term has been scaled by the number of observations (204 months) in way to report the model coefficients approximatively on the same scale.
Numerical integration for computation of the pairwise likelihood is performed through Gauss-Hermite quadrature with 20 nodes. Figure 7 displays the mean variance ratios used for selection of and . The left panel of Figure 7 shows that the mean relative variance MRV() reaches its maximum in correspondence of , while in the right panel we see that the mean relative variance MRV() is an increasing function of the pairwise likelihood order and thus suggests to select the pairwise likelihood of order one.
– Figure 7 about here –
The parameter estimates and the corresponding standard errors obtained with the latent autoregressive model fitted with the pairwise likelihood of order one are displayed in Table 1. For comparison purposes, we report also the results obtained with INLA. As expected, the results confirm significant seasonality and a discreasing trend of the invasive meningococcal disease cases in Greece. Maximum pairwise likelihood estimates and standard errors with Gauss-Hermite quadrature are in close agreement with results based on INLA. The CLIC statistic for the latent autoregressive model is 1660.1, a value somehow smaller than the reduced model fitted without autocorrelation obtained setting (CLIC=) and much smaller than the independence model corresponding to and (CLIC=).
|intercept||2.79 (0.11)||2.75 (0.11)|
|trend||-1.67 (0.21)||-1.60 (0.19)|
|sin term||0.45 (0.03)||0.46 (0.05)|
|cosine term||0.27 (0.04)||0.27 (0.05)|
|0.57 (0.17)||0.60 (0.14)|
|0.07 (0.02)||0.07 (0.02)|
The lower panel of Figure 6 displays the observed and predicted meningococcal disease cases in Greece together with the corresponding upper prediction bound. The in-sample predictions are computed with the best linear predictor
where and the expressions for the first- and second-order moments are given in formulas (3) and (4). Since in surveillance applications the interest lies on identification of disease outbreaks, we also compute approximate upper bounds of level as
where . We consider a non-parametric estimate of given by the empirical
-level quantile of the standardized prediction errors. The comparison of observations and in-sample predictions until the year 2015 in the lower panel of Figure 6 indicates a satisfactory model fitting and retrospectively identifies two outbreaks.
We also report in the lower panel of Figure 6 the predicted meningococcal disease cases and corresponding prediction upper bounds for the twelve months in 2016. These out-of-sample predictions are again approximated with the best linear predictor . The corresponding upper bounds of level are approximated with using the same quantile derived for the in-sample predictions. Out-of-sample predictions are close to the observed meningitis cases for the year 2016 except for April and December where the observed cases exceed the corresponding 95% upper prediction limit.
5.2 Invasive meningococcal disease in Italy
The upper panel of Figure 8 shows the time series plot of the monthly number of meningococcal disease cases in Italy for the years 1999-2016.
– Figure 8 about here –
The main feature of the Italian series is a level shift corresponding to a sensible reduction of the monthly number of cases after March 2005. This reduction can be explained as a consequence of increasing vaccination coverage due to the 2005–2007 National Italian Vaccine Plan in which the conjugate MenC vaccination was recommended to Italian citizens (Stefanelli et al., 2009; Pascucci et al., 2014).
We consider the latent autoregressive model with linear predictor
where the binary variabletakes values 1 and 0 for observations before and after March 2005 (i.e., the 75th observation in the dataset), respectively.
Our criteria for the selection of the constant and identification of the order of the pairwise likelihood suggest and (see Figure 9).
– Figure 9 about here –
Table 2 summarizes the parameter estimates and standard errors obtained by maximization of the pairwise likelihood of order four. Again maximum pairwise likelihood estimates are in close agreement with INLA and confirm the significant level shift in the invasive meningitis cases after March 2015. The CLIC statistics support the latent autoregressive model (CLIC) compared to the model without autocorrelation (CLIC) and the standard generalized linear model (CLIC).
|intercept||2.54 (0.04)||2.55 (0.06)|
|level shift||0.37 (0.07)||0.36 (0.09)|
|sine term||0.47 (0.03)||0.46 (0.04)|
|cosine term||0.27 (0.04)||0.26 (0.04)|
|0.69 (0.07)||0.74 (0.11)|
|0.04 (0.01)||0.04 (0.01)|
The observed and predicted numbers of meningococcal disease cases in Italy together with the corresponding upper bound intervals are illustrated in the lower panel of Figure 8. Like for the Greek data, predictions were computed with the best linear predictor. The in-sample predictions and prediction intervals illustrated in the plot suggest that the latent autoregressive model fits adequately the observed data and retrospectively identifies one disease oubreak corresponding to March 2005. The comparison of the out-of-sample predictions with the observed disease counts for the year 2016 reveals that for most of the months the observed number of cases is very close to the corresponding upper prediction limit. This finding is not surprising given the unusual large number of invasive meningococcal disease cases diagnosed in the Tuscany region between 2015 and 2016 (Stefanelli et al., 2016).
This paper discussed several methodological and implementation aspects of pairwise likelihood inference for latent autoregressive models for counts. We considered a pairwise likelihood of order and investigated the effect of the pairwise likelihood order on the efficiency of the maximum pairwise likelihood estimates. Empirical results indicated that small orders are optimal when the autocorrelation parameter takes low to moderate values while higher orders should be considered in presence of strong autocorrelation. Our results indicate that the performance of pairwise likelihood in latent autoregressive count models is not sensitive to the use of weights that downweight the contribution of pairs of observations that are far apart in time Nevertheless, it is of our interest to explore the effectiveness of kernel weights in spatial and spatio-temporal settings where there is already some evidence that non-rectangular weights improve the efficiency of maximum pairwise likelihood estimators (Bevilacqua et al., 2012).
Regarding the role of the numerical methods used for approximating the double integrals involved in the pairwise likelihood function, we found that when the latent process is characterized by a small or moderate autocorrelation, h-adaptive cubature may slightly outperform the Gauss-Hermite quadrature. However, adaptive cubature methods have the drawback that the resulting integral is discontinuous entailing instability in the evaluation of the estimation uncertainty.
As the amount of quality surveillance dataset increases, there is also a growing interest towards joint modelling of multiple diseases in spatio-temporal settings (Baker et al., 2017). The reduction of the computational cost obtained with composite likelihood methods is particularly attractive for fitting spatio-temporal extensions of the latent autoregressive model for multiple diseases and will be investigated in future work.
This project has received funding from the European Union’s Horizon 2020 research and innovation programme under the Marie Skłodowska–Curie grant agreement no. 699980.
- Baker et al. (2017) Baker, J., White, N., Mengersen, K., Rolfe, M. and Morgan, G.G. (2017). Joint modelling of potentially avoidable hospitalisation for five diseases accounting for spatiotemporal effects: A case study in New South Wales, Australia. PLoS ONE, 12(8): e0183653.
- Berntsen et al. (1991) Berntsen, J., Espelid, T.O. and Genz, A. (1991). An adaptive algorithm for the approximate calculation of multiple integrals. ACM Transactions on Mathematical Software 17, 437–451.
- Bevilacqua et al. (2012) Bevilacqua, M., Gaetan, G., Mateu, J. and Porcu, E. (2012). Estimating space and space-time covariance functions for large data sets: a weighted composite likelihood approach. Journal of the American Statistical Association 107, 268–280.
- Bianconcini (2014) Bianconcini, S. (2014). Asymptotic properties of adaptive maximum likelihood estimators in latent variable models. Bernoulli 20, 1507–1531.
- Cagnone and Bartolucci (2017) Cagnone, S. and Bartolucci, F. (2017). Adaptive quadrature for maximum likelihood estimation of a class of dynamic latent variable models. Computational Economics 49, 599–622.
- Cox (1981) Cox, D.R. (1981). Statistical analysis of time series: some recent developments. Scandinavian Journal of Statistics 8, 93–115.
- Davis and Dunsmuir (2016) Davis, R.A. and Dunsmuir W.T.M. (2016). State Space Models for Count Time Series. In Handbook of discrete-valued time series. Davis et al. (eds). CRC Press.
- Davis et al. (1999) Davis, R.A., Dunsmuir, W.T.M. and Wang, Y. (1999). Modelling Time Series of Count Data. Asymptotics, Nonparametrics, and Time Series (Subir Ghosh, editor) Marcel-Dekker, New York, 63–114.