Stationary models and their properties are important in theoretical and applied problems in econometrics and in time series analysis. In particular, integer-valued Markov processes have drawn the attention of many theoretical and applied scientists. Part of the motivation for their study is to have practical models for time-evolving counts, i.e. without the need to rely on misspecified continuous state-space models such as ARMA time series models or diffusion processes.
There are, at least, two seemingly different approaches to build integer-valued Markov models for observations evolving in discrete or continuous time, namely the Markov chain approach (e.g. Karlin and Taylor, 1975) and the stochastic thinning or contraction technique. The latter is largely based on the concept of self-decomposability (e.g. Steutel and van Harn, 2003) or generalizations of it (e.g. Joe, 1996; Zhu and Joe, 2010a). Indeed, for count data time series, contributions such as McKenzie (1986, 1988), Al-Osh and Alzaid (1987), Alzaid and Al-Osh (1990), Du and Li (1991) and Al-Osh and Aly (1992) used the thinning operator to construct AR-type models with Poisson, negative binomial and geometric stationary distributions. A more general approach, i.e. for a wider choice of marginal distributions, is presented in the groundbreaking contribution by Joe (1996); see also Jørgensen and Song (1998). For up-to-date accounts on the topic we refer to McKenzie (2003) and Davis et al. (2016).
Much of the theory of stationary models, with arbitrary but given marginal distributions, has been developed for discrete-time Markov models, with the notable exception of the work set forth by Barndorff-Nielsen and Shephard (2001), where the idea of self-decomposability is used to represent Lévy-driven Ornstein–Uhlenbeck stochastic differential equations, thus opening a wide class of stationary continuous-time Markov processes. For continuous-time and discrete state-space models, the theory of Markov chains clearly offers an excellent and general modeling alternative. However, within such a framework, closed-form expressions for the transition probabilities are not always at hand, which complicates estimation and simulation procedures.
Indeed, from a statistical perspective, the availability of computable transition probabilities is always desirable: working only with stochastic equations, intensity matrices and/or expressions for Markov generators frequently leads to overparametrized models or numerical complications.
In this paper, we present a construction of stationary Markov chains with negative-binomial distributed marginals. The construction is valid in discrete and continuous time, and has the appealing feature of yielding a simple and closed form for the corresponding transition probabilities. We build our construction on the idea introduced by Pitt et al. (2002) and subsequently generalized to the continuous-time case by Mena and Walker (2009). Their approach is based on the distributional symmetry of reversibility and is general enough to encompass other approaches based on thinning, such as the models in Joe (1996). The idea can be recast as follows: given a desired marginal distribution, say with probability function $\pi(x)$, one defines the dependence in the model by introducing another, arbitrary, latent conditional probability $f(y\mid x)$. With these probabilities at hand, one computes the posterior $f(x\mid y)\propto \pi(x)\,f(y\mid x)$ and defines the one-step transition probability by
\[
p(x_{t+1}\mid x_{t}) = \sum_{y} f(x_{t+1}\mid y)\, f(y\mid x_{t}) = \mathbb{E}\left[f(x_{t+1}\mid Y)\right], \qquad (1)
\]
where the expectation is taken with respect to $f(y\mid x_t)$, i.e. the law of $Y$ conditioned on $X_t = x_t$. This approach resembles the Gibbs sampler method to construct reversible Markov chains and has a clear Bayesian flavor, i.e. the prior $\pi(x)$, the likelihood $f(y\mid x)$, the posterior $f(x\mid y)$ and the predictive (1). Loosely speaking, any predictive distribution constitutes a well-defined Markov kernel. As such, the method is very general, and for this reason it has been widely used to construct time series models of the AR or ARCH type, e.g. Pitt and Walker (2005); Mena and Walker (2005); Contreras-Cristán et al. (2009), to name a few.
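To fix ideas, the construction above can be sketched in a few lines of code. The snippet below is a minimal illustration, not the model of this paper: it uses a Poisson marginal with a binomial latent variable, a standard instance of the construction for which the posterior predictive is available in closed form; the names and parameter values are our own.

```python
import numpy as np

rng = np.random.default_rng(0)

def step(x, mu=5.0, rho=0.6):
    """One transition of the Gibbs-type construction: sample the latent
    y ~ f(y|x), then x' from the posterior f(x'|y).  With a Poisson(mu)
    marginal and a Binomial(x, rho) latent, the posterior predictive is
    y + Poisson(mu*(1-rho))."""
    y = rng.binomial(x, rho)                  # latent draw from f(y|x)
    return y + rng.poisson(mu * (1.0 - rho))  # draw from the predictive

# Simulate a chain started from the marginal; Poisson(mu) stays invariant.
x = rng.poisson(5.0)
path = []
for _ in range(200_000):
    x = step(x)
    path.append(x)
print(np.mean(path))  # close to the stationary mean mu = 5
```

By construction the chain is reversible with respect to the Poisson marginal, so the empirical mean and variance of a long simulated path both stay near $\mu$.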
There are various ways to generalize the above construction to the continuous-time case, e.g. by computing the n-step-ahead transition corresponding to (1) and embedding it in continuous time, or by finding the conditions such that the Chapman-Kolmogorov equations are also satisfied in continuous time. The latter approach is followed by Mena and Walker (2009) to represent well-known continuous-time models such as diffusion processes and Markov chains. The idea is to allow those parameters appearing in $f(y\mid x)$, and not in $\pi(x)$, to vary with time, and to find their functional form such that the Chapman-Kolmogorov equations are satisfied. While some appealing families of diffusion processes have been found to have representation (1) under the above extension to continuous time (see, e.g. Anzarut et al., 2018; Anzarut and Mena, 2018), it is not always easy to find analytical conditions that meet the Chapman-Kolmogorov equations. This latter observation has prevented the method from being generalized to the general class of continuous-time Markov processes. Here we find the conditions needed to apply this construction to the class of negative-binomial marginal distributions.
The model we find has a neat expression for the transition probabilities, and turns out to correspond to a well-known class of birth and death processes (see Feller (1950) and Karlin and Taylor (1975)) for which, to the best of our knowledge, a closed-form expression for the transition probabilities is not available elsewhere. Hence, this constitutes an appealing addition to the applied literature on such models.
The class of negative-binomial marginals is justified by its generality, i.e. it includes the geometric distribution as a particular case and the Poisson as a limiting case. Furthermore, as proven by Wolpert et al. (2011), all stationary time-reversible Markov processes, whose finite-dimensional marginal distributions are infinitely divisible, are branching processes with Poisson or negative binomial univariate marginal distributions. Not surprisingly, discrete- and continuous-time models with negative binomial marginals have been widely studied in the literature, e.g. Latour (1998); Joe (1996); Zhu and Joe (2003, 2010a, 2010b). However, with the exception of Zhu and Joe (2010b), where a numerical inversion technique is used to approximate the continuous-time transition density of a certain negative-binomial model, no closed-form expression has been given. In particular, we provide a closed-form expression for the transition probability of the Zhu and Joe (2010b) model. Furthermore, we highlight the convenience of having a simple expression for transition probabilities with a simulation study and a real data analysis. The latter applies our approach to a crime reports dataset studied by Gorgi (2018). We compare the performance, in terms of computational time, with the method of Zhu and Joe (2010b) when the transition is evaluated with numerical methods.
The structure of this document is as follows. In Section 2 we introduce the construction of stationary Markov chains with negative binomial distributed marginals. In Section 3 we apply our model to simulated data. Section 4 deals with real data; in particular, we model a real time series of crime reports in the city of Blacktown in Australia. Final points and conclusions are deferred to Section 5.
2 Negative Binomial Stationary Markov chain
Following the construction in Mena and Walker (2009), described above, we build a set of transition probabilities that drives a class of reversible Markov chains with negative-binomial invariant distribution. Hence, our objective is to construct a model with stationary negative-binomial distribution, NB$(r,p)$, i.e. with probability masses given by
\[
\pi(x) = \binom{x+r-1}{x}\, p^{r}\,(1-p)^{x}, \qquad x = 0, 1, 2, \dots
\]
Now, we introduce the dependence in the model by assuming $Y\mid X = x \sim \mathrm{Bin}(x, \rho)$. Marginalizing $X$, and after some elementary algebra, it follows that
\[
Y \sim \mathrm{NB}\!\left(r,\ \frac{p}{1-(1-p)(1-\rho)}\right),
\]
which then leads to the “posterior” distribution given by the shifted negative-binomial distribution $X\mid Y = y \sim y + \mathrm{NB}(r+y,\ c)$, with $c = 1-(1-p)(1-\rho)$, and corresponding conditional distribution
\[
f(x\mid y) = \binom{x+r-1}{x-y}\, c^{\,r+y}\,\big[(1-p)(1-\rho)\big]^{x-y}, \qquad x = y, y+1, \dots
\]
Then, one can build a reversible stochastic process whose one-step transition probabilities, $p(x_{t+1}\mid x_t)$, take the form
\[
p(x'\mid x) = \sum_{y=0}^{x\wedge x'} \binom{x}{y}\rho^{y}(1-\rho)^{x-y}\, \binom{x'+r-1}{x'-y}\, c^{\,r+y}\,\big[(1-p)(1-\rho)\big]^{x'-y}, \qquad (2)
\]
for $x, x' \in \{0, 1, 2, \dots\}$. As such, the resulting model is a well-defined discrete-time, integer-valued, autoregressive-type reversible model, for which the transition (2) leaves the NB$(r,p)$ distribution invariant. A stochastic equation (SE) of this model is given by
\[
X_{t+1} = \sum_{i=1}^{Y_t} (1+\xi_i) + \varepsilon_t, \qquad (3)
\]
where $(\xi_i)$ is a sequence of i.i.d. random variables with common NB$(1, c)$ distribution, $\varepsilon_t \sim \mathrm{NB}(r, c)$, and $Y_t\mid X_t$ has $\mathrm{Bin}(X_t, \rho)$ distribution. A similar expression to the SE (3) was derived in Zhu and Joe (2003) to characterize a reversible Markov process with negative binomial invariant distribution.
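As a sketch of how this discrete-time model can be simulated (assuming an NB$(r,p)$ marginal and a binomial thinning parameter $\rho$, in our notation; parameter values are arbitrary choices of ours), one can draw the latent binomial value and then the shifted negative-binomial innovation, which is equivalent to the branching form of the stochastic equation:

```python
import numpy as np

rng = np.random.default_rng(1)

r, p, rho = 3, 0.4, 0.5            # NB(r, p) marginal, dependence rho
c = 1.0 - (1.0 - p) * (1.0 - rho)  # success prob. of the innovation NB

def step(x):
    """One step of the chain: binomial survival of the current count
    plus a shifted negative-binomial innovation, X' | y ~ y + NB(r+y, c)."""
    y = rng.binomial(x, rho)                     # Y | X=x ~ Bin(x, rho)
    return y + rng.negative_binomial(r + y, c)   # posterior-predictive draw

x = rng.negative_binomial(r, p)  # start from the stationary law
path = []
for _ in range(200_000):
    x = step(x)
    path.append(x)

# The stationary NB(r, p) mean is r(1-p)/p = 4.5
print(np.mean(path))
```

A long simulated path keeps the NB$(r,p)$ moments, which is a quick empirical check that the transition leaves the marginal invariant.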
In order to generalize the above model to the continuous-time case, we allow the “dependence parameter” $\rho$ to depend on time, namely through a function $\rho_t$, and then find the conditions on such a function so that the corresponding transition density satisfies the Chapman-Kolmogorov equations. Such a task is simplified via the Laplace transform associated with the transition probability (2), i.e. $\mathbb{E}[e^{-s X_t}\mid X_0 = x]$, which is given by
\[
\mathbb{E}\left[e^{-s X_t}\mid X_0 = x\right] = \left(\frac{c_t}{1-(1-c_t)e^{-s}}\right)^{r} \left(1-\rho_t + \frac{\rho_t\, c_t\, e^{-s}}{1-(1-c_t)e^{-s}}\right)^{x},
\]
where $c_t = 1-(1-p)(1-\rho_t)$. Notice that here we have already made explicit the dependence on the time parameter.
Proposition 1. The transition probabilities (2) satisfy the Chapman-Kolmogorov equations if and only if
\[
\rho_t = e^{-\lambda t}, \qquad \lambda > 0.
\]
In such a case, $(X_t)_{t\geq 0}$ turns out to be a reversible Markov process with negative-binomial invariant measure.
Proof. First notice that, in terms of the Laplace transform, the Chapman-Kolmogorov equations corresponding to $(X_t)_{t\geq 0}$ are satisfied if and only if
\[
\mathbb{E}\left[e^{-s X_{t+u}}\mid X_0 = x\right] = \mathbb{E}\left[\,\mathbb{E}\left[e^{-s X_{t+u}}\mid X_u\right] \,\Big|\, X_0 = x\right], \qquad (5)
\]
where the time effect in the law of $Y$ enters homogeneously, i.e. through the function $\rho_t$ of the elapsed time only, as we are constructing a stationary process. Writing $A_t(s) = \left(\frac{c_t}{1-(1-c_t)e^{-s}}\right)^{r}$ and $B_t(s) = 1-\rho_t + \frac{\rho_t\, c_t\, e^{-s}}{1-(1-c_t)e^{-s}}$, so that $\mathbb{E}[e^{-sX_t}\mid X_0=x] = A_t(s)\,B_t(s)^{x}$, the right-hand side of (5) takes the form
\[
A_t(s)\, A_u(\sigma)\, B_u(\sigma)^{x}, \qquad e^{-\sigma} = B_t(s).
\]
On the other hand, the left-hand side of (5) is given by
\[
A_{t+u}(s)\, B_{t+u}(s)^{x}.
\]
Hence, in order for Chapman-Kolmogorov’s equations to be fulfilled, it is necessary that the following equalities hold,
\[
A_{t+u}(s) = A_t(s)\, A_u(\sigma), \qquad B_{t+u}(s) = B_u(\sigma).
\]
These equations are satisfied whenever
\[
\rho_{t+u} = \rho_t\, \rho_u,
\]
whose solution is given by $\rho_t = e^{-\lambda t}$ for $\lambda > 0$. Thus, $\rho_t \in (0,1)$ for $t > 0$, given that $\lambda > 0$. Therefore, for this $\rho_t$, the transition probabilities satisfy the Chapman-Kolmogorov equations. ∎
Hence, using Proposition 1, the transition probabilities simplify as
\[
p_t(x'\mid x) = \sum_{y=0}^{x\wedge x'} \binom{x}{y}\, e^{-\lambda t y}\left(1-e^{-\lambda t}\right)^{x-y} \binom{x'+r-1}{x'-y}\, c_t^{\,r+y}\, \big[(1-p)(1-e^{-\lambda t})\big]^{x'-y}, \qquad (10)
\]
with $c_t = 1-(1-p)(1-e^{-\lambda t})$.
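A quick numerical sanity check of the continuous-time transition, coded from the binomial-thinning representation with parameter values and notation of our own, verifies that it defines a proper transition kernel and leaves the NB$(r,p)$ law invariant:

```python
import numpy as np
from scipy.stats import binom, nbinom

r, p, lam = 3, 0.4, 0.7  # assumed parameter values

def trans(t, x, xs):
    """Continuous-time transition P(X_t = xs | X_0 = x):
    a finite sum over the latent binomial value y."""
    rho = np.exp(-lam * t)
    c = 1.0 - (1.0 - p) * (1.0 - rho)
    ys = np.arange(0, min(x, xs) + 1)
    return np.sum(binom.pmf(ys, x, rho) * nbinom.pmf(xs - ys, r + ys, c))

# Rows of the transition kernel sum to one ...
row = sum(trans(1.0, 4, j) for j in range(200))
print(round(row, 6))  # 1.0

# ... and the NB(r, p) law is invariant: pi P = pi
grid = np.arange(200)
pi = nbinom.pmf(grid, r, p)
pi_next = np.array([sum(pi[i] * trans(1.0, i, j) for i in grid)
                    for j in range(30)])
ok = np.allclose(pi_next, nbinom.pmf(np.arange(30), r, p), atol=1e-8)
print(ok)  # True
```

The state space is truncated at 200 for the check, which is harmless here since the neglected tail mass is astronomically small for these parameter values.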
Given the discrete state-space nature of the model we have just constructed, it is natural to expect a corresponding continuous-time Markov chain. In fact, $(X_t)_{t\geq 0}$ turns out to be the simple birth, death and immigration process, where the infinitesimal rates associated with (10) are given by
\[
q(n, n+1) = \lambda (1-p)(n+r), \qquad q(n, n-1) = \lambda n.
\]
Therefore, the process reduces to a simple birth, death and immigration process with birth rate $\lambda(1-p)$, death rate $\lambda$ and immigration rate $r\lambda(1-p)$ (cf. Kelly, 2011). It is worth noticing that, since $p \in (0,1)$, we have that $\lambda(1-p) < \lambda$, i.e. the birth rate is smaller than the death rate, which is consistent with the stationarity of the model. Alternatively, we can characterize the above process as the birth and death process with rates $\lambda_n = \lambda(1-p)(n+r)$ and $\mu_n = \lambda n$, also known as the linear growth process (cf. Karlin and Taylor, 1975). To the best of our knowledge, expression (10) is new in the literature.
Furthermore, generalizing the stochastic equation (3), we deduce that the birth, death and immigration process satisfies the following generalized branching operation
\[
X_{t+s} = \sum_{i=1}^{Y_t} (1+\xi_i) + \varepsilon, \qquad (11)
\]
where $Y_t \mid X_t \sim \mathrm{Bin}(X_t, e^{-\lambda s})$, the $\xi_i$’s are i.i.d. r.v.’s with common NB$(1, c_s)$ distribution and $\varepsilon$ has NB$(r, c_s)$ distribution, with $c_s = 1-(1-p)(1-e^{-\lambda s})$. That is to say, the model enjoys the branching property. Using a different approach, Zhu and Joe (2003) derived a different stochastic equation, from which no closed expression for the transition function is available. In Zhu and Joe (2010b), a numerical inversion technique of the corresponding characteristic function is used to approximate the transition probabilities. Hence, the stochastic equation (11), together with its closed-form transition probabilities, results in a very appealing alternative for simulation and estimation.
On the other hand, in light of the result by Wolpert et al. (2011), finding expression (10) completes the availability of closed-form transition probabilities for the class of continuous-time reversible Markov processes with non-negative integer values that can be defined via a thinning operation. Note also that such transitions turn out to be finite sums of positive terms, which is a very appealing property for simulation and estimation procedures.
Also, one notices that if $r = 1$, $(X_t)_{t\geq 0}$ has a geometric stationary distribution. In such a case, the rates are given by $\lambda_n = \lambda(1-p)(n+1)$ and $\mu_n = \lambda n$, which is a variation of the simple queue, $M/M/1$, model characterized as the birth and death process with constant rates $\lambda_n = \lambda(1-p)$ and $\mu_n = \lambda$.
3 Simulation Experiment
To evaluate the performance of our construction, we test the model with different simulated examples. Let us consider two datasets simulated from (11). For the first dataset we assume equally spaced data, while the second dataset features data generated at exponentially distributed times with a given intensity parameter. We consider two scenarios: the first consists of a single dataset of observations; the second consists of several independent datasets. In both cases, we run single experiments taking into consideration an initial portion of the observations, a larger portion, and the full dataset. We fix different values of the parameters of interest; in particular, we are interested in the estimation of the parameters of the negative binomial distribution, namely the probability of success in each experiment, $p$, and the number of failures until the experiment is stopped, $r$. In addition to these two parameters, we need to estimate the dependency function $\rho_t = e^{-\lambda t}$, which involves the parameter $\lambda$ and the time $t$.
A maximum likelihood estimation (MLE) approach is adopted in both scenarios. We consider different values of the parameter $\lambda$. Regarding the negative binomial distribution, we consider different values of the probability of success in each experiment, $p$, and of the number of failures until the experiment is stopped, $r$.
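A sketch of such an MLE procedure, using the closed-form transition probabilities to build the likelihood, could look as follows; the parameter values, sample size, starting point and function names below are arbitrary choices of ours:

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import binom, nbinom

rng = np.random.default_rng(2)

def trans(t, x, xs, r, p, lam):
    """Closed-form transition probability: a finite sum over the latent y."""
    rho = np.exp(-lam * t)
    c = 1.0 - (1.0 - p) * (1.0 - rho)
    ys = np.arange(0, min(x, xs) + 1)
    return np.sum(binom.pmf(ys, x, rho) * nbinom.pmf(xs - ys, r + ys, c))

def simulate(n, r, p, lam, dt=1.0):
    """Simulate the chain at equally spaced times via latent thinning."""
    rho = np.exp(-lam * dt)
    c = 1.0 - (1.0 - p) * (1.0 - rho)
    out = [rng.negative_binomial(r, p)]  # start from the stationary law
    for _ in range(n - 1):
        y = rng.binomial(out[-1], rho)
        out.append(y + rng.negative_binomial(r + y, c))
    return np.array(out)

data = simulate(500, r=3, p=0.4, lam=0.7)

def negloglik(theta):
    r, p, lam = theta
    if r <= 0 or not 0 < p < 1 or lam <= 0:
        return np.inf
    ll = nbinom.logpmf(data[0], r, p)  # stationary initial distribution
    for a, b in zip(data[:-1], data[1:]):
        ll += np.log(trans(1.0, a, b, r, p, lam))
    return -ll

start = np.array([2.0, 0.5, 0.5])
fit = minimize(negloglik, start, method="Nelder-Mead",
               options={"maxiter": 300, "xatol": 1e-4, "fatol": 1e-4})
print(fit.x)  # maximum likelihood estimates of (r, p, lambda)
```

Because the transition is a finite sum of positive terms, each likelihood evaluation is exact up to floating point, with no numerical integration involved.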
In Table 1, we report the results for a single chain for different values of $r$, $p$ and $\lambda$, when the data are assumed to be equally spaced. Table 2 shows the results over the different datasets, again with equally spaced data; in particular, we report the means over the experiments, with standard deviations in brackets. The results show that using the full dataset improves the estimates, while working with only the initial part of the dataset leads to different results.
The same analysis has been conducted when the simulated data are generated at exponentially distributed times. In Table 3, we report the results over the different datasets, namely the means over the experiments and the standard deviations.
Therefore, the evidence suggests that the estimation method works correctly for the negative binomial Markov chain, and we can proceed to apply it to a real data experiment in the following section.
4 Application to Crime Data
This section is devoted to the study of the monthly number of offensive conduct incidents reported in the city of Blacktown, Australia, from January 1995 to December 2014. Following Gorgi (2018), we apply our time series approach to the New South Wales (NSW) dataset of police reports provided by the NSW Bureau of Crime Statistics and Research and currently available at http://www.bocsar.nsw.gov.au/.
The plot of the time series shows the presence of two peaks, related to high levels of criminal activity in 2002 and 2010. Looking at the data, the sample variance exceeds the sample mean. As suggested by Gorgi (2018), this indicates over-dispersion, and thus a negative binomial distribution may be a more suitable model for the data.
As in the simulated experiments, for the negative binomial Markov process we need to estimate the three parameters of interest, $(r, p, \lambda)$. In particular, we compute the maximum likelihood estimator over the full sample; the resulting estimates of $r$, the number of failures until the experiment is stopped, of $p$, the so-called probability of success in each experiment, and of the dependence parameter $\lambda$ are reported in Table 4. We compare our procedure with Joe’s transition probability computed as in Zhu and Joe (2010b). This procedure can be recast in two steps. The first step inverts the characteristic function,
\[
p_s(x'\mid x) = \frac{1}{2\pi}\int_{-\pi}^{\pi} \varphi_s(u\mid x)\, e^{-iux'}\, du, \qquad (12)
\]
where $\varphi_s(u\mid x)$ is the characteristic function of $X_{t+s}$ given $X_t = x$. In our case, the set of values assumed by the time series is $\{0, 1, 2, \dots\}$ and the characteristic function is
\[
\varphi_s(u\mid x) = \left(\frac{c_s}{1-(1-c_s)e^{iu}}\right)^{r}\left(1-e^{-\lambda s} + \frac{e^{-\lambda s}\, c_s\, e^{iu}}{1-(1-c_s)e^{iu}}\right)^{x}.
\]
Then (12) becomes a one-dimensional integral of a smooth periodic function. The second step consists of evaluating this integral numerically for every pair of consecutive observations entering the likelihood.
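The two-step inversion can be checked numerically against the closed-form finite sum. The sketch below, with parameter values and function names of our own, inverts the conditional characteristic function by quadrature and compares the result with the direct expression:

```python
import numpy as np
from scipy.integrate import quad
from scipy.stats import binom, nbinom

r, p, lam, t = 3, 0.4, 0.7, 1.0  # assumed parameter values
rho = np.exp(-lam * t)
c = 1.0 - (1.0 - p) * (1.0 - rho)

def cf(u, x):
    """Conditional characteristic function of X_t given X_0 = x."""
    z = np.exp(1j * u)
    g = c / (1.0 - (1.0 - c) * z)  # CF of a single NB(1, c) innovation
    return g**r * (1.0 - rho + rho * z * g)**x

def p_inv(x, xs):
    """Numerical Fourier inversion, as in the two-step procedure."""
    f = lambda u: (cf(u, x) * np.exp(-1j * u * xs)).real
    val, _ = quad(f, -np.pi, np.pi, limit=200)
    return val / (2.0 * np.pi)

def p_closed(x, xs):
    """Closed-form finite sum over the latent binomial value."""
    ys = np.arange(0, min(x, xs) + 1)
    return np.sum(binom.pmf(ys, x, rho) * nbinom.pmf(xs - ys, r + ys, c))

ok = abs(p_inv(4, 6) - p_closed(4, 6)) < 1e-7
print(ok)  # True
```

The two evaluations agree to quadrature precision, but the finite sum requires no numerical integration, which is where the computational advantage reported below comes from.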
In Table 4, we compare the estimated values of $(r, p, \lambda)$ and the computational times of our transition probability (OTB), Joe’s transition probability computed by numerical integration in Matlab (JNI), and Joe’s transition probability computed with an arbitrary precision integral (JPI). The results show that the three methods lead to comparable estimates, but our transition probability is considerably faster, taking about 17 seconds against roughly 43 minutes (for the numerical integration) and almost 5 hours (for the arbitrary precision integral). Due to this computational burden, in the forecasting exercise we only run our method.
| Method | Estimates | Time (in s) |
| OTB | (6.0865, 0.6031, 0.6848) | 17.65 |
| JNI | (6.0865, 0.6486, 0.7237) | 2575.60 |
| JPI | (6.5110, 0.4128, 0.9002) | 17085.02 |
Following Gorgi (2018), we perform a pseudo out-of-sample experiment to compare our results with the forecast performances in Gorgi (2018). We divide the time series into two subsamples: the first part of the observations forms the in-sample period and the remaining observations form the out-of-sample, or forecast evaluation, period. In particular, the in-sample window is expanded recursively.
The accuracy of the forecasting procedure is measured in terms of both point and density forecasting. Hence, we evaluate the point forecast accuracy by means of the mean square error (MSE),
\[
\mathrm{MSE}_h = \frac{1}{n_h}\sum_{t} \left(y_{t+h} - \hat{y}_{t+h\mid t}\right)^{2},
\]
where $\hat{y}_{t+h\mid t}$ denotes the $h$-step-ahead point forecast and $n_h$ the number of evaluation points. On the other hand, we measure the density forecasting accuracy by means of the log predictive score criterion,
\[
\mathrm{LS}_h = \frac{1}{n_h}\sum_{t} \log \hat{p}_{t+h\mid t}\left(y_{t+h}\right),
\]
where $\hat{p}_{t+h\mid t}$ is the $h$-step-ahead predictive probability mass function.
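For concreteness, both criteria can be computed as in the short sketch below; the toy numbers and the name `pred_pmfs` (standing for hypothetical predictive mass functions) are our own:

```python
import numpy as np

def mse(y_true, y_pred):
    """Point-forecast accuracy: mean squared error over the evaluation sample."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    return np.mean((y_true - y_pred) ** 2)

def log_score(y_true, pred_pmfs):
    """Density-forecast accuracy: average log predictive mass assigned
    to the realized counts (higher is better)."""
    return np.mean([np.log(pmf[y]) for y, pmf in zip(y_true, pred_pmfs)])

# Toy usage with hypothetical forecasts
y = [3, 5, 2]
point = [3.4, 4.1, 2.8]
pmfs = [{3: 0.2, 5: 0.1, 2: 0.3}, {5: 0.15}, {2: 0.25}]
print(mse(y, point), log_score(y, pmfs))
```

In the experiment of this section, the predictive mass functions would come from the closed-form transition evaluated at the estimated parameters.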
Table 5 shows the results in terms of both point and density forecasting at different forecasting horizons. Finally, we note that, compared to the different models proposed in Gorgi (2018), our model performs well, in particular relative to his GAS-NBINAR model. Overall, we can conclude that the negative-binomial Markov process can be useful in different applications for estimation and forecasting purposes.
| Mean square error | 15.461 | 17.627 | 19.794 | 21.138 |
| Log score criterion | -2.731 | -2.797 | -2.857 | -2.873 |
5 Conclusions

We gave a closed-form expression for the transition density of the negative binomial continuous-time Markov model. This model, usable in both discrete and continuous time, is of great interest when modeling integer-valued time series, and the construction we present yields much simpler simulation and estimation procedures. Furthermore, our approach links nicely with models from the continuous-time Markov chains literature, thus opening a gateway for applications in that area as well.
Fabrizio Leisen was supported by the European Community’s Seventh Framework Programme [FP7/2007-2013] under grant agreement no: 630677. The second author gratefully acknowledges the support of CONACYT project 241195 and a Visiting Scholar Fulbright Scholarship. The fourth author acknowledges financial support from the European Union Horizon 2020 research and innovation programme under the Marie Sklodowska-Curie grant agreement No 796902.
- Al-Osh and Aly (1992) Al-Osh, M. A. and Aly, E.-E. A. A. (1992). First order autoregressive time series with negative binomial and geometric marginals. Communications in Statistics - Theory and Methods, 21(9):2483–2492.
- Al-Osh and Alzaid (1987) Al-Osh, M. A. and Alzaid, A. A. (1987). First-order integer-valued autoregressive (INAR(1)) process. Journal of Time Series Analysis, 8(3):261–275.
- Alzaid and Al-Osh (1990) Alzaid, A. A. and Al-Osh, M. (1990). An integer-valued pth-order autoregressive structure (INAR(p)) process. Journal of Applied Probability, 27(2):314–324.
- Anzarut and Mena (2018) Anzarut, M. and Mena, R. (2018). A Harris process to model stochastic volatility. Econometrics and Statistics.
- Anzarut et al. (2018) Anzarut, M., Mena, R. H., Nava, C., and Prünster, I. (2018). Poisson Driven Stationary Markov Models. Journal of Business & Economic Statistics.
- Barndorff-Nielsen and Shephard (2001) Barndorff-Nielsen, O. E. and Shephard, N. (2001). Non-Gaussian Ornstein–Uhlenbeck-based models and some of their uses in financial economics. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 63(2):167–241.
- Contreras-Cristán et al. (2009) Contreras-Cristán, A., Mena, R. H., and Walker, S. G. (2009). On the construction of stationary ar(1) models via random distributions. Statistics, 43(3):227–240.
- Davis et al. (2016) Davis, R., Holan, S., Lund, R., and Ravishanker, R. (2016). Handbook of Discrete-Valued Time Series. CRC Press, Taylor and Francis Group.
- Du and Li (1991) Du, J. D. and Li, Y. (1991). The integer-valued autoregressive (INAR(p)) model. Journal of Time Series Analysis, 12(2):129–142.
- Feller (1950) Feller, W. (1950). An Introduction to Probability Theory and its Applications. Wiley.
- Gorgi (2018) Gorgi, P. (2018). Integer-valued autoregressive models with survival probability driven by a stochastic recurrence equation. Journal of Time Series Analysis, 39(2):150–171.
- Joe (1996) Joe, H. (1996). Time series models with univariate margins in the convolution-closed infinitely divisible class. Journal of Applied Probability, pages 664–677.
- Jørgensen and Song (1998) Jørgensen, B. and Song, P. X.-K. (1998). Stationary time series models with exponential dispersion model margins. Journal of Applied Probability, pages 78–92.
- Karlin and Taylor (1975) Karlin, S. and Taylor, H. (1975). A First Course in Stochastic Processes. Academic Press, New York.
- Latour (1998) Latour, A. (1998). Existence and stochastic structure of a non-negative integer-valued autoregressive process. J. Time Ser. Anal., pages 439–455.
- McKenzie (1986) McKenzie, E. (1986). Autoregressive moving-average processes with negative-binomial and geometric marginal distributions. Advances in Applied Probability, 18(3):679–705.
- McKenzie (1988) McKenzie, E. (1988). Some ARMA models for dependent sequences of Poisson counts. Advances in Applied Probability, 20(4):822–835.
- McKenzie (2003) McKenzie, E. (2003). Ch. 16. Discrete variate time series. Handbook of statistics, 21:573–606.
- Mena and Walker (2005) Mena, R. H. and Walker, S. G. (2005). Stationary autoregressive models via a Bayesian nonparametric approach. Journal of Time Series Analysis, 26(6):789–805.
- Mena and Walker (2009) Mena, R. H. and Walker, S. G. (2009). On a construction of Markov models in continuous time. Metron, 67(3):303–323.
- Pitt et al. (2002) Pitt, M. K., Chatfield, C., and Walker, S. G. (2002). Constructing first order stationary autoregressive models via latent processes. Scandinavian Journal of Statistics, 29(4):657–663.
- Pitt and Walker (2005) Pitt, M. K. and Walker, S. G. (2005). Constructing stationary time series models using auxiliary variables with applications. Journal of the American Statistical Association, 100(470):554–564.
- Steutel and van Harn (2003) Steutel, F. and van Harn, K. (2003). Infinite Divisibility of Probability Distributions on the Real Line. Chapman and Hall/CRC.
- Wolpert et al. (2011) Wolpert, R. L. and Brown, L. D. (2011). Markov infinitely-divisible stationary time-reversible integer-valued processes. Technical report.
- Zhu and Joe (2003) Zhu, R. and Joe, H. (2003). A new type of discrete self-decomposability and its application to continuous-time markov processes for modeling count data time series. Stochastic Models, (19):235–254.
- Zhu and Joe (2010a) Zhu, R. and Joe, H. (2010a). Count data time series models based on expectation thinning. Stochastic Models, (26):431–462.
- Zhu and Joe (2010b) Zhu, R. and Joe, H. (2010b). Negative binomial time series models based on expectation thinning operators. Journal of Statistical Planning and Inference, (140):1874–1888.