Mean and variance of first passage time in Markov chains with unknown parameters

02/18/2019 ∙ by Carlos Hernandez-Suarez, et al. ∙ Apple Inc 0

There are known expressions to calculate the moments of the first passage time in Markov chains. Nevertheless, it is commonly forgotten that in most applications the parameters of the Markov chain are constructed using estimates based upon empirical data and in those cases the data sample size should play an important role in estimating the variance. Here we provide a Monte Carlo approach to estimate the first two moments of the passage time in this situation. We illustrate this method with an example using data from the biological field.



There are no comments yet.


page 1

page 2

page 3

page 4

This week in AI

Get the week's most popular data science and artificial intelligence research sent straight to your inbox every Saturday.

1 Introduction

In an ergodic Markov chain with states, first passage time is defined as the time to reach a particular set of states for the first time, starting from a state distribution . The first passage time is important, for instance, in Markov Population Models to calculate the longevity of individuals as well as other relevant life history traits (LHTs), like generation time and the basic reproductive number (Caswell, 2009).

In very few real life situations, the parameters of a Markov chain are known exactly, and when this occurs, is due mostly by model assumptions. For instance, consider a Markov chain with states corresponding to the outcome when rolling a dice, thus

and we assume the transition probabilities between states

and as by model, but most of the time, the parameters of the Markov Chain have inherent error, which is closely related to the sample size.

We can use , the fraction of transitions from state to out of as an estimate of , (Guttorp, 1995). We provide a simple methodology based in Monte Carlo simulations that incorporates sample size in our estimation of the first passage time in Markov chains.

2 Methodology

To calculate the fist passage time from an initial state to a set of states , we can transform the Markov chain so that every state in is absorbing and then analyze the time to extinction. Assume a Markov chain has transient and absorbing states that is decomposed as:

where is a matrix containing the transitions between the transient states, is a matrix containing the transitions between transient states to absorbing states, is a matrix of zeros and

is the identity matrix of size

. Then, starting from the initial distribution , the expected value and variance of the time to extinction of the process, , are:



is a column vector of 1’s and

is the fundamental matrix (Iosifescu, 1980).

If is unknown, its parameters are estimated through observation or experimentation, giving rise to a matrix , some random realization of matrix , thus, in the most common scenario, the moments in (1) are not calculated using but instead, with a particular realization .

Let be an estimate of the fundamental matrix . The Variance of the passage time can be written by conditioning on :


Row of has parameters that were estimated with , the observed fraction of transitions from state to each state. A sample of size from follows a multinomial distribution with parameters and , with expected value and variance-covariance matrix :

For details see Guttorp (1995), Theorem 2.16, p.65

We can sample from the multinomial distribution for each row of , that is, simulate and use it to calculate and thus the moments in (1). The average and variance over the simulations can be plugged in (2) to provide an estimate of the variance of the passage time.

We can thus summarize the Monte Carlo estimation as follows:

  1. For every row in , simulate a vector of transitions from a multinomial distribution with parameters and , where is the total number of transitions observed from state to all other states. Let

  2. Use the samples of the previous step to construct a matix .

  3. Use the matrix to calculate and with (1) calculate and . Record these and as the outcome of the -th simulation.

  4. Repeat from step 1.

The average of the ’s is used to estimate , while the variance of the ’s estimates . Adding these two yields an estimate of the variance of the Longevity, as given in (2).

3 Example

Biologists use Markov chains to model the transition of individuals through different developmental stages. Death is an absorbing state. Due to the use of the recurrence

to express the change in the stage-specific composition of the population at two subsequent discrete generations, it is customary among biologists to express Markov chains using its transpose, which is called a Matrix Population Model, MPM, for details, see Caswell (2009), Hernandez-Suarez et al. (2019).

An example of transitions between four stages is depicted in Figure (1). In a given state , an individual can move to the next stage with probability (graduation), stay one more unit of time in that state with probability or die with probability .

Figure 1: An example life cycle graph with five stages.

In Hernandez-Suarez et al. (2019), a data set for the life cycle of the kissing bug Eratyrus mucronatus (Reduviidae, Triatominae) was introduced. There are seven stages: egg, nymph states 1-5 and the adult stage. Table (1) shows the estimates of each of the parameters and the sample size to estimate the parameters of each stage. For details in the estimation method, see Hernandez-Suarez et al. (2019).

Egg 139/676 59/676 478/676 676
N1 89/669 52/669 528/669 669
N2 74/390 15/390 301/390 390
N3 60/466 14/466 392/466 466
N4 59/465 1/465 405/465 465
N5 55/912 4/912 853/912 912
Adult 0 55/2570 2515/2570 2570
(*) Total number of transitions observed per stage.
Table 1: Estimates of the stage-specific parameters of a matrix population model for the kissing bug Eratyrus mucronatus, from non-individualized cohorts reared in the laboratory at C and 65% relative humidity (see suppl. material in Hernandez-Suarez et al. (2019)).

The estimated matrix for the (transposed) Markov Chain is:

3.1 Monte Carlo estimation

For every stage , the parameter space has expected value and variance-covariance matrix :


Using this data, we performed simulations and obtained the sample mean and variance and the moments obtained using (1). These are shown in Table 2.

We analyzed the effect of the sample size on the variance of the longevity. To do this, we reduced the number of transitions (last column of Table 1) to a fraction of the original size, and use these for the simulation process. The effect of the reduction in sample size is shown in Figure 2.

Figure 2:

Effect of sample size on the standard deviation of the longevity. The percentage increase at different fractions of the original sample size are shown.

4 Discussion

The difference between the calculated expected values and standard deviation of the longevity using (1) vs. those obtained using Monte Carlo methods are negligible, which is mainly due to the relatively large sample sizes for every stage in the study (see Table 1). In experimental biology, sample sizes may be large for most plants and invertebrates, but is more difficult for vertebrates or even mammals, where follow up is expensive or extremely difficult. Figure 2 shows that when the sample size was reduced to a tenth, the difference between both calculations becomes noticeable: there is an increase of in the standard deviation between both Monte Carlo simulations.

It must be noticed that in the simulations with sample size reduced, we only reduced the value of in eq. (3) but kept the same estimates of , and shown in Table 1. Since these values were achieved with the original sample size, we expect that the increase in variance caused by a reduction in sample size would be larger than the indicated in Fig. 2.

Calculated Monte Carlo
Expected value 30.658 31.087
Standard deviation 44.943 46.023
Using equation (1)
Table 2: Comparison of expected value and standard deviation for the longevity (in weeks) of the kissing bug Eratyrus mucronatus. Results of simulations.


  • Caswell (2009) Caswell, H. (2009) Stage, age and individual stochasticity in demography. Oikos, pp. 1763–1782.
  • Guttorp (1995) Guttorp, P. (1995) Stochastic modeling of scientific data. Stochastic modeling. Chapman & Hall, London; New York, 1st edition.
  • Hernandez-Suarez et al. (2019) Hernandez-Suarez, C., Medone, P., Castillo-Chavez, C. & Rabinovich, J. (2019) Building matrix population models when individuals are non-identifiable. Journal of theoretical biology, 460, 13–17.
  • Iosifescu (1980) Iosifescu, M. (1980) Finite Markov processes and their applications. Dover Publications Inc., New York, USA.

5 Supplementary material

Python code to estimate mean and variance of transition time using multinomial distribution. (Performs simulations in about seconds with a GHz Intel Core i processor)

 #!/usr/bin/env python3.6
# -*- coding: utf-8 -*-
import numpy as np

#================ BEGIN DECLARATIONS ==================================

#Declaring number of simulations:

nsim = 1000

#Declaring sample size for every state:

#Declaring matrix U:

[0,Ψ0,Ψ0,Ψ0,Ψ0,Ψ0,Ψ2515/2570] ]

#Declaring initial distribution: (start in first stage in this example)

v= np.array([np.zeros(len(U))])
v[0][0] = 1

#================ END DECLARATIONS ==================================

def simul(U,v,u,I):
    N = np.linalg.inv(I-U)
    p1 = np.matmul(v,N)
    mu = np.matmul(p1,u).tolist()
    p1 = np.matmul(v,N)
    p2 = np.matmul(2*N-I,u)
    p3 = np.matmul(p1,p2)
    sigma = p3-mu*mu
    sigma = sigma[0][0]
    return mu,sigma
print(’ ’)
print(’Sample sizes =>’,n)
I = np.identity(len(U))
u= np.array([np.transpose(np.ones(len(U)))])
print(’ ’)
print(’Matrix U:’)
print(’ ’)
for i in U:
print(’ ’)
MU = []

for s in range(nsim):
    Uest = []
    row = 0
    for i in U:
        x = (np.random.multinomial(n[row], p, size=1)/n[row]).tolist()
        row += 1
        mu,sigma = simul(Uest,v,u,I)
avsigma = np.mean(SIGMA)
varmu = np.var(MU)
EL = np.mean(MU)
VL = avsigma + varmu
print(’Expectation => ’,EL)
print(’Variance    => ’,VL)
print(’Standar dev.=> ’,np.sqrt(VL))
print(’ ’)
print(’Comparison: (no simulations, only expression (1) in draft ’)
mu,sigma = simul(U,v,u,I)
print(’Expectation => ’,mu)
print(’Variance    => ’,sigma)
print(’Standar dev.=> ’,np.sqrt(sigma))