In an ergodic Markov chain with states, first passage time is defined as the time to reach a particular set of states for the first time, starting from a state distribution . The first passage time is important, for instance, in Markov Population Models to calculate the longevity of individuals as well as other relevant life history traits (LHTs), like generation time and the basic reproductive number (Caswell, 2009).
In very few real life situations, the parameters of a Markov chain are known exactly, and when this occurs, is due mostly by model assumptions. For instance, consider a Markov chain with states corresponding to the outcome when rolling a dice, thus
and we assume the transition probabilities between statesand as by model, but most of the time, the parameters of the Markov Chain have inherent error, which is closely related to the sample size.
We can use , the fraction of transitions from state to out of as an estimate of , (Guttorp, 1995). We provide a simple methodology based in Monte Carlo simulations that incorporates sample size in our estimation of the first passage time in Markov chains.
To calculate the fist passage time from an initial state to a set of states , we can transform the Markov chain so that every state in is absorbing and then analyze the time to extinction. Assume a Markov chain has transient and absorbing states that is decomposed as:
where is a matrix containing the transitions between the transient states, is a matrix containing the transitions between transient states to absorbing states, is a matrix of zeros and
is the identity matrix of size. Then, starting from the initial distribution , the expected value and variance of the time to extinction of the process, , are:
If is unknown, its parameters are estimated through observation or experimentation, giving rise to a matrix , some random realization of matrix , thus, in the most common scenario, the moments in (1) are not calculated using but instead, with a particular realization .
Let be an estimate of the fundamental matrix . The Variance of the passage time can be written by conditioning on :
Row of has parameters that were estimated with , the observed fraction of transitions from state to each state. A sample of size from follows a multinomial distribution with parameters and , with expected value and variance-covariance matrix :
For details see Guttorp (1995), Theorem 2.16, p.65
We can sample from the multinomial distribution for each row of , that is, simulate and use it to calculate and thus the moments in (1). The average and variance over the simulations can be plugged in (2) to provide an estimate of the variance of the passage time.
We can thus summarize the Monte Carlo estimation as follows:
For every row in , simulate a vector of transitions from a multinomial distribution with parameters and , where is the total number of transitions observed from state to all other states. Let
Use the samples of the previous step to construct a matix .
Use the matrix to calculate and with (1) calculate and . Record these and as the outcome of the -th simulation.
Repeat from step 1.
The average of the ’s is used to estimate , while the variance of the ’s estimates . Adding these two yields an estimate of the variance of the Longevity, as given in (2).
Biologists use Markov chains to model the transition of individuals through different developmental stages. Death is an absorbing state. Due to the use of the recurrence
to express the change in the stage-specific composition of the population at two subsequent discrete generations, it is customary among biologists to express Markov chains using its transpose, which is called a Matrix Population Model, MPM, for details, see Caswell (2009), Hernandez-Suarez et al. (2019).
An example of transitions between four stages is depicted in Figure (1). In a given state , an individual can move to the next stage with probability (graduation), stay one more unit of time in that state with probability or die with probability .
In Hernandez-Suarez et al. (2019), a data set for the life cycle of the kissing bug Eratyrus mucronatus (Reduviidae, Triatominae) was introduced. There are seven stages: egg, nymph states 1-5 and the adult stage. Table (1) shows the estimates of each of the parameters and the sample size to estimate the parameters of each stage. For details in the estimation method, see Hernandez-Suarez et al. (2019).
|(*) Total number of transitions observed per stage.|
The estimated matrix for the (transposed) Markov Chain is:
3.1 Monte Carlo estimation
For every stage , the parameter space has expected value and variance-covariance matrix :
The difference between the calculated expected values and standard deviation of the longevity using (1) vs. those obtained using Monte Carlo methods are negligible, which is mainly due to the relatively large sample sizes for every stage in the study (see Table 1). In experimental biology, sample sizes may be large for most plants and invertebrates, but is more difficult for vertebrates or even mammals, where follow up is expensive or extremely difficult. Figure 2 shows that when the sample size was reduced to a tenth, the difference between both calculations becomes noticeable: there is an increase of in the standard deviation between both Monte Carlo simulations.
It must be noticed that in the simulations with sample size reduced, we only reduced the value of in eq. (3) but kept the same estimates of , and shown in Table 1. Since these values were achieved with the original sample size, we expect that the increase in variance caused by a reduction in sample size would be larger than the indicated in Fig. 2.
|Using equation (1)|
- Caswell (2009) Caswell, H. (2009) Stage, age and individual stochasticity in demography. Oikos, pp. 1763–1782.
- Guttorp (1995) Guttorp, P. (1995) Stochastic modeling of scientific data. Stochastic modeling. Chapman & Hall, London; New York, 1st edition.
- Hernandez-Suarez et al. (2019) Hernandez-Suarez, C., Medone, P., Castillo-Chavez, C. & Rabinovich, J. (2019) Building matrix population models when individuals are non-identifiable. Journal of theoretical biology, 460, 13–17.
- Iosifescu (1980) Iosifescu, M. (1980) Finite Markov processes and their applications. Dover Publications Inc., New York, USA.
5 Supplementary material
Python code to estimate mean and variance of transition time using multinomial distribution. (Performs simulations in about seconds with a GHz Intel Core i processor)
#!/usr/bin/env python3.6 # -*- coding: utf-8 -*- import numpy as np #================ BEGIN DECLARATIONS ================================== #Declaring number of simulations: nsim = 1000 #Declaring sample size for every state: n=[676,669,390,466,465,912,2570] #Declaring matrix U: U=[[478/676,Ψ139/676,Ψ0,Ψ0,Ψ0,Ψ0,Ψ0], [0,Ψ528/669,Ψ89/669,Ψ0,Ψ0,Ψ0,Ψ0], [0,Ψ0,Ψ301/390,Ψ74/390,Ψ0,Ψ0,Ψ0], [0,Ψ0,Ψ0,Ψ392/466,Ψ60/466,Ψ0,Ψ0], [0,Ψ0,Ψ0,Ψ0,Ψ405/465,Ψ59/465,Ψ0], [0,Ψ0,Ψ0,Ψ0,Ψ0,Ψ853/912,Ψ55/912], [0,Ψ0,Ψ0,Ψ0,Ψ0,Ψ0,Ψ2515/2570] ] #Declaring initial distribution: (start in first stage in this example) v= np.array([np.zeros(len(U))]) v = 1 #================ END DECLARATIONS ================================== def simul(U,v,u,I): N = np.linalg.inv(I-U) p1 = np.matmul(v,N) mu = np.matmul(p1,u).tolist() mu=mu p1 = np.matmul(v,N) p2 = np.matmul(2*N-I,u) p3 = np.matmul(p1,p2) sigma = p3-mu*mu sigma = sigma return mu,sigma print(’ ’) print(’Sample sizes =>’,n) I = np.identity(len(U)) u= np.array([np.transpose(np.ones(len(U)))]) u=np.transpose(u) print(’ ’) print(’Matrix U:’) print(’ ’) for i in U: print(i) print(’sum:’,sum(i)) print(’ ’) MU =  SIGMA= count=0 for s in range(nsim): Uest =  row = 0 for i in U: p=list(i) p.append(1-sum(i)) x = (np.random.multinomial(n[row], p, size=1)/n[row]).tolist() Uest.append(x[0:-1]) row += 1 try: mu,sigma = simul(Uest,v,u,I) MU.append(mu) SIGMA.append(sigma) except: pass avsigma = np.mean(SIGMA) varmu = np.var(MU) EL = np.mean(MU) VL = avsigma + varmu print(’RESULTS:’) print(’---------------------------------’) print(’Expectation => ’,EL) print(’Variance => ’,VL) print(’Standar dev.=> ’,np.sqrt(VL)) print(’ ’) print(’Comparison: (no simulations, only expression (1) in draft ’) mu,sigma = simul(U,v,u,I) print(’---------------------------------’) print(’Expectation => ’,mu) print(’Variance => ’,sigma) print(’Standar dev.=> ’,np.sqrt(sigma))