Molecular Computing for Markov Chains

by   Chuan Zhang, et al.

In this paper, it is presented a methodology for implementing arbitrarily constructed time-homogenous Markov chains with biochemical systems. Not only discrete but also continuous-time Markov chains are allowed to be computed. By employing chemical reaction networks (CRNs) as a programmable language, molecular concentrations serve to denote both input and output values. One reaction network is elaborately designed for each chain. The evolution of species' concentrations over time well matches the transient solutions of the target continuous-time Markov chain, while equilibrium concentrations can indicate the steady state probabilities. Additionally, second-order Markov chains are considered for implementation, with bimolecular reactions rather that unary ones. An original scheme is put forward to compile unimolecular systems to DNA strand displacement reactions for the sake of future physical implementations. Deterministic, stochastic and DNA simulations are provided to enhance correctness, validity and feasibility.



There are no comments yet.


page 1

page 2

page 3

page 4


Markov chain aggregation and its application to rule-based modelling

Rule-based modelling allows to represent molecular interactions in a com...

Algorithmic Randomness in Continuous-Time Markov Chains

In this paper we develop the elements of the theory of algorithmic rando...

Generalized Parallel Tempering on Bayesian Inverse Problems

In the current work we present two generalizations of the Parallel Tempe...

Variance Reduction with Array-RQMC for Tau-Leaping Simulation of Stochastic Biological and Chemical Reaction Networks

We explore the use of Array-RQMC, a randomized quasi-Monte Carlo method ...

Scaling up Continuous-Time Markov Chains Helps Resolve Underspecification

Modeling the time evolution of discrete sets of items (e.g., genetic mut...

Continuous-time Markov chain as a generic trait-based evolutionary model

More than ever, today we are left with the abundance of molecular data o...

Infinity Learning: Learning Markov Chains from Aggregate Steady-State Observations

We consider the task of learning a parametric Continuous Time Markov Cha...
This week in AI

Get the week's most popular data science and artificial intelligence research sent straight to your inbox every Saturday.

1 Introduction

By far, the exploitation and application of traditional computing equipment, such as silicon-based devices, has reached its peak. This urges the need of new material for possibly better computation performance or different application scenarios. Capable of exhibiting abundant dynamic behaviors, chemical reaction networks (CRNs) turn out to be a programmable language, prompting molecular scale material to become a highly promising candidate. As a parallel system in nature, CRNs possess the potential to handle large-scale and sophisticated computations. The past few decades have seen a groundswell of interest in molecular computing no matter concerning academy or industry 1, 2, 3, 4, with scientists trying to reveal the natural programmability of CRNs. A wealth of research is of primary interest in exploring the potential computational power of biological molecules by implementing digital logic, signal processing and functions 5, 6, 7, 8, 9, 10, 11, 12. Some other researchers are inclined to biochemically address computationally intractable and complex problems 13, 14, 15, 16. Even more remarkable works 17, 18, 19, 20, 21, 22, strongly dig out and prove the Turing-universal quality of chemical reaction networks.

Over the past five decades, these works 23, 24, 25 have been pursuing to building stochastic models for chemical kinetics, among which Markov chains play an important role. In the special field of DNA, Kannan26 utilizes Markov chains to provide statistical analysis of genome data. While stochastic processes that describe existing chemical systems have been systematically established, the inverse problem of computing stochastic networks by molecular reactions remains unsolved. Only a few people have considered this question in spite of so many extraordinary studies on molecular computing.

Apart from application in chemistry, Markov chains have been successfully applied to a wide range of areas such as digital communications, social networks, finance, and sports. Thus, our work anticipates a main focus on Markov chain related molecular computation. In fact, Cardon 16 and Salehi 15

have already challenged this topic: estimating the steady state distribution of any discrete-time Markov chain (DTMC) by DNA computing. Exactly belonging to the realm of molecular computing, such idea is greatly updated and innovative. In Cardon’s paper

16, DNA strands are used to represent Markov chains’ vertexes and edges directly, while in Salehi’s 15, hypothetical reactions are firstly designed. Besides the stationary behavior, the transient behavior—-step transition probabilities of DTMC, is well synthesized27. Unfortunately, none of the aforementioned approaches have made allowance for continuous-time Markov chains (CTMC) or higher-order Markov chains, of which our real life is a closer archetype.

Therefore, this paper attempts to tackle the issue from a more general standpoint. A straightforward and elegant way is proposed for designing CRNs with the functionality of computing not only DTMC but also CTMC and second-order Markov chains. Similar to Salehi15, each state is modeled by a unique molecular type. Instead of utilizing control molecules to regulate transitions as in paper15

, we model state transitions by various rate constants to reduce the number of needed molecular species and for convenience of DNA implementation. Hence, unimolecular reactions are designed for first-order Markov chains and bimolecular reactions serve to compute second-order ones. Different from electronic systems, molecular systems are usually designed with desired results indicated by concentrations as opposed to voltage. And as such, in our methodology, input and output values, which are a Markov chain’s initial distribution and steady state probabilities respectively, are both represented by molecular concentrations. Besides, from simulation results, transient solutions of continuous-time Markov chains can be creditably predicted by the evolution of various species’ concentrations over time. Both deterministic and stochastic simulations are provided to validate accuracy. Ordinary differential equations (ODEs) analysis is given for CTMC to prove infallibility on the theoretical level.

It should be noted that any chemical network in this paper is hypothetically shaped. With appropriate structure design, such an abstract set of reactions is said to be able to compute, or namely, simulate Markov processes. Nevertheless, some kind of physical substrate, such as DNAs or proteins, is required to emulate the system. In 2010, Soloveichik 28 constructed systems of DNA molecules that could closely approximate the dynamic behavior of arbitrary uni- or bimolecular chemical networks, which endowed this purely conjectural computing method with meaningfulness. In this paper, an original DNA method is proposed, inspired by Soloveichik, for implementing any unimolecular network with only one product in each reaction. Bimolecular networks for second-order Markov chains are ought to be compiled to DNA strand displacement reactions as designed in article 28.

Notations in this paper are listed below for clearer reference.


Symbol Definition Symbol Definition
number of molecules of molecular species , reaction rate constant,
number of molecules of molecular species reaction rate constant of the th reaction,
at time , reaction parameter,
numbers of molecules of each species, reaction parameter of the th reaction,
numbers of molecules of each species at time ,

the vector whose

th component is ,
initial concentration of molecular species , the vector whose th component is ,
concentration of molecular species at time , nonnegative integers,
initial concentration of each molecular species, the probability of event occurring,
concentration of each molecular species at time the information about the system that is
volume of the system, available at time .


Table 1: Notations in This Paper.

2 Preliminaries

Stochastic and deterministic models are two most common models for describing chemical reaction networks. Preliminaries are given below for preparing simulations and explaining the novelty of our work.

2.1 Deterministic Model Versus Stochastic Model

According to deterministic mass action kinetics 25, 29, 30, a set of ODEs are derived to determine the concentration of each molecular type at transient time in the system. Species concentrations are solutions to ODEs, thus are continuous, single-valued functions of time. This model is also named as ordinary differential equation model. Generally, consider a network of reactions involving chemical species, in Eq. (1), where are nonnegative integers.


ODEs in Eq. (2) are used to give the time evolution of the system. is the reaction rate constant of the th reaction. is the concentration of molecular species at time . in Eq. (2) is defined in Eq. (3). As soon as the ODEs are solved, the output of the chemical reaction system can be uniquely determined.


When it comes to stochastic models 31, 25, consider a very simple reaction: . Gillespie 31 points out the probability that it will occur somewhere inside in the next infinitesimal time interval is given by: , where is reaction parameter and . stands for the molecule number of at time and stands for that of . Similarly, Anderson 25 assumes the same probability, taking no account of the volume of the system , as in Eq. (4). is the condition of the available system information at time . In Anderson’s work, he also models the concentrations of a reaction network as complex random processes composed of Poisson processes.


With this inherent random property of chemical reaction system, Gillespie puts forward a simulation algorithm based on Monte Carlo techniques. Note that for the Gillespie algorithm to be applicable, the number of reactant species for each reaction cannot exceed three.

2.2 Comparison

According to Gillespie31, the mathematical relationship between and is that , which is self-evident. Kurtz 32 points out the relationship between the two models that in certain special cases and more complex systems, the deterministic model is the infinite volume limit of the stochastic one. This implies that the deterministic model is less accurate than the stochastic model when reactions occur in small compartments. The stochastic one takes account of fluctuations and correlations, providing better simulation for reality.

If expressed as a stochastic model as mentioned above, a chemical reaction network itself is a random process. When randomness is inherent to chemical reactions, the difference between building stochastic models for CRNs and molecular computation for stochastic problems needs illustrating in case of confusion. When building stochastic models, mathematical theories are utilized to analyze natural networks. In detail, the random variables are concentrations of each molecular type and there may be a multi-dimensional state space of different concentration values. When we solve stochastic problems using molecular reactions, these reactions are expected to express solutions in some way. For instance, in this paper, probability distributions are conveyed by concentrations. The random variables and state space depend on the particular case to be considered. The motivation as well as the Markov structure is entirely different. The comparison is summarized in Table



Aspects Stochastic Models for CRNs Molecular Computation for Stochastic Problems
Random Variables quantity of each molecule type events
State Space different molecule numbers different possible events
Simulation Model stochastic stochastic or deterministic


Table 2: Comparison between Building Stochastic Models for CRNs and Molecular Computation for Stochastic Problems.

3 Methodology

3.1 Discrete-Time Markov Chains

Several essential concepts regarding Markov chains 33 need to be specified in the first place as follows.

Definition 1

A given stochastic process at the consecutive points of observation constitutes a DTMC if the following relation, that is, the Markov property, holds for all and all :


In the homogeneous case, the transition probability from state to is independent of time and is defined as: .The transition matrix . Vector stands for the state probabilities at time . The initial probability vector is . As , the probability vector that converges is called steady state probability vector.

3.1.1 Example

Consider such a gambler’s ruin problem 34 referred as the probability of winning in an unfavorable game. Suppose that the probability that gambler will win one dollar on any given play is . Suppose also that the initial fortune of gambler is dollars and the initial fortune of gambler is just one dollar. We need to determine the probability that gambler wins one dollar from gambler before gambler wins dollars from gambler .

The required probability is given by Eq. (6) through mathematical analysis:


This problem is considered as a DTMC illustrated in Fig. 1. There are states , with indicating that gambler holds dollars. From the state transition diagram, states may jump to the previous or next state, while states and are absorbing. Transition probabilities are given by the probabilities of winning and losing. Utilizing this transition property, we try to model it by a chemical reaction network, with each molecular species representing each state as firstly proposed by Salehi 15. In that each reaction has a rate constant and this influences the probability of reaction occurring to some degree as in Eq. (4), we endeavour to harness this attribute and map Markov chain’s transition probabilities to reaction rate constants. Then the one-to-one correspondence between state transitions and reactions is established. The finished network is shown in Eq. (7).

Figure 1: The probability of winning in an unfavorable game.

To obtain the steady state distribution, the Markov chain needs to be assigned an initial distribution. As molecular concentrations are desired to be the system’s indicators, the concentration of each molecular species is initialized with the corresponding state’s initial probability. In this problem, gambler holds dollars in the beginning, therefore . Accordingly, . When well prepared, the system begins to react and all required to be done is waiting until it reaches an equilibrium state, helping processing the computation with the chemical potential energy. Then the final concentrations are the outputs: steady state probabilities. Simulations will be given.

Remark 1

As is known, any chemical reaction network itself is a Markov chain, thus it is easily misunderstood that the mapping is a self-existed conclusion. Nevertheless, constructing Markov chain model for a chemical system, the state space is usually determined by concentrations or molecule numbers instead of molecule species. For example, consider the reaction and suppose that initially there are two molecules in total and they can be either or . The transition is mutual but the overall molecule number remains two. Obviously, or is a Markov chain with state space . can also be a Markov chain with state space . In real DNA reactions, the concentration is always scaled to or , which means the molecular number is much larger than two. Therefore, the Markov chain model for Eq. (7) is apparently not that in Fig. 1. The delicately designed structure of this network, happens to be capable of modeling this chain’s computation when on a large molecular scale and giving a relatively deterministic result.

3.1.2 Design Concept

As the gambler’s ruin problem is explained in detail, the entire design approach gradually becomes clear and easy to understand. The conclusive framework for the design concept is depicted in Fig. 2. Given a target Markov chain, each state is modeled by a unique molecular type. Unimolecular reactions are constructed to implement state transitions, with one type of molecule changing into another. Input concentrations are initialized to activate the system and output concentrations provide expected stationary distribution.

Figure 2: Framework for the design concept.

The fact that jumps between two states may exist at the same time causes the derived reactions to be reversible. Given that there are states in all, the reactions to implement this target DTMC are shown in Eq. (8).


In summary, the complete method includes steps: Step 1) Model each state by a molecular type . Step 2) Model each transition probability by rate constant . Step 3) Model all state transitions by reactions . Step 4) Set the values of rate constants proportional to the corresponding transition probabilities . Step 5) Set the initial concentrations of molecular types according to probability . Step 6) If the steady state solution exists, compute the steady state solution of the DTMC by the final concentration of species .

Complexity Analysis: In our approach, control molecules are not required compared to Salehi’s 11, thus the cost of molecular types is saved. Since each state is represented by one unique molecular type, the number of molecular types equals the number of states . The number of reactions depends on the particular case. Specifically, if the transitions between two states exist at the same time, one reversible reaction functions to realize them. As one reaction can compute either one transition or two transitions, the number of reactions equals or is smaller than the number of all transitions . If there are no mutual transitions, all reactions are not reversible and the reaction number meets its maximum value .

3.2 Continuous-Time Markov Chains

After the DTMC is well synthesized, it is excitingly found that the approach can also be extended to implement CTMC. Some detailed mathematical descriptions are provided here for further formal analysis.

Definition 2

A given stochastic process constitutes a continuous-time Markov chain if for arbitrary , with , , and (state space of this chain), the following relation holds:


stands for the probability of state at any instant of time . Vector stands for the state probabilities at any instant of time . Unlike the discrete-time case, the state probabilities of a CTMC cannot be computed easily by transition probabilities. Therefore, we define the instantaneous transition rates () of the CTMC traveling from state to state . For all states with , we define If the limits do exist, it is clear that at any instant of time , the following reaction holds:


Eq. (10) can be modified as:


In the time-homogeneous case, with time-independent transition rates and the system of differential Eq. (12), we describe a CTMC:


The infinitesimal generator matrix . If existing for a given CTMC, the steady state probabilities are independent of time and we immediately get:

As specified by the mathematical definition, the only difference between CTMC and DTMC is that for any instant of time, CTMC has a probability distribution , which is called the transient solution. The state space is still discrete here. The proposed method will be able to compute not only the steady state distribution but also the transient solution of an arbitrary CTMC.

3.2.1 Example

Two common cases in queueing theory 33 are used to exemplify the computation of transient solution and steady state solution, respectively.

A Pure Birth Process: Consider the infinite state CTMC depicted in Fig. 3 representing a pure birth process with constant birth rate . The only possible transitions are from state to state with rate . Note that this is a nonirreducible Markov chain for any finite value of , so the steady state solution does not exist.

Figure 3: A pure birth process.

According to the transition graph, the only difference from DTMC is that the transition rate is no longer a value of probability, while the structure is analogous. Hence, the technique can continue to be used, with rate constants modeling transition rates as opposed to transition probabilities. Given that only finite states are feasible in CRNs, we implement this CTMC with a -state one, the first transient solutions of which are exactly the same as the pure birth process. The reactions are presented in Eq. (13). The setting of initial concentrations follows the same principle as DTMC. Here, the time evolution of concentrations ideally resembles the transient solutions. Please refer to simulation for details.


A Birth and Death Process: When it comes to steady state probabilities, another example serves better to verify our approach. A birth and death process is a Markov chain where transitions are allowed only between neighboring states. A one-dimensional birth-death process is shown in Fig. 4. In particular, a birth-death process with a constant birth rate (arrival rate) and a constant death rate (service rate) is called an Queue. This case is used to illustrate how to compute both transient and stationary solutions.

Figure 4: A birth-death process.

Similarly, only finite states are realizable in CRNs thus we could only get the approximate solutions. To implement such a CTMC, we have the reactions designed in Eq. (14).


3.2.2 Design Concept

As clarified by the two cases, it is only slightly different to implement a CTMC. The transition rates instead of probabilities are modeled by the rate constants. The designed reactions for implementing an arbitrary CTMC are shown in Eq. (15)


Such mapping can, to a great extent, model and implement the CTMC, computing not only the steady state probabilities but also the transient solutions. More specifically, the concentrations of the molecules at any instant time of are the same as the probability distribution of the CTMC at time . In addition, the final concentrations of are the steady state probabilities of the target CTMC. To sum up, the entire procedure contains steps: Step 1) Model each state by a molecular type . Step 2) Model each transition rate by reaction rate constant . Step 3) Model all state transitions by reactions when . Step 4) Set the values of rate constants proportional to the transition rates . Step 5) Set the initial concentrations of molecular types according to probability . Step 6) Compute the transient solution of the CTMC by the concentrations of at any instant of time . Step 7) If the steady state solution exists, compute the steady state solution of the CTMC by the equilibrium concentrations of . The complexity is the same as DTMC thus omitted here.

3.2.3 ODE Analysis

The correctness of our methodology could be predicted to a great extent by simulation results. However, beyond simulation, the congruence between the ODE model of our designed network and that of the corresponding CTMC can prove the validity in a mathematical way.

Proof 1

According to deterministic mass action, the ODEs of Eq. (15)’s network can be simply derived in Eq. (16).


According to Eq. (12), the differential system to describe the target Markov chain is:


Bringing Eq. (10) into Eq. (17), we have:


It can be found that the form of Eq. (16) and Eq. (18) mirrors each other, thus proving that the solutions of the designed CRN are the same as the transient solutions of the CTMC. Therefore, the time evolution of concentrations can well reflect the transient probabilities at any instant of time. In addition, we define the final concentration of a given molecular type as the concentration of it when verges to . And as such, the final concentrations of are the steady state probabilities of the target CTMC.

3.3 Two-Order Markov Chains

In the previous sections, the Markov processes’ transition probabilities depend only on the current state. Such chains are called first-order Markov chains. For the higher-order Markov chains, the transition probabilities depend on the current state and some previous states35. In point of fact, any n-order Markov chain can be expressed as a first-order chain with state space , where is the state space of the original chain. Consequently, higher-order Markov chains can be implemented by the approach specified above. Unfortunately, this would exponentially increase the complexity. This problem may be resolved by increasing the dimension of CRNs instead of state space. However, there would exist a trade off between complexity and accuracy. Here we make use of bimolecular reactions to implement second-order Markov chains, where the transition probabilities depend on the latest two states—the current state and the previous state as shown in Eq. (19).


3.3.1 Example

Higher-order Markov chains are usually used to predict weather because the future weather trend considerably depends on the previous records. Make allowance for such a simple model: tomorrow’s weather depends on today and yesterday. Transition probabilities are given in Eq. (20), where represent day, day, day and represent sunny and rainy.

The state space is , clearly. As shown, if the first day and the second day are both sunny, there is a chance that the third day is continuously sunny. If expressed as a first-order Markov chain, the state space will become and the state transition diagram can be derived as in Fig. 5.

Figure 5: Model for weather prediction.

Instead of modeling each node by one molecular type, we model each state by one molecular type just as we do previously, so only two types of molecule are needed in all. To map each transition into one chemical reaction, it is found from the diagram that either node before or after the transition contains two states, resulting in the reactions’ being bimolecular as shown in the right part of Fig. 5. For each reaction, reactants reflect the two previous states before transition and the two products are states after transition. Four transitions are drawn in dashed lines because the composition of states does not change, unable to form a new reaction. Finally, four irreversible reactions are derived and then simplified into two reversible reactions. The value assignment of rate constants and initial concentrations follows the same principle as in first-order chains. The equilibrium concentrations are expected to compute the stationary distribution and help predict the weather in the long run.

3.3.2 Design Concept

Utilizing a -state diagram, the framework for our design concept is summarized in Fig. 6. Each transition produces one corresponding reaction, which has two reactants and two products. The reactants and products share one common molecule. Such implementation encourages the transition into a new state based on the two previous states, with the justified probability. Invalid transitions are unable to add reactions to the network.

Figure 6: Framework for design concept of second-order Markov chains.

The final approach for second-order Markov chains is concluded by steps: Step 1) Model each state by a molecular type. Step 2) Model each transition probability by one rate constant. Step 3) Model each state transition by one bimolecular reaction. Step 4) Exclude invalid transitions. Step 5) Set the values of rate constants proportional to the transition probabilities. Step 6) Set the initial concentrations of molecular types according to initial probabilities. Step 7) If the steady state solution exists, compute the steady state solution of the second-order Markov chain by the equilibrium concentrations of the corresponding molecules.

Complexity Analysis: If there are states in all, molecular types are needed. The number of reactions equals or is smaller than the number of valid transitions . When all transitions are possible, reaches its maximal value . If the second-order chain is implemented by the approach of first-order chain, molecular types are required and ’s maximal value becomes . Hence, this approach is increasingly efficient as rises.

4 Simulation Results

In order to ensure the DNA implementation is applicable 28, two constraints should be satisfied: maximal second-order rate constants are about /M/s; maximum concentrations are on the order of M.

4.1 Deterministic Simulation

Since dynamics of a CRN endowed with mass action kinetics can be well demonstrated by ODEs, ODE-based simulation is usually a good solution to synthesize a CRN, offering a smooth output graph.

Gambler’s Ruin Problem: As designed above, the unscaled rate constants are and and the unscaled initial concentration is . To add feasibility, the scaled rates are chosen to be /s and /s here and the scaled concentration is M. The simulation result is shown in Fig. 7.

Figure 7: ODE simulation result for gambler’s ruin problem.

From the graph, the concentrations approximate the accurate result and along the time line, meaning the probability of gambler winning one dollar ends up . After hours, the error is less than .

Pure Birth Process: If initial probabilities and for , we can get a closed-form solution for each transient state probability . If the parameter is considered to be , the graph is easily obtained by means of MATLAB as depicted in Fig. 8.

Figure 8: Selected transient solutions for a pure birth process with .

The initial concentration of is unscaled and those of other molecules are . All the rate constants are unscaled when . The scaled concentration and rate are M and /s. Simulation result is illustrated in Fig. 9. Comparing the transient solution graphs of CTMC and the simulation graphs of CRNs, they resemble each other perfectly, realizing the desired functionality smoothly.

Figure 9: ODE simulation result for a pure birth process with .

Birth and Death Process: With the initial state probabilities and , the steady state probabilities of the system being empty can be obtained that . We specify this solution for in Table 3. Clearly from Table 3, the steady state probability of is , the steady state probability of is and the same is true of the rest of the states.



Table 3: The solution for in an Queue.

The initial concentration of is unscaled . The rate constants are unscaled and . The scaled concentration and rates should be M, /s and /s. The simulation result is shown in Fig. 10. Compared with Table 3, the steady state probabilities are computed correctly by observing the final molecular concentrations. The maximal error gradually becomes less along with time as shown in Table 4.

Figure 10: ODE simulation result for a birth-death process.




Table 4: Error for in an Queue.

Weather Prediction: If sunny is selected as the initial state, the initial concentration of is unscaled . According to Fig. 5, the unscaled rate constants are as defined by transition probabilities. For simulations, the scaled concentration becomes M and the scaled rates are /M/s, /M/s, /M/s, /M/s. The simulation graph Fig. 11 shows that the steady state distribution is as the equilibrium concentrations divided by the initial concentration are and . The error is at hour and at hour .

Figure 11: ODE simulation result for weather prediction.

4.2 Stochastic Simulation

Gillespie algorithm 31 is used for stochastic simulation in this paper. Different from ODE simulation, gillespie simulation gives fluctuating curves as opposed to smooth ones. The result is indeterminate thus may deviate from the expected value. Nevertheless, the error can be reduced as the concentration increases. All initial numbers of molecules are selected to be here for relatively accurate outputs. According to the simulation results in Fig. 12,13,14,15, concentrations vary above or below precise values in a limited range, effectively estimating the required outputs.

Figure 12: Gillespie simulation result for gambler’s ruin problem.
Figure 13: Gillespie simulation result for a pure birth process.
Figure 14: Gillespie simulation result for a birth-death process.
Figure 15: Gillespie simulation result for weather prediction.

5 DNA Implementation

The methodology for an abstract set of molecular reactions is designed above. The engineered biochemical systems need to be mapped to specific DNA reactions to obtain real meaningfulness. In 2010, Soloveichik 28 managed to contrive a DNA strategy for arbitrary hypothetical CRNs with satisfactory performance. In his method, each unimolecular reaction is compiled to two DNA strand displacement reactions and each bimolecular one is compiled to three. Buffering modules are additionally needed for bimolecular systems.

5.1 Unimolecular Networks

By carefully observing the network we design for first-order Markov chains, each reaction has one reactant and one product and the entire system contains only unimolecular reactions. Therefore, if Soloveichik’s method is directly used here, it will be wasteful of DNA resources. Borrowing some clever ideas employed by Soloveichik, we devise a new DNA method for this typical unimolecular system as in Fig. 16. Each formal species is modeled by a kind of DNA strand named signal species, with the species identifier defined by one toehold and one domain. Each reaction is implemented by one DNA strand displacement reaction, with one signal species reacting with the auxiliary species to produce another signal species. The initial concentration of auxiliary species is and it is required that . is the rate constant of the DNA reaction and it is controlled by the binding energy of domains and , as is not a full complement of . To ideally approximate the ODE kinetics, it should be satisfied that .

Figure 16: DNA module for unimolecular networks.

Utilizing a reaction network with three molecular species, a more specific mapping is shown in Fig. 17. Given that different reactions may produce the same hypothetical species, the history domain “” of each signal species is indeterminate. However, each signal species can be uniquely identified by the species identifier.

Figure 17: DNA implementation of first-order Markov chains.
Remark 2

By changing the length and sequence composition of a toehold domain , which is not a full complement of , the binding strength and in turn the rate constant can be varied. The rate constants can then be controlled over orders of magnitude 36, 37. However, toeholds are short and have limited sequences. Although distributed over a wide range, not all exact rate constants can be achieved this way. To tackle this problem, concentrations of auxiliary species can be adjusted to fine-tune rate constants 28.

DNA simulations for the three examples with first-order chains are illustrated in Fig. 18,19,20. Note that is set as M. DNA kinetics are drawn in dashed lines in contrast with ideal ODE kinetics. Compared to the ideal kinetic behaviors, those presented by DNAs are highly adequate.

Figure 18: DNA simulation result for gambler’s ruin problem.
Figure 19: DNA simulation result for a pure birth process.
Figure 20: DNA simulation result for a birth-death process.

5.2 Bimolecular Networks

Reactions employed to realize second-order Markov chains have two reactants, thus the DNA approach proposed above is no longer effective. Technique proposed by Soloveichik 28 is directly used here for simulation, where the species identifier is composed of one domain and two toeholds. The result of weather prediction is displayed in Fig. 21. Notice that the initial concentration in the DNA system is M in that the buffering-scaling factor .

Figure 21: DNA simulation result for weather prediction.

6 Conclusions

In this paper, conjectural chemical reaction networks are shaped for computation of arbitrary time-homogeneous Markov chains, including DTMC, CTMC and second-order Markov chains. Not only steady state probabilities but also transient solutions are well synthesized. An original DNA method is proposed for implementing any unimolecular network with only one product in each reaction. Deterministic, stochastic and DNA simulations are provided to enhance correctness, validity and feasibility.


  • Bennett 1982 Bennett, C. H. (1982) The thermodynamics of computation a review. International Journal of Theoretical Physics 21, 905–940.
  • Stemmer 1995 Stemmer, W. P. (1995) The evolution of molecular computation. Science 270, 1510–1511.
  • Păun and Rozenberg 2002 Păun, G., and Rozenberg, G. (2002) A guide to membrane computing. Theoretical Computer Science 287, 73–100.
  • Lund et al. 2010 Lund, K., Manzo, A. J., Dabby, N., Michelotti, N., Johnson-Buck, A., Nangreave, J., Taylor, S., Pei, R., Stojanovic, M. N., Walter, N. G., Winfree, E., and Yan, H. (2010) Molecular robots guided by prescriptive landscapes. Nature 465, 206–210.
  • Chen et al. 2014 Chen, H.-L., Doty, D., and Soloveichik, D. (2014) Deterministic function computation with chemical reaction networks. Natural Computing 13, 517–534.
  • Jiang et al. 2013 Jiang, H., Riedel, M. D., and Parhi, K. K. Digital logic with molecular reactions. Proc. IEEE/ACM International Conference on Computer-Aided Design (ICCAD). 2013; pp 721–727.
  • Jiang et al. 2011 Jiang, H., Riedel, M., and Parhi, K. Synchronous sequential computation with molecular reactions. Proceedings of the Design Automation Conference. 2011; pp 836–841.
  • Kharam et al. 2011 Kharam, A. P., Jiang, H., Riedel, M. D., and Parhi, K. Binary counting with chemical reactions. Proc. Pacific Symposium on Biocomputing. 2011; pp 302–313.
  • Salehi et al. 2014 Salehi, S. A., Riedel, M. D., and Parhi, K. K. Asynchronous discrete-time signal processing with molecular reactions. Proc. IEEE Asilomar Conference on Signals, Systems and Computers. 2014; pp 1767–1772.
  • Jiang et al. 2013 Jiang, H., Salehi, S. A., Riedel, M. D., and Parhi, K. K. (2013) Discrete-time signal processing with DNA. ACS Synthetic Biology 2, 245–254.
  • Salehi et al. 2015 Salehi, S. A., Jiang, H., Riedel, M. D., and Parhi, K. K. (2015) Molecular Sensing and Computing Systems. IEEE Transactions on Molecular, Biological and Multi-Scale Communications 1, 249–264.
  • Salehi et al. 2016 Salehi, S. A., Parhi, K. K., and Riedel, M. D. (2016) Chemical reaction networks for computing polynomials. ACS Synthetic Biology 6, 76–83.
  • Adleman 1994 Adleman, L. M. (1994) Molecular computation of solutions to combinatorial problems. Science 266, 1021.
  • Ouyang et al. 1997 Ouyang, Q., Kaplan, P. D., Liu, S., and Libchaber, A. (1997) DNA solution of the maximal clique problem. Science 278, 446–449.
  • Salehi et al. 2015 Salehi, S. A., Riedel, M. D., and Parhi, K. K. Markov chain computations using molecular reactions. Proc. IEEE International Conference on Digital Signal Processing (DSP). 2015; pp 689–693.
  • Cardona et al. 2005 Cardona, M., Colomer, M., Conde, J., Miret, J., Miró, J., and Zaragoza, A. (2005) Markov chains: Computing limit existence and approximations with DNA. Biosystems 81, 261–266.
  • Berry and Boudol 1992 Berry, G., and Boudol, G. (1992) The chemical abstract machine. Theoretical Computer Science 96, 217–248.
  • Rothemund 1995

    Rothemund, P. W. K. (1995) A DNA and restriction enzyme implementation of turing machines.

    DNA Based Computers 27, 75–119.
  • Magnasco 1997 Magnasco, M. O. (1997) Chemical kinetics is Turing universal. Physical Review Letters 78, 1190.
  • Liekens and Fernando 2007 Liekens, A., and Fernando, C. (2007) Turing complete catalytic particle computers. Advances in Artificial life 1202–1211.
  • Soloveichik et al. 2008 Soloveichik, D., Cook, M., Winfree, E., and Bruck, J. (2008) Computation with finite stochastic chemical reaction networks. Natural Computing 7, 615–633.
  • Hjelmfelt et al. 1991

    Hjelmfelt, A., Weinberger, E. D., and Ross, J. (1991) Chemical implementation of neural networks and Turing machines.

    Proceedings of the National Academy of Sciences 88, 10983–10987.
  • McQuarrie 1967 McQuarrie, D. A. (1967) Stochastic approach to chemical kinetics. Journal of applied probability 4, 413–478.
  • Van Kampen 1995 Van Kampen, N. G. Stochastic Processes in Physics and Chemistry; Elsevier, 1995.
  • Anderson and Kurtz 2011 Anderson, D. F., and Kurtz, T. G. In Design and Analysis of Biomolecular Circuits: Engineering Approaches to Systems and Synthetic Biology; Koeppl, H., Setti, G., di Bernardo, M., and Densmore, D., Eds.; Springer New York: New York, NY, 2011; pp 3–42.
  • Kannan et al. 2007 Kannan, K. S., Vallinayagam, V., and Venkatesan, P. (2007) Markov Chain Monte Carlo Methods in Molecular Computing. IJISE
  • Shen et al. 2016

    Shen, Z., Zhang, C., Ge, L., Zhuang, Y., Yuan, B., and You, X. Synthesis of Probability Theory Based on Molecular Computation. Proc. IEEE International Workshop on Signal Processing Systems (SiPS). 2016; pp 27–32.

  • Soloveichik et al. 2010 Soloveichik, D., Seelig, G., and Winfree, E. (2010) DNA as a universal substrate for chemical kinetics. Proceedings of the National Academy of Sciences (PNAS) 107, 5393–5398.
  • Érdi and Tóth 1989 Érdi, P., and Tóth, J. Mathematical models of chemical reactions: Theory and applications of deterministic and stochastic models; Manchester University Press, 1989.
  • Horn and Jackson 1972 Horn, F., and Jackson, R. (1972) General mass action kinetics. Archive for rational mechanics and analysis 47, 81–116.
  • Gillespie 1976 Gillespie, D. T. (1976) A general method for numerically simulating the stochastic time evolution of coupled chemical reactions. Journal of computational physics 22, 403–434.
  • Kurtz 1972 Kurtz, T. G. (1972) The relationship between stochastic and deterministic models for chemical reactions. The Journal of Chemical Physics 57, 2976–2978.
  • Bolch et al. 2006 Bolch, G., Greiner, S., de Meer, H., and Trivedi, K. S. Queueing networks and Markov chains: Modeling and performance evaluation with computer science applications; John Wiley & Sons, 2006.
  • DeGroot and Schervish 2012 DeGroot, M. H., and Schervish, M. J. Probability and Statistics; Addison-Wesley, 2012.
  • Ching et al. 2013 Ching, W.-K., Huang, X., Ng, M. K., and Siu, T.-K. Markov Chains; Springer, 2013; pp 141–176.
  • Yurke and Mills 2003 Yurke, B., and Mills, A. P. (2003) Using DNA to power nanostructures. Genetic Programming and Evolvable Machines 4, 111–122.
  • Zhang and Winfree 2009 Zhang, D. Y., and Winfree, E. (2009) Control of DNA strand displacement kinetics using toehold exchange. Journal of the American Chemical Society 131, 17303–17314.