A Case Study on the Parametric Occurrence of Multiple Steady States

04/28/2017 ∙ by Russell Bradford, et al. ∙ University Lille 1: Sciences and Technologies University of Bath Université de Montpellier Joint Institute for Nuclear Research University of Bonn 0

We consider the problem of determining multiple steady states for positive real values in models of biological networks. Investigating the potential for these in models of the mitogen-activated protein kinases (MAPK) network has consumed considerable effort using special insights into the structure of corresponding models. Here we apply combinations of symbolic computation methods for mixed equality/inequality systems, specifically virtual substitution, lazy real triangularization and cylindrical algebraic decomposition. We determine multistationarity of an 11-dimensional MAPK network when numeric values are known for all but potentially one parameter. More precisely, our considered model has 11 equations in 11 variables and 19 parameters, 3 of which are of interest for symbolic treatment, and furthermore positivity conditions on all variables and parameters.

READ FULL TEXT VIEW PDF
POST COMMENT

Comments

There are no comments yet.

Authors

page 1

page 2

page 3

page 4

This week in AI

Get the week's most popular data science and artificial intelligence research sent straight to your inbox every Saturday.

1 Introduction

The occurrence of multiple steady states (multistationarity) has important consequences on the capacity of signaling pathways to process biological signals, even in its elementary form of two stable steady states (bistability). Bistable switches can act as memory circuits storing the information needed for later stages of processing [26]. The response of bistable signaling pathways shows hysteresis, namely dynamic and static lags between input and output. Because of hysteresis one can have, at the same time, a sharp binary response and protection against chatter noise.

Bistability of signaling usually occurs as a result of activation of upstream signaling proteins by downstream components [3]. A different mechanism for producing bistability in signaling pathways was proposed by Markevich et al. [19]. In this mechanism bistability can be caused by multiple phosphorylation/dephosphorylation cycles that share enzymes. A simple, two-step phosphorylation/dephosphorylation cycle is capable of ultrasensitivity, a form of all or nothing response with no hysteresis (Goldbeter–Koshland mechanism). In multiple phosphorylation/dephosphorylation cycles, enzyme sharing provides competitive interactions and positive feedback that ultimately leads to bistability.

Algorithmically the task is to find the positive real solutions of a parameterized system of polynomial or rational systems, since the dynamics of the network is given by polynomial systems (arising from mass action kinetics) or rational functions (arising in signaling networks when some intermediates of the reaction mechanisms are reduced). Due to the high computational complexity of this task [13] considerable work has been done to use specific properties of networks and to investigate the potential of multistationarity of a biological network out of the network structure. This only determines whether or not there exist rate constants allowing multiple steady states, instead of coming up with a semi-algebraic description of the range of parameters yielding this property. These approaches can be traced back to the origins of Feinberg’s Chemical Reaction Network Theory (CRNT) whose main result is that networks of deficiency 0 have a unique positive steady state for all rate constants [12, 8]. We refer to [6, 20, 16] for the use of CRNT and other graph theoretic methods to determine potential existence of multiple positive steady states, with [17] giving a survey.

Given a bistable mechanism it is also important to compute the bistability domains in parameter space, namely the parameter values for which there is more than one stable steady state. The size of bistability domains gives the spread of the hysteresis and quantifies the robustness of the switches. The work of Wang and Xia [24] is relevant here: they used symbolic computation tools, including cylindrical algebraic decomposition as we do below, to determine the number of steady states and their stability for several systems. They reported results up to a 5-dimensional system using specified parameter values, but their method is extensible to parametric questions. Higher-dimensional systems were studied using sign conditions on the coefficients of the characteristic polynomial of the Jacobian. In some cases these guarantee uniqueness of the steady state [7].

In this paper we use an 11-dimensional model of a mitogen-activated protein kinases (MAPK) cascade [19] as a case study to investigate properties of the system using algorithmic methods towards the goal of semi-algebraic descriptions of parameter regions for which multiple positive steady states exist.

2 The MAPK Network

The model of the MAPK cascade we are investigating can be found in the Biomodels database [18].111www.ebi.ac.uk/biomodels-main/BIOMD0000000026 We have renamed the species names to , …,  and the rate constants to , …,  to facilitate reading:

(1)

The Biomodels database also gives us meaningful values for the rate constants:

(2)

Some of these values are measured and some are well-educated guesses. For the purpose of our study we assume they are suitable.

We add three linear conservation constraints introducing three further constant parameters , , :

(3)

Computations to produce these in MathWorks SimBiology use the left-null space of the stoichiometric matrix under positivity conditions, see for example [21].

Meaningful values for are harder to obtain than the constants in (2

). The following are some realistic values estimated by ourselves on the basis of our understanding of the biological model:

(4)

The long-term goal of our research is to treat all three of these together parametrically, although in the present work we focus on situations with one free-parameter.

The steady state problem for the MAPK cascade can now be formulated as a real algebraic problem. That is, we replace the left hand sides of all equations in (1) with . This together with the equations in (3) yields an algebraic system with polynomials in

All entities in our model are strictly positive, which yields in addition a system

establishing a side condition on the solutions of that all variables and parameters of be positive. In terms of first-order logic our specification of and yields a quantifier-free Tarski formula

(5)

The estimations for the rate constants in (2) formally establish a substitution rule , which can be applied to , , or in postfix notation.

2.1 Symbolic Determination of Occurrences of Multiple Steady States

In this section we are going to analyze the system for multiple positive steady states. As we will not include a priori information about the stability of the fixed points, we do not only have to consider (at least) two stable fixed points but also unstable fixed points, i.e., we investigate the existence of at least three different roots of for given choices of parameters.

We present two investigations: one using the Redlog package in Reduce and the other using the Regular Chains Library in Maple. Both will make use of Cylindrical Algebraic Decomposition (CAD) [1] to solve the problem. The worst-case time complexity of CAD is doubly exponential.222Traditionally, doubly exponential in the number of variables. However recent progress on CAD in the presence of equational constraints [10], such as (1) with for left-hand side, allows us to conclude it is actually doubly-exponential the number of variables minus the number of equational constraints at different levels of the projection [11]. Our approaches admit, in principle, arbitrary numbers of indeterminates. However, for the sake of realistic computation times we must restrict ourselves to one free parameter. Even then, the number of variables present is too large for contemporary CAD implementations. We make progress by combining CAD with additional symbolic methods. Our first approach uses virtual substitution techniques and the second real triangularization. In both cases we have combined the corresponding methods by hand, but automation is clearly possible.

2.1.1 Real Quantifier Elimination in Redlog

Real Quantifier Elimination (QE) can directly handle the parametric existence of steady states, taking as input , possibly with substitutions for some parameters. However, we are not only interested in the existence but also in the number of solutions. We are going to combine Virtual Substitution (VS) [25] with CAD. The former smoothly eliminates the majority of the quantifiers while the latter allows us to count numbers of solutions via decomposition of the remaining low-dimensional spaces. That combination of methods requires the solution of several QE runs with each problem and some combinatorial arguments. Throughout this subsection we are using Redlog [9].

Parameter-Free Computations

We consider where all parameters have been substituted with rational numbers. The closed formula states the existence of a suitable real solution. In a first step, we solve for the following eleven QE problems using VS:

Each is a univariate quantifier-free formula describing all possible real choices for for which there exist real choices for all other variables such that holds. CAD can easily decompose the corresponding one-dimensional spaces. It turns out that for each there are exactly three zero-dimensional cells , , for which holds. We extract all , , and as real algebraic numbers, i.e., univariate defining polynomials with integer coefficients plus isolating intervals. By combinatorial arguments it is not hard to see that the following holds for the set of real solutions of :

Notice that at this point we have proven multistationarity for . We can furthermore compute by plugging the candidates from the Cartesian product into

. A straightforward approach requires arithmetic with real algebraic numbers followed by the determination of the signs of the results, which is quite inefficient in practice. We use instead a heuristic approach combining refinements of the isolating intervals of the real algebraic numbers with interval arithmetic. This excludes

of the candidate solutions. The three remaining candidates require no further checking since we already know that . The overall CPU time is 71.3 seconds for 11 runs of VS plus 11 runs of CAD, followed by 16 hours for checking candidates.333All QE-related computations have been carried out on a 2.4 GHz Intel Core i7 with 3 GB RAM or cores on a compute server with similar speed and memory limitations. Our checking procedure is a file-based prototype starting a Reduce process for every single of the candidates; there is considerable room for optimization.

For instead of all eleven univariate CAD computations yield unique solutions which can be straightforwardly combined to one unique solution for the corresponding . The overall CPU time here is 66.4 seconds for 11 runs of VS plus 11 runs of CAD. Machine float approximations of all our solutions are given in Table 1.

Table 1: The unique solution for and the three solutions , …,  for . We have actually computed real algebraic numbers, which are pairs of univariate polynomials and isolated intervals. For convenience we are giving machine float approximations here, which can be made arbitrarily precise.
Parametric Analysis for

We now consider leaving as a parameter. Again, we solve for eleven QE problems using VS:

This time each is a bivariate quantifier-free formula in and the corresponding . This time we construct a two-dimensional CAD for each . The projection order is important: we first project , then the CAD base phase decomposes the -axis, followed by an extension phase that decomposes the -space over the -cells obtained in the base phase. This is feasible with one limitation: we do not extend over zero-dimensional -cells. In other words, we accept finitely many blind spots in parameter space, which we can explicitly read off from the CAD so that in the end we know exactly what we are missing.

Figure 1: The pruned CAD tree for . Ellipses and rectangles are full-dimensional and zero-dimensional cells, respectively. We have removed cells where is negative or where the input formula is false.

Figure 1 shows our CAD tree for . The first layer next to the root shows the decomposition of the -axis. The five zero-dimensional (rectangular) cells are the previously mentioned blind spots, among which the smallest one with negative value of is not relevant. Those zero-dimensional cells also establish the limits of the full dimensional (oval) cells in between. The cylinders over those one-dimensional -cells each contain either one or three zero-dimensional -cells where holds. We have deleted from the tree all -cells where does not hold. We make two observations, important for a qualitative analysis of our system:

  1. For all positive choices of —extending to infinity—there is at least one positive solution for .

  2. There is a break point around where the system changes from unique solutions to exactly three solutions.

Recall that for all floating point numbers given here as approximations we in fact know exact real algebraic numbers. For instance, the exact break point is the only real zero in the interval of an irreducible defining polynomial

(6)

Figure 2 depicts all eleven CAD trees for , …, .

Figure 2: All CAD trees for , …, . For positive there are always either one or three positive solutions for the corresponding . The break point from one to three solutions is the same in all trees. In the second but last row on the left hand side there is the tree for , which is shown in more detail in Figure 1.

They are quite similar to the one just discussed. Even the break point from one to three solutions for is identical for all so that we can generalize our observations:

  1. For all positive choices of —extending to infinity—there is at least one positive solution for .

  2. There is a break point around where the system changes its qualitative behavior. We have exactly given as a real algebraic number in Equation (6). For there is exactly one positive solution for . For there are at least and at most positive solutions for .

The overall computation time for our parametric analysis is 4.3 minutes. It is strongly dominated by 2.8 minutes for the computation of one particular CAD tree, for . It turns out that the suitable projection order with eliminated first is computationally considerably harder than projecting the other way round. As a preprocessing step we apply CAD-based simplification of the with the opposite, faster, projection order. Here we use Qepcad B, which performs better than Redlog at simple solution formula construction.

2.1.2 Triangular Decomposition methods with the Regular Chains Library

We now describe an alternative approach to the solution using regular chains methods. Regular chains are the triangular decompositions of systems of polynomial equations (triangular in terms of the variables in each polynomial). Highly efficient methods for working in complex space have been developed based on these; see [23] for a survey.

Recent work by Chen et al. [4] proposes adaptations of these tools to the real analogue: semi-algebraic systems. They describe two algorithms to decompose any real polynomial system into finitely many regular semi-algebraic systems. The first does so directly while the second, Lazy Real Triangularize (LRT) produces the highest dimension solution component and unevaluated function calls, which if all evaluated would combine to give the full solution. These algorithms are implemented in the Regular Chains Library444www.regularchains.org in Maple which we use throughout this subsection.

We apply LRT on the quantifier-free formula (5) evaluated with the parameter estimates for , …,  given at the start of Section 2, so we have one free parameter as in the previous section. We need to choose a variable ordering: our analysis requires that be the indeterminate considered alone; the remaining variables are placed in lexicographical order (the in-built heuristics to make the choice could suggest nothing better). The solutions must hence contain constraints in , constraints in (, in ( and so on. We define the main variable of a constraint to be the highest one present in this ordering.

LRT produces one solution component and 6 unevaluated function calls in less than 3 seconds. In the evaluated component: for each of , …,  there is a single equation which had this as the main variable. Further, these are all linear in their main variable meaning they can be easily rearranged into the solution formulae in Table 3.

Table 2: Coefficients and of polynomials occurring in Equations (6) and (7), respectively.
Table 3: Triangular solution formulae valid for all positive excluding three isolated points

The constraints on are that and that a polynomial equation of degree be satisfied:

(7)

where the coefficients are univariate polynomials in of maximum degree as given in Table 2.

Finally, the constraints on are that it be positive; it not be a root of the polynomial in Equation (6); nor two other polynomials as described in Table 4.

polynomial in (6)
Table 4: Constraints on for solution formulae in Table 3 and Equation (7) to be valid

Thus this solution component is valid for all positive values of excluding three points. As before, we could give these as exact algebraic numbers but for brevity give float approximations: , , and .

Three of the six unevaluated function calls define the solutions at these points, however evaluating these solutions is not possible in reasonable time. The other three define empty solution sets (evaluating to discover this is instantaneous). So, as with our previous approach, we proceed accepting a small number of blind spots.

The output of LRT has quickly given us the structure of the solution space valid at all but three isolated values of . However, it does not identify where the number of real solutions change: although the break point identified earlier has been rediscovered there is no information from which we can infer its significance; and there is no significance in our application of the other two isolated points.

To finish the analysis we need to decompose -space according to the real roots of ; and also since the constraint was specified separately in the output (the case for this variable only). CAD is ideally suited for this task. Using the Regular Chains algorithm [5] in Maple a CAD for divides the plane into 135 cells in a few seconds. This CAD decomposes the axis into 11 cells, i.e. identifying five points which approximate to: , , , , and .

On the cell for , the cylinder above in the plane is divided into 11 cells: three of which cover (two 2d sectors and a 1d section). This indicates that has a single positive real solution for such . On the two cells for and the cylinders above are divided into 15 cells; seven of which cover . This indicates that has three positive real solutions for such .

At the end of this analysis we have rediscovered the break point where the system moves from a single positive real solution to three. We also have explicit solutions valid for all except three isolated values. To obtain a solution select the value of interest then identify the real roots of (we know in advance how many depending on the value chosen); then for each solution substitute recursively into the equations of Table 3; starting from the bottom and including the new variable solution discovered from each substitution into the next. The solutions in Table 1 may be easily rediscovered this way.

Repeating the Process for Different Choices of the Lone Free Parameter/Fixed Parameter Values

We may repeat the approach described above for different choices of free parameter and different choices of fixed parameter values. For example:

  • With set to instead of we find that the break point between 1 and 3 real positive solutions moves to . With set to it moves to .

  • Allowing to be free and fixing we find that there is only ever one positive real solution.

  • Allowing to be free and fixing we find the number of positive real solutions moving from 1 to 3 to 1 breaking at and .

  • Similarly, allowing to be free and fixing we find there is only ever one positive real solution; but fixing instead we find 3 real solutions between and and 1 otherwise.

The results above hint that there is a shape approximating a narrow paraboloid in -space within which bistability may occur; with bistability available for any and value but bounded from below in the coordinate. We note that these additional experiments all produce results which, as with the one described in detail, are invalid at a handful of isolated values of the free parameter.

2.2 Stability of the Fixed Points

We use the three linear conservation constraint equations (3) to eliminate , , and from system (1) and symbolically compute the Jacobian

of the obtained reduced system. We then numerically compute the eigenvalues of

for the instances arising from the substitution of the different positive fixed points for the variables and the corresponding parameter values.

We have used the float approximations for the unique solution with and the three solutions , …,  for in Table 1. For the single positive fixed point the Jacobian has eigenvalues with negative real part only and hence can be shown to be stable. For one of the three positive fixed points can be shown to be unstable, as has one eigenvalue with positive real part; the other seven had negative real parts. In contrast and can be shown to be stable. Hence for the system is indeed bistable.

A verification of the stability of the fixed points using exact real algebraic numbers by the well-known Routh–Hurwitz criterion is possible algorithmically [15], but seems to be out of range of current methods for this example. Notice that also in other studies on multistationarity of signalling pathways [6, 14] the question of stability has not been addressed.

2.3 Numerical Homotopy Methods

Finally, we compare our symbolic results with numerical ones obtained using the homotopy solver Bertini [2]. Bertini computes complex roots of polynomial systems using methods from numerical algebraic geometry [22].

For the parameter values as above and we obtain six solutions, three of which are positive real solutions. For we obtain a single positive solution. In both cases the relevant solutions coincide with the ones obtained with our symbolic analyses up to the used numeric precision.

However, for larger values of Bertini produces incorrect results due to numerical instability. For instance, we falsely obtain exactly one positive real solution for and no positive real solution for .

Figure 3 shows a Bertini-based grid sampling of parameter regions, varying between 200 and 1000 and fixing one of and while varying the other among the default values (4). While this suffers from the discrete nature of sampling and potentially unreliable results as discussed, it is nevertheless useful for the generation of hypothesis about the nature of the parameter regions. Figure 3 seems to identify a region of bistability (in blue) within the parameter space, as hypothesised at the end of Section 2.1.2. The results of Bertini indicate holes in this region (the green dots within the blue). However, computation at these particular points reveals these to be the result of numerical errors: where an insufficiently high precision causes what is actually a positive real solution to appear to have a negative component. It seems there is scope for fruitful interplay between symbolic and numeric methods here; with numerics postulating hypotheses for the symbolic methods to check and refine.

Figure 3: Grid sampling two-parameter regions using Bertini. We combine with (left) and with (right). Colors indicate the computed numbers of positive real fixed points: blue 3, green 2, yellow 1, red 0. The dotted lines indicate values of the parameters as in Equation (4).

3 Conclusions and Future Work

We have shown that the determination of multistationarity of an 11-dimensional MAPK network can be achieved by combinations of currently available symbolic computation methods for mixed equality/inequality systems if, for all but potentially one parameter, numeric values are known. The aspiration of a semi-algebraic description of the ranges for all parameters in the conservation laws (3) yielding multistationarity will now be pursued, with the present results demonstrating that this aspiration may be within reach.

As there are many very relevant systems having dimensions between 10 and 20 it seems to be worth the effort to enhance and improve the present algorithmic methods, and in particular their combination, to solve such important application problems for symbolic computation.

Acknowledgments

For our QE-related computations we used two great free software tools: GNU Parallel for distributing computations on several processors, and yEd for visualization of CAD trees. D. Grigoriev is grateful to the grant RSF 16-11-10075 and to MCCME for wonderful working conditions and inspiring atmosphere. M. Košta has been supported by the DFG/ANR Project STU 483/2-1 SMArT. H. Errami, A. Weber, and O. Radulescu thank the French-German Procope-DAAD program for partial support of this research. J. Davenport, M. England and T. Sturm are grateful to EU CSA project 712689 SC2.

References

  • [1] D. S. Arnon, G. E. Collins, and S. McCallum. Cylindrical algebraic decomposition I: The basic algorithm. SIAM J. Comput., 13(4):865–877, 1984.
  • [2] D. J. Bates, J. D. Hauenstein, A. J. Sommese, and C. W. Wampler. Bertini: Software for numerical algebraic geometry. doi:10.7274/R0H41PB5.
  • [3] U. S. Bhalla and R. Iyengar. Emergent properties of networks of biological signaling pathways. Science, 283(5400):381–387, 1999.
  • [4] C. Chen, J. Davenport, J. May, M. Moreno Maza, B. Xia, and R. Xiao. Triangular decomposition of semi-algebraic systems. J. Symb. Comput., 49:3–26, 2013.
  • [5] C. Chen, M. Moreno Maza, B. Xia, and L. Yang. Computing cylindrical algebraic decomposition via triangular decomposition. In Proceedings of the ISSAC 2009, pages 95–102. ACM, 2009.
  • [6] C. Conradi, D. Flockerzi, and J. Raisch. Multistationarity in the activation of a MAPK: parametrizing the relevant region in parameter space. Math. Biosci., 211(1):105–31, 2008.
  • [7] C. Conradi and M. Mincheva. Catalytic constants enable the emergence of bistability in dual phosphorylation. Journal of The Royal Society Interface, 11(95), 2014.
  • [8] G. Craciun, A. Dickenstein, A. Shiu, and B. Sturmfels. Toric dynamical systems. J. Symb. Comput., 44(11):1551–1565, 2009.
  • [9] A. Dolzmann and T. Sturm. Redlog: Computer algebra meets computer logic. ACM SIGSAM Bulletin, 31(2):2–9, 1997.
  • [10] M. England, R. Bradford, and J. Davenport. Improving the use of equational constraints in cylindrical algebraic decomposition. In Proceedings of the ISSAC 2015, pages 165–172. ACM, 2015.
  • [11] M. England and J. Davenport. The complexity of cylindrical algebraic decomposition with respect to polynomial degre. In Proceedings of the CASC 2016, volume 9890 of LNCS, pages 172–192. Springer, 2016.
  • [12] M. Feinberg. Stability of complex isothermal reactors–I. the deficiency zero and deficiency one theorems. Chem. Eng. Sci., 42(10):2229–2268, 1987.
  • [13] D. Grigoriev and N. N. Vorobjov. Solving systems of polynomial inequalities in subexponential time. J. Symb. Comput., 5:37–64, 1988.
  • [14] E. Gross, H. A. Harrington, Z. Rosen, and B. Sturmfels. Algebraic systems biology: A case study for the Wnt pathway. Bull. Math. Biol., 78(1):21–51, 2016.
  • [15] H. Hong, R. Liska, and S. Steinberg. Testing stability by quantifier elimination. J. Symb. Comput., 24(2):161–187, 1997.
  • [16] M. D. Johnston. A note on “MAPK networks and their capacity for multistationarity due to toric steady states”. arXiv:1407.5651, 2014.
  • [17] B. Joshi and A. Shiu. A Survey of Methods for Deciding Whether a Reaction Network is Multistationary. Math. Model. Nat. Phenom., 10(5):47–67, 2015.
  • [18] C. Li, M. Donizelli, N. Rodriguez, H. Dharuri, L. Endler, V. Chelliah, L. Li, E. He, A. Henry, M. I. Stefan, J. L. Snoep, M. Hucka, N. Le Novère, and C. Laibe. BioModels database: An enhanced, curated and annotated resource for published quantitative kinetic models. BMC Systems Biology, 4:92, 2010.
  • [19] N. I. Markevich, J. B. Hoek, and B. N. Kholodenko. Signaling switches and bistability arising from multisite phosphorylation in protein kinase cascades. J. Cell Biol., 164(3):353–359, 2004.
  • [20] M. Pérez Millán and A. G. Turjanski. MAPK’s networks and their capacity for multistationarity due to toric steady states. Math. Biosci., 262:125–37, 2015.
  • [21] S. Schuster and T. Höfer. Determining all extreme semi-positive conservation relations in chemical reaction systems: a test criterion for conservativity. J. Chem. Soc. Faraday T., 87(16):2561–2566, 1991.
  • [22] A. J. Sommese, J. Verschelde, and C. W. Wampler. Introduction to numerical algebraic geometry. In Solving Polynomial Equations: Foundations, Algorithms, and Applications, pages 301–337. Springer, 2005.
  • [23] D. Wang. Elimination Methods. Springer, 2000.
  • [24] D. Wang and B. Xia. Stability analysis of biological systems with real solution classification. In Proceedings of the ISSAC 2005, pages 354–361. ACM, 2005.
  • [25] V. Weispfenning. Quantifier elimination for real algebra—the quadratic case and beyond. Appl. Algebr. Eng. Comm., 8(2):85–101, 1997.
  • [26] G. Weng, U. S. Bhalla, and R. Iyengar. Complexity in biological signaling systems. Science, 284(5411):92–6, 1999.