Synthesis for Vesicle Traffic Systems

10/10/2018 ∙ by Ashutosh Gupta, et al. ∙ 0

Vesicle Traffic Systems (VTSs) are the material transport mechanisms among the compartments inside biological cells. The compartments are viewed as nodes that are labeled with the chemicals they contain, and the transport channels are similarly viewed as labeled edges between the nodes. Understanding VTSs is an ongoing area of research, and for many cells they are only partially known. For example, there may be undiscovered edges, nodes, or labels in the VTS of a cell. It has been speculated that there are properties that VTSs must satisfy, for example stability, i.e., every chemical that leaves a compartment comes back. Many synthesis questions arise in this scenario, where we want to complete a partially known VTS under a given property. In this paper, we present novel encodings of these questions into QBF (quantified Boolean formula) satisfiability problems. We have implemented the encodings in a highly configurable tool and applied it to a couple of found-in-nature VTSs and several synthetic graphs. Our results demonstrate that our method can scale up to the graphs of interest.

1 Introduction

Eukaryotic cells, including human cells, consist of multiple membrane-bound compartments. Material is transported among these compartments by the vesicle transport system (VTS). Briefly, the source compartment produces a membrane-bound packet of molecules called a vesicle. After release, this vesicle specifically recognizes the correct target compartment within the cell and fuses with it [1]. A lot of information about the molecules that form the machinery of the VTS has been discovered, including their regulatory interactions with each other [2]. In spite of this detailed knowledge at the level of the molecules, the structure of the VTS network, or the road-map of the eukaryotic cell, is far from complete. For example, although the localization in the cell of various SNAREs (a class of molecules that participate in the control of the VTS) is known, as is their site of action [3], for most SNAREs it is not known how they first reached the compartments they reside in. The current knowledge of the network is put together from a patchwork of biological experiments and is scattered across several publications. Even after this information is collected and put together, we find that the network obtained is still not complete; new vesicles and new contents in previously known vesicles are constantly being discovered (some new discoveries include [4, 5, 6, 7]). The synthesis of the unknown pieces may be assisted by computation on the graph model of VTSs. In this paper, we look at the computational questions arising from VTSs.

VTSs are regulated by the same molecules that they transport. For the purpose of this paper, the VTS molecules we focus on are the transmembrane SNARE proteins. SNAREs drive the recognition of the target compartment by vesicles and their subsequent fusion. The SNAREs can be divided into v-SNAREs (which are present on vesicles) and t-SNAREs (which are present on compartments). A vesicle fuses with a compartment if its v-SNARE can form a complex with the t-SNARE present on that compartment. Not all v- and t- SNARE combinations can form complexes; this constraint forms part of the basis for the specificity of vesicle traffic [8].

We use the model of VTSs that has been presented in [9]. Please see Appendix 0.A for a detailed discussion of the pros and cons of the model. We model the system as a labeled graph, where compartments are nodes and transport vesicles are edges. The molecular compositions of the compartments and vesicles are the node and edge labels respectively. The molecules can be active or inactive on any compartment or vesicle. The activity states of molecules are also included in the labels. Due to the biology of the SNAREs in the VTSs of our interest, a vesicle is enabled by a set of four molecules such that one part of the set occurs in the vesicle and the other part occurs in the target compartment of the vesicle. The partition always divides the set into a set of three molecules and a single molecule. The enabling molecules must be active in the vesicle and the target compartment respectively. The pairs are called fusing sets and, analogously, the vesicle is said to fuse with the destination compartment. Not all sets of molecules can participate in fusion; in biological cells, fusogenic SNARE complexes are discovered through experiments. Generally, the fusing pairs are found to be distinct for distinct vesicle-compartment fusions. To ensure that a molecule that has participated in a fusion does not interfere with fusion at different compartments and vesicles, in the model we require that the molecule is inactive on the appropriate compartments. The activity of molecules is regulated by the other molecules, i.e., the presence and absence of the other molecules in a compartment or vesicle may make the molecule active or inactive. We call this regulation activity functions. The regulation controls are defined by a fusion pairing relation containing pairs of molecules and by Boolean activity functions.

In the model, we assume that the system is in steady state and the concentrations of the molecules in compartments do not change over time. Since our system is in steady state, we expect that any molecule that leaves a compartment must come back via some path on the graph. We call this property of a VTS stability.

As we have discussed earlier, our understanding of VTSs is partial. The synthesis of the unknown pieces may be assisted by computation on the graph model of VTSs. In this paper, we consider several versions of the synthesis problem involving different parts of VTSs that can be synthesized, such as modifying labels, adding/deleting edges, and learning activity functions. We also consider variations on the properties against which we do synthesis, namely stability and $k$-connectedness, which states that the VTS remains connected after removing any $k-1$ edges. We assume that the given partial VTS is always well-fused, whereas properties like stability and $k$-connectedness may not hold in the partial VTS. In order to synthesize the parts of a VTS such that it satisfies the constraints, we encode the synthesis problem as a satisfiability problem for quantified Boolean formulas (QBFs).

We have implemented the encoding in a flexible tool, which can handle a wide range of synthesis queries. We have applied our tool on several VTSs including two found-in-nature VTSs.

Our experiments suggest that some of the synthesis problems are solvable by modern solvers and the synthesis technology may be useful for biological research.

The rest of the paper is organized as follows. In section 2, we present the graph model of VTSs and the encoding of several constraints on VTSs. In section 3, we present the synthesis problems and their encoding into QBF satisfiability. In section 4, we present our implementation and experimental results. We discuss related work in section 5 and conclude in section 6.

2 Preliminaries

In this section, we will present the model of VTS from [10]. We will also present the constraints and properties on the VTSs, and their encoding as a QBF formula. We model a VTS as a labelled graph along with assisting pairing matrices and activating functions.

Definition 1

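A VTS $G$ is a tuple $(N, M, E, L, P, g, f)$, where

  • $N$ is a finite set of nodes representing compartments in the VTS,

  • $M$ is the finite set of molecules flowing in the system,

  • $E \subseteq N \times (\wp(M) \setminus \{\emptyset\}) \times N$ is the set of edges with molecule sets as labels,

  • $L : N \to \wp(M)$ defines the molecules present in the nodes,

  • $P \subseteq \wp(M)$ is the pairing relation,

  • $g : M \to \wp(M) \to \mathbb{B}$ is the activity map for nodes, and

  • $f : M \to \wp(M) \to \mathbb{B}$ is the activity map for edges.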

$N$, $M$, $E$, and $L$ define a labelled graph. Additionally, $P$ defines which molecules can fuse with which molecules, and $g$ and $f$ are the activity functions for molecules on nodes and edges respectively. The model captures the steady state of a VTS. The analysis of the model will inform us about the network/graph properties of VTSs.

A molecule $m$ is active at node $n$ if $m \in L(n)$ and $g(m)(L(n))$ is true. A molecule $m$ is active at edge $(n, M', n') \in E$ if $m \in M'$ and $f(m)(M')$ is true. We call $G$ well-structured if the molecules $M$ are divided into two partitions $Q$ and $R$ such that for each $P' \in P$, $|P' \cap Q| = 3 \wedge |P' \cap R| = 1$, and for each $(n, M', n') \in E$, $n \neq n'$ and $M' \subseteq L(n) \cap L(n')$. In other words, molecules are of two types $Q$ and $R$, pairing relations have sets of four molecules such that three are of one type and one is of the other type (motivated by the biochemistry of the fusion), there are no self loops, and each edge carries only those molecules that are present in its source and destination nodes. An edge $(n, M', n') \in E$ fuses with a node $n''$ if there are non-empty sets of molecules $M'' \subseteq M'$ and $M''' \subseteq L(n'')$ such that the molecules in $M''$ are active in the edge, the molecules in $M'''$ are active in $n''$, and $M'' \cup M''' \in P$. We call $G$ well-fused if each edge fuses with its destination node and cannot fuse with any other node.

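A path in $G$ is a sequence $n_1, \dots, n_\ell$ of nodes such that $(n_i, \_, n_{i+1}) \in E$ for each $0 < i < \ell$. For a molecule $m \in M$, an $m$-path in $G$ is a sequence $n_1, \dots, n_\ell$ of nodes such that $(n_i, M', n_{i+1}) \in E$ and $m \in M'$ for each $0 < i < \ell$. A node $n'$ is ($m$-)reachable from node $n$ in $G$ if there is a ($m$-)path $n, \dots, n'$ in $G$. We call $G$ stable if for each $(n, M', n') \in E$ and $m \in M'$, $n$ is $m$-reachable from $n'$. We call $G$ connected if for each $n, n' \in N$, $n'$ is reachable from $n$ in $G$. We call $G$ $k$-connected if for each $E' \subseteq E$ and $|E'| < k$, the VTS $(N, M, E \setminus E', L, P, g, f)$ is connected.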
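The stability definition can be made concrete on a toy graph. The following is a minimal sketch, not the paper's encoding: it assumes a hypothetical representation in which each edge is a triple (source, label, target) with the label a set of molecule names, and it checks stability by explicit graph search rather than by constraint solving.

```python
from collections import deque

# Hypothetical toy representation: an edge is (source, frozenset_of_molecules, target).
def m_reachable(edges, m, src, dst):
    """True if a non-empty path from src to dst exists using only edges that carry m."""
    frontier = deque([src])
    seen = set()
    while frontier:
        node = frontier.popleft()
        for (a, label, b) in edges:
            if a == node and m in label and b not in seen:
                if b == dst:
                    return True
                seen.add(b)
                frontier.append(b)
    return False

def is_stable(edges):
    """Every molecule that leaves a node on an edge can travel back to that node."""
    return all(m_reachable(edges, m, dst, src)
               for (src, label, dst) in edges for m in label)

# Molecule "a" circulates around a three-node cycle.
cycle = [(0, frozenset({"a"}), 1), (1, frozenset({"a"}), 2), (2, frozenset({"a"}), 0)]
# Dropping the closing edge leaves "a" with no way back.
broken = cycle[:2]
print(is_stable(cycle), is_stable(broken))
```

In the synthesis setting the edge set is not fixed but chosen by the solver, which is why the paper encodes the same condition as constraints instead of running a search.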

2.1 Encoding VTS

The conditions on the VTSs for a given size can be encoded as a QBF formula with uninterpreted functions. To encode the constraints, we need variables for each aspect of a VTS. Let us suppose that the number of nodes in the graph is $\nu$ and the number of molecules is $\mu$. To fully finitize the problem, we also limit the maximum number of edges present between two nodes to $\pi$. Here, we list the Boolean variables and uninterpreted function symbols that encode parts of VTSs.

  1. Boolean variable $n_{i,m}$ indicates if $m \in L(i)$.

  2. Boolean variable $e_{i,j,q}$ indicates if the $q$th edge exists between $i$ and $j$.

  3. Boolean variable $e_{i,j,q,m}$ indicates if the $q$th edge between $i$ and $j$ contains $m$.

  4. Boolean variable $p_{M'}$ indicates if $M' \in P$.

  5. uninterpreted Boolean functions $f_m : \mathbb{B}^{\mu} \to \mathbb{B}$ encoding the $f(m)$ map.

  6. uninterpreted Boolean functions $g_m : \mathbb{B}^{\mu} \to \mathbb{B}$ encoding the $g(m)$ map.

We also have auxiliary Boolean variables that will help us encode the well-fused property.

  1. $a_{i,m}$ indicates that molecule $m$ is active at node $i$, i.e., $g(m)(L(i))$ holds.

  2. $b_{i,j,q,m}$ indicates that molecule $m$ is active at the $q$th edge between $i$ and $j$, i.e., $f(m)$ applied to the label of that edge holds.

We will describe several constraints that encode VTSs in this section. In the next section, we will extend the encoding for the synthesis problem. To avoid cumbersome notation, we will not explicitly write the ranges of the indices in the constraints. $i$ and $j$ will range over nodes, i.e., from $1$ to $\nu$. $m$ will range over molecules, i.e., from $1$ to $\mu$. $q$ will range over edges between two nodes, i.e., from $1$ to $\pi$.

The following constraints encode the basic consistency of VTSs.

EdgeC states that each edge has at least one molecule, there are no self loops, and edge labels are consistent with node labels. ActivityC states that active molecules are present. PairingC states that all molecules are divided into two types using a type bit for each molecule, which encodes whether the molecule belongs to one type or the other, and that any fusing set of molecules must have three molecules of one type and one molecule of the other. Fusion1 and Fusion2 state the well-fused condition. Consistancy is the conjunction of all of the above.
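As an illustration of how the prose description translates into formulas, the first two constraints might be written as follows. This is a sketch reconstructed from the description only, under assumed variable names: $n_{i,m}$ (molecule $m$ is in the label of node $i$), $e_{i,j,q}$ (the $q$th edge from $i$ to $j$ exists), $e_{i,j,q,m}$ (that edge carries $m$), and $a_{i,m}$, $b_{i,j,q,m}$ (molecule $m$ is active at the node or edge). The exact formulas used by the tool may differ.

```latex
\begin{align*}
\mathit{EdgeC} &= \bigwedge_{i,j,q} \Big( e_{i,j,q} \Rightarrow \bigvee_{m} e_{i,j,q,m} \Big)
  \;\wedge\; \bigwedge_{i,q} \neg e_{i,i,q}
  \;\wedge\; \bigwedge_{i,j,q,m} \big( e_{i,j,q,m} \Rightarrow e_{i,j,q} \wedge n_{i,m} \wedge n_{j,m} \big) \\
\mathit{ActivityC} &= \bigwedge_{i,m} \big( a_{i,m} \Rightarrow n_{i,m} \big)
  \;\wedge\; \bigwedge_{i,j,q,m} \big( b_{i,j,q,m} \Rightarrow e_{i,j,q,m} \big)
\end{align*}
```

The three conjuncts of EdgeC correspond, in order, to "each edge has at least one molecule", "no self loops", and "edge labels are consistent with node labels".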

Activity functions

We also need to encode that the activity of the molecules is controlled by activity functions. The input VTS may include concrete activity functions for some molecules, while for others the functions may be unknown and are to be synthesized. The concrete functions can be given to us in many different ways, for example as a lookup table or as a concise Boolean formula. In the following section, we will assume the appropriate encoding is used for the concrete functions and represent them by NodeFun and EdgeFun for node and edge regulations respectively. We will use the uninterpreted function symbols to represent functions that are unknown in a VTS. Later, when synthesizing the unknown activity functions, we will replace these symbols with parameterized constraints that encode a space of candidate functions.

2.2 VTS properties

For the synthesis of incomplete systems, we need properties against which we synthesize the missing parts. Here we will discuss two such properties proposed in earlier work [10], namely stability and $k$-connectedness.

Stability property

We use Boolean variable $r_{i,j,m,p}$ to indicate if there is an $m$-path from $i$ to $j$ of length less than or equal to $p$. We use $m$-reachability to encode the stability condition in VTSs. The following constraint recursively encodes that node $j$ is $m$-reachable from node $i$ in at most $p$ steps. Subsequently, we encode the stability condition using the reachability variables.
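One way such a recursive encoding could look is sketched below. The variable names are assumptions made for this illustration: $r_{i,j,m,p}$ is the bounded reachability variable, $e_{i,j,q,m}$ says that the $q$th edge from $i$ to $j$ carries $m$, and $\nu$ is the number of nodes. The formulas follow the prose description and are not claimed to match the tool's constraints term by term.

```latex
\begin{align*}
r_{i,j,m,1} &\Rightarrow \bigvee_{q} e_{i,j,q,m} \\
r_{i,j,m,p} &\Rightarrow \bigvee_{q} e_{i,j,q,m} \;\vee\;
  \bigvee_{i' \neq i} \Big( \big( \bigvee_{q} e_{i,i',q,m} \big) \wedge r_{i',j,m,p-1} \Big)
  \qquad (p > 1) \\
\mathit{Stability} &= \bigwedge_{i,j,q,m} \big( e_{i,j,q,m} \Rightarrow r_{j,i,m,\nu} \big)
\end{align*}
```

Only the implication direction is needed here: a solver may set a reachability variable to true only if a witnessing $m$-path exists, and the stability constraint forces those variables to be true wherever a molecule leaves a node. A bound of $\nu$ steps suffices because a shortest path never visits a node twice.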

$k$-connected property

$k$-connectedness expresses robustness against the failure of a few edges. Let us use $d_{i,j,q}$ to indicate that the $q$th edge between $i$ and $j$ has failed and $r'_{i,j}$ to indicate if there is a path from $i$ to $j$ in the modified VTS. The encoding has three parts. A failure constraint encodes that only existing edges can fail and that exactly $k-1$ edges fail. A reachability constraint defines reachability in the modified VTS; for it we use a new variable $r'_{i,j,p}$ to encode reachability from $i$ to $j$ in at most $p$ steps. A connectivity constraint says that all nodes are reachable from any other node.

We will be synthesizing $k$-connected graphs. We define a constraint that says that for all possible valid failures the graph remains connected.

Since the failure variables are universally quantified, this constraint introduces quantifier alternations. Therefore, synthesis against this property will require QBF reasoning. We may make the formula quantifier-free by considering all possible failures separately and introducing a vector of reachability variables for each failure. However, this will blow up the size of the formula, and the result may not be solvable by a SAT solver.
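The universal quantification over failures can be seen in a brute-force check that enumerates every failure set explicitly. The sketch below is only illustrative: it ignores labels, represents edges as hypothetical (source, target) pairs, and assumes the reading in which exactly $k-1$ edges fail. The number of failure sets grows combinatorially with the number of edges, which is the blow-up that the quantifier-free expansion would suffer from.

```python
from itertools import combinations

def connected(nodes, edges):
    """Every node reaches every other node along directed edges."""
    succ = {n: set() for n in nodes}
    for a, b in edges:
        succ[a].add(b)
    for start in nodes:
        seen, stack = {start}, [start]
        while stack:
            for nxt in succ[stack.pop()]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        if seen != set(nodes):
            return False
    return True

def k_connected(nodes, edges, k):
    """Connected after every possible failure of exactly k-1 edges (assumed reading)."""
    edges = list(edges)
    return all(
        connected(nodes, [e for idx, e in enumerate(edges) if idx not in failed])
        for failed in combinations(range(len(edges)), k - 1)
    )

nodes = [0, 1, 2]
one_way = [(0, 1), (1, 2), (2, 0)]            # a single directed ring
two_way = one_way + [(1, 0), (2, 1), (0, 2)]  # the ring in both directions
print(k_connected(nodes, one_way, 2), k_connected(nodes, two_way, 2))
```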

3 Synthesis for VTS

In this section, we will present a list of synthesis problems that may arise from the partially available information about a VTS and our synthesis method for the problems.

3.1 Problem Statements

We will assume that we are given a VTS, not all of whose components are specified. Our objective is to find the missing parts. The missing parts can be in any of the components of the VTS: for example, there may be undiscovered edges or nodes, or insufficient knowledge about the presence of molecules in some part of the VTS. To cover most of the likely variations of this missing information, we have encoded the following variants of the VTS synthesis problem.

  1. Fixing VTS by adding edges

  2. Fixing VTS by adding molecules to the labels

  3. Fixing VTS by learning activity functions

  4. Fixing VTS by both adding/deleting parts

3.2 Encoding Incomplete VTS

In our synthesis method, we take a partially specified VTS $G$ as input. We allow activity functions not to be specified. We construct the following constraints to encode the available information about $G$. We encode both the present and the absent components in $G$. Later, the constraints will help us encode the synthesis problems.

The constraints are PresentE, PresentN, PresentP, KnownActiveN, and KnownActiveE, which are combined into PresentCons.

We also collect the variables that are not set to true in PresentCons.

The collected sets are AbsentELabel, AbsentE, AbsentNLabel, AbsentP, and UnknownActive.

We have defined AbsentELabel, AbsentE, AbsentN, and AbsentP as sets. They will be converted into formulas depending on the different usage in the synthesis problems.

3.3 Encoding synthesis property

We will do synthesis against the following property that says the VTS is stable and 3-connected.

The property was proposed in [9]. However, the biological relevance of the property is debatable and open for change. Our tool is easily modifiable to support any other property that may be deemed interesting by the biologists.

3.4 Encoding synthesis constraints

Now we will consider the encodings for the listed synthesis problems. The presented variations represent the encodings supported by our tool. Combinations of the variations are also possible, and our tool supports them. For simplicity of presentation, we assume that if we are synthesizing one aspect of a VTS, then all other aspects are fully given. Therefore, we will describe two kinds of constraints for each synthesis problem. One encodes the variable part in the synthesis problem and the other encodes the fixed parts. Subsequently, the two constraints will be put together with Consistancy and Property to construct the constraints for synthesis.

3.4.1 Fixing VTS by adding edges

Now we will consider the case when we add new edges to the VTS to satisfy the properties. In the following, the pseudo-Boolean formula AddE encodes that at most a bounded number of new undeclared edges may be added to the VTS. FixedForEdge encodes the parts of the VTS that are not allowed to change.

We put together the constraints and obtain the following formula.

As before, Consistancy encodes the basic constraints on the VTS, Property encodes the goal, and the remaining two parts are defined just above. A satisfying model of SynthE will make some of the edges in AbsentE true such that Property is satisfied. We limit the number of added edges, since we look for a fix that requires the minimum number of changes to the given VTS. We start with a small bound and increase it one by one until SynthE becomes satisfiable.
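The "grow the bound until satisfiable" loop can be sketched independently of any solver. In the illustration below, a brute-force search over candidate edge sets stands in for the QBF query, strong connectivity stands in for Property, and the names `min_edge_additions` and `strongly_connected` are hypothetical. Starting the bound at zero is also an assumption of this sketch.

```python
from itertools import combinations

def strongly_connected(nodes, edges):
    """Toy stand-in for Property: every node reaches every other node."""
    for start in nodes:
        seen, stack = {start}, [start]
        while stack:
            cur = stack.pop()
            for a, b in edges:
                if a == cur and b not in seen:
                    seen.add(b)
                    stack.append(b)
        if len(seen) != len(nodes):
            return False
    return True

def min_edge_additions(nodes, known_edges, prop, max_bound):
    """Smallest set of absent edges whose addition makes prop hold, or None."""
    known = set(known_edges)
    absent = [(a, b) for a in nodes for b in nodes
              if a != b and (a, b) not in known]
    for bound in range(max_bound + 1):              # grow the bound one by one
        # All smaller bounds already failed, so trying exactly `bound`
        # additions is equivalent to trying at most `bound` additions.
        for extra in combinations(absent, bound):   # stand-in for one solver call
            if prop(nodes, known | set(extra)):
                return list(extra)
    return None

fix = min_edge_additions([0, 1, 2], [(0, 1), (1, 2)], strongly_connected, 3)
print(fix)
```

Because the bound only increases after every smaller bound has been refuted, the first fix found uses the minimum number of added edges.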

In the later synthesis problems, we will construct a similar QBF formula with the same first two parts, while the last two parts depend on the requirements of the synthesis problem.

Fixing VTS by adding molecules to the labels

The system may also be fixed by only modifying labels on the edges or the nodes instead of adding edges. Here let us consider only adding molecules to the labels of edges. In the following, the synthesis constraint encodes that only molecules in edge labels may be added.

Similar to the previous encoding, we solve the satisfiability of the above formula to obtain additional molecules that may be added to the edge labels to satisfy the properties.

3.4.2 Fixing VTS by learning activity functions

Now we consider a scenario where the activity functions for some of the molecules are missing. The activity functions are Boolean functions whose inputs are the presence bits of the molecules. First, we choose a class of formulas for the candidate functions. We encode the candidates in a formula with parameters. By assigning different values to the parameters, a solver may select different candidates for the activity functions. We will illustrate only one class of formulas. However, we support other classes of formulas, for example $k$-CNF.

In the following, the formula NNFTemplate encodes a set of negation normal form (NNF) functions that take the molecule presence bits as input and contain a fixed number of literals. We use Gate to encode a gate that takes an integer parameter to select among the various gates. We use Leaf to encode the literal at some position. Both are stitched together to define NNFTemplate. To encode the set of NNF formulas with the given number of literals, the template has finite-range integer variables as parameters.
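A parameterized template of this kind can be sketched in a few lines. The example below is an assumption-laden illustration rather than the tool's NNFTemplate: it fixes a balanced tree of 3 gates over 4 leaves, lets each gate parameter choose between AND and OR, lets each leaf parameter choose an input and its polarity, and replaces the solver with plain enumeration of parameter values. The function names are hypothetical.

```python
from itertools import product

def leaf(p, inputs):
    """Leaf parameter p selects input p // 2, negated when p is odd."""
    value = inputs[p // 2]
    return (not value) if p % 2 else value

def gate(p, left, right):
    """Gate parameter p selects AND (0) or OR (1)."""
    return (left and right) if p == 0 else (left or right)

def nnf_template(params, inputs):
    """Balanced template: gate0(gate1(leaf0, leaf1), gate2(leaf2, leaf3))."""
    g0, g1, g2, l0, l1, l2, l3 = params
    return gate(g0,
                gate(g1, leaf(l0, inputs), leaf(l1, inputs)),
                gate(g2, leaf(l2, inputs), leaf(l3, inputs)))

def synthesize(examples, n_inputs):
    """Enumerate parameter values until the template agrees with all examples."""
    gate_ranges = [range(2)] * 3
    leaf_ranges = [range(2 * n_inputs)] * 4
    for params in product(*gate_ranges, *leaf_ranges):
        if all(nnf_template(params, x) == y for x, y in examples):
            return params
    return None

# Input/output examples for exclusive-or over two inputs.
xor_examples = [((False, False), False), ((False, True), True),
                ((True, False), True), ((True, True), False)]
params = synthesize(xor_examples, 2)
print(params)
```

A solver-based version would treat the seven parameters as finite-range integer variables and let the solver pick values that are consistent with the rest of the VTS constraints, instead of matching explicit examples.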

Using the template, we define the constraints that encode the candidate functions that satisfy the activity requirements. The constraints take a vector of parameters for each molecule and a bound that limits the size of the candidate functions. We fix all other aspects of the VTS via the constraint FixedForFunctions.

We construct the synthesis formula similarly to the earlier variations. By reading off the values of the parameters in a satisfying model of the formula, we learn the synthesized functions.

3.5 Fixing VTS by both adding/deleting parts

Now we will consider repairing a VTS by allowing not only addition but also deletion of molecules, edges, functions, or pairing matrix entries. We have encoded the repair in our tool by introducing a flip bit for each variable that is modifiable in the VTS. We illustrate the repair on one class of variables; the rest can be handled in the same way. Let us consider the repair of node labels. For each node-label bit, we create a flip bit. We add constraints that set the node-label variable to the xor of the value given in the input VTS and the flip bit. We also limit the number of flip bits that can be true, thereby limiting the number of flips. The above constraints are encoded in a single repair constraint.

Similar to the earlier variations, we construct the synthesis formula for the repair. In it, FixedForNodeRepair encodes all the parts of the VTS that do not change.

A satisfying model of this formula will assign some flip bits to true. From these assignments we learn the needed modifications to the VTS.
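The flip-bit idea can be illustrated on node labels alone. In this sketch the given labels are a hypothetical dictionary from (node, molecule) pairs to Booleans, the repaired value is the xor of the given value and the flip bit, and a toy predicate stands in for the full property. Enumeration with a growing flip budget replaces the solver call.

```python
from itertools import combinations

def repaired_labels(given, flips):
    """XOR each given node-label bit with its flip bit."""
    return {key: bit != (key in flips) for key, bit in given.items()}

def repair(given, prop, max_flips):
    """Find a smallest set of flips (additions or deletions) that makes prop hold."""
    keys = sorted(given)
    for count in range(max_flips + 1):          # limit on the number of flips
        for flips in combinations(keys, count):
            candidate = repaired_labels(given, set(flips))
            if prop(candidate):
                return set(flips), candidate
    return None

# Hypothetical partial knowledge: (node, molecule) -> present?
given = {(0, "a"): True, (1, "a"): False, (1, "b"): True}

# Toy property: "a" is on both nodes and "b" is not on node 1.
def toy_property(labels):
    return labels[(0, "a")] and labels[(1, "a")] and not labels[(1, "b")]

flips, fixed = repair(given, toy_property, 3)
print(sorted(flips))
```

A flip on a bit that was true is a deletion and a flip on a bit that was false is an addition, so one mechanism covers both kinds of change.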

4 Implementation and Experiments

Table (a): DepQBF. Time is in seconds.
VTS | Add edge: Time | Add edge: #C | Add molecules: Time | Add molecules: #C | Learning NNF (only AND and OR): Time | Learning NNF: #C | Learning k-CNF: Time | Learning k-CNF: #C | Add/Delete parts: Time | Add/Delete parts: #C
plos1-dia[3C] | 0.326 | | 0.312 | | 0.669 | | 0.966 | | 0.277 | -1 E, -1 AE, -1 AN. +1 E, +1 N.
plos2-dia[4C] | 0.266 | 0 | 0.322 | 0 | 1.409 | 0 | 2.114 | 0 | 0.337 | 0
sub-mammal[3C] | 0.767 | 1 E | 1.049 | 5 PE | 3.523 | 1 E | 4.961 | 1 E | 1.172 | -1 E, -2 PE, -1 AN. +1 E, +4 PE, +4 N, +2 AN, +2 AE.
node4[3C] | 1.554 | 1 E | 3.859 | 12 PE | 5.286 | | 4.502 | | 2.194 | -2 E, -2 PE, -1 N, -1 AN, -1 AE. +12 N, +8 E, +1 PE.
yeast-graph[3C] | 95.016 | 2 E | timeout | N/A | 1571.42 | 2 E | 530.210 | 2 E | 72.316 | -1 E, -1 N, -1 AE, -1 AN, -1 PE. +2 E, 7 PE, 8 N.
mammal-graph[3C] | timeout | N/A | timeout | N/A | timeout | N/A | timeout | N/A | timeout | N/A

Table (b): Z3. Time is in seconds.
VTS | Add edge: Time | Add edge: #C | Add molecules: Time | Add molecules: #C | Learning NNF (only AND and OR): Time | Learning NNF: #C | Learning k-CNF: Time | Learning k-CNF: #C | Add/Delete parts: Time | Add/Delete parts: #C
plos1-dia | 0.041 | | 0.320 | | 0.225 | | 0.33 | | 3.74 | -1 E, -1 PE, -1 N, -1 PE. +1 AE, +1 PE, +1 N
plos2-dia | 3.97 | 0 | 2.647 | 0 | 5.941 | 0 | 5.680 | 0 | 3.56 | 0
sub-mammal | 3.483 | 1 E | 4.379 | 5 PE | 29.980 | 1 E | 10.405 | 1 E | 3.650 | -1 E, -2 PE, -1 AN. +1 E, +4 PE, +4 N, +2 AN, +2 AE
node4 | 4.150 | 1 E | 10.562 | 12 PE | 3.401 | | 4.760 | | 5.05 | -2 E, -2 PE, -1 N, -1 AN, -1 AE. +12 N, +8 E, +1 PE
yeast-graph | 40.225 | 2 E | timeout | N/A | 1393.84 | 2 E | 468.161 | 2 E | 69.81 | -1 E, -1 N, -1 AE, -1 AN, -1 PE. +2 E, 7 PE, 8 N.
mammal-graph | timeout | N/A | timeout | N/A | timeout | N/A | timeout | N/A | timeout | N/A
Figure 1: Run times for synthesis queries. #C stands for the minimum changes in the synthesized VTS in comparison with the given partial VTS. Time is reported in seconds. (a) The solver used is DepQBF. (b) The solver used is Z3. The sub-mammal is a subgraph of the complete mammal-graph. In the Add/Delete parts column, '+n' shows the addition of n items of the listed kind and '-n' shows the removal of n items. In the table, N is node labels, AN is active node molecules, E is edges, PE is molecule presence on the edge, and AE is active molecules on the edge. The suffix [kC] stands for k-connectedness of the graph, which is part of only the DepQBF experiments.

We have implemented the encodings in a tool called VTSSynth (https://github.com/arey0pushpa/pyZ3). The tool takes a partially defined VTS as input in a custom-designed input language. The input is then converted to the constraints over the VTS. The tool can synthesize not only the queries discussed above, but also their combinations. For example, our tool can modify labels of nodes or edges while learning activity functions. Our tool is developed in C++ and uses the Z3 [11] infrastructure for processing formulas. Since some of the formulas involve alternation of quantifiers over Boolean variables, Z3 is not a suitable choice for solving those examples. We translate the formulas created with Z3 into the standard QDIMACS [12] format and use them as input for QBF solvers. We use DepQBF [13] for solving the QBF formulas. Our tool includes about 7000 lines of code.
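For readers unfamiliar with QDIMACS, the target format is a small extension of DIMACS CNF with a quantifier prefix. The sketch below writes a prenex CNF formula in that layout; it is not the tool's translator, and the function name `to_qdimacs` and its input representation are assumptions made for this example.

```python
def to_qdimacs(prefix, clauses):
    """Render a prenex CNF formula as QDIMACS text.

    prefix:  list of (quantifier, variables), quantifier being 'e' or 'a',
             ordered from the outermost block to the innermost block.
    clauses: list of clauses, each a list of non-zero integer literals.
    """
    all_vars = [abs(lit) for clause in clauses for lit in clause]
    all_vars += [v for _, variables in prefix for v in variables]
    num_vars = max(all_vars, default=0)
    lines = [f"p cnf {num_vars} {len(clauses)}"]
    for quantifier, variables in prefix:
        lines.append(f"{quantifier} " + " ".join(map(str, variables)) + " 0")
    for clause in clauses:
        lines.append(" ".join(map(str, clause)) + " 0")
    return "\n".join(lines) + "\n"

# exists x1 . forall x2 . exists x3 . (x1 or x2 or not x3) and (not x2 or x3)
text = to_qdimacs([("e", [1]), ("a", [2]), ("e", [3])],
                  [[1, 2, -3], [-2, 3]])
print(text, end="")
```

In the synthesis queries, the existential outer block would hold the VTS variables to be synthesized and the universal block would hold the edge-failure variables, followed by an inner existential block for auxiliary variables such as reachability bits.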

We have applied VTSSynth to six partially defined VTSs. The results are presented in Figure 1 for both solvers, DepQBF and Z3. To use Z3, we remove the Connected constraints so that the queries become quantifier-free. The experiments were done on a machine with an Intel(R) Core(TM) i3-4030U CPU @ 1.90GHz processor and 4GB RAM, with a 30 min (1800 sec) timeout. The first four VTSs are synthetic but inspired by typical motifs in VTSs from the literature. The third VTS is a subgraph of the last VTS. The fifth VTS is taken from [14]. The last VTS represents the mammalian SNARE map, created by studying the literature references.

The table shows timings for various synthesis queries. For each synthesis query, we have two columns. One column reports the timing and the other reports the minimum changes needed to obtain a valid VTS. An entry with no count in the #C column indicates that any number of changes within the search space of the synthesis query can obtain the VTS. In the table, we report five synthesis queries. The first one only adds new labelled edges to the graph. We rank all possible graph edits by the number of updates, preferring the minimum. The second query adds new labels to the edges. The third query synthesizes NNF Boolean functions containing only AND and OR gates for the activity functions, while allowing more edges to be added. The reported results use the basic template of 4 leaves and 3 gates. To illustrate the versatility of our tool, the fourth query synthesizes $k$-CNF functions (encoding not presented). Finally, we report queries that allow both addition and deletion of edges and of labels of nodes and edges.

5 Related Work

In recent years, a wide range of methods has been developed for similar synthesis problems [15, 16, 17]. They range from filling gaps in an implementation of a C program from a pool of template predicates to learning a program from example runs of the program. In the course of developing such methods, the background technology, i.e., the solving of quantified constraints, has been evolving rapidly [13, 18].

There has been some work on applying synthesis techniques in biology, especially for gene regulatory networks [19, 20]. A recent work [20] synthesizes executable gene regulatory networks from single-cell gene expression data. Synthesis techniques are also used for the optimal synthesis of chemical reaction networks [21]. The work in [20] uses constraint (satisfiability) solving techniques for the synthesis, whereas [19] uses SMT. The paper [21], in addition to using SMT over ODEs, uses a template-guided approach. In our case the queries contain quantifiers, so we employ QBF solving for the synthesis problem, using Z3 to construct the formulas and DepQBF to solve them. To the best of our knowledge, this is the first application of synthesis to VTSs.

6 Conclusion

In this paper, we presented encodings of the synthesis problems that may arise from VTSs. We demonstrated that our tool based on the encodings scales up to the relevant sizes of VTSs for some synthesis queries. Our tool timed out on larger examples. We are working to improve the performance of our tool. We will take this tool to biologists and develop wet experiments that may validate some synthesis results from the tool. Our model of VTSs uses static graphs. In the future, we will study the dynamic behaviors of VTSs. This will allow us to predict behaviors after perturbations in the VTSs and will give us more ways to test the predicted synthesis results.

References

  • [1] Bruce Alberts, Dennis Bray, Karen Hopkin, Alexander Johnson, Julian Lewis, Martin Raff, Keith Roberts, and Peter Walter. Essential cell biology. Garland Science, 2013.
  • [2] Juan S Bonifacino and Benjamin S Glick. The mechanisms of vesicle budding and fusion. cell, 116(2):153–166, 2004.
  • [3] WanJin Hong and Sima Lev. Tethering the assembly of snare complexes. Trends in cell biology, 24(1):35–43, 2014.
  • [4] Natali L Chanaday and Ege T Kavalali. How do you recognize and reconstitute a synaptic vesicle after fusion? F1000Research, 6, 2017.
  • [5] Massimo D’Agostino, Herre Jelger Risselada, Anna Lürick, Christian Ungermann, and Andreas Mayer. A tethering complex drives the terminal stage of snare-dependent membrane fusion. Nature, 551(7682):634, 2017.
  • [6] Fiona R Rodepeter, Susanne Wiegand, Hans-Georg Lüers, Gabriel A Bonaterra, Anson W Lowe, Michael Bette, Ralf Jacob, and Robert Mandic. Indication for differential sorting of the rat v-snare splice isoforms vamp-1a and-1b. Biochemistry and Cell Biology, 95(4):500–509, 2017.
  • [7] Yani Zhao, Benjamin T Holmgren, and Andrea Hinas. The conserved snare sec-22 localizes to late endosomes and negatively regulates rna interference in caenorhabditis elegans. RNA, 23(3):297–307, 2017.
  • [8] Reinhard Jahn and Richard H Scheller. Snares–engines for membrane fusion. Nature reviews. Molecular cell biology, 7(9):631, 2006.
  • [9] Ankit Shukla, Arnab Bhattacharyya, Lakshmanan Kuppusamy, Mandayam Srivas, and Mukund Thattai. Discovering vesicle traffic network constraints by model checking. PloS one, 12(7):e0180692, 2017.
  • [10] Ashutosh Gupta, Ankit Shukla, Mandyam Srivas, and Mukund Thattai. Smt solving for vesicle traffic systems in cells. In SASB, 2017.
  • [11] Leonardo de Moura and Nikolaj Bjorner. Z3: An efficient smt solver. In TACAS, volume 4963 of LNCS, pages 337–340. Springer Berlin Heidelberg, 2008.
  • [12] Qbflib.org. QDIMACS standard ver. 1.1., 2018.
  • [13] Florian Lonsing and Armin Biere. Depqbf: A dependency-aware qbf solver. Journal on Satisfiability, Boolean Modeling and Computation, 7:71–76, 2010.
  • [14] Lena Burri and Trevor Lithgow. A complete set of snares in yeast. Traffic, 5(1):45–52, 2004.
  • [15] Armando Solar-Lezama. The sketching approach to program synthesis. In Programming Languages and Systems, 7th Asian Symposium, APLAS 2009, Seoul, Korea, December 14-16, 2009. Proceedings, pages 4–13. Springer, 2009.
  • [16] Rajeev Alur, Rastislav Bodík, Garvit Juniwal, Milo M. K. Martin, Mukund Raghothaman, Sanjit A. Seshia, Rishabh Singh, Armando Solar-Lezama, Emina Torlak, and Abhishek Udupa. Syntax-guided synthesis. In Formal Methods in Computer-Aided Design, FMCAD 2013, Portland, OR, USA, October 20-23, 2013, pages 1–8, 2013.
  • [17] Sumit Gulwani. Automating string processing in spreadsheets using input-output examples. In Proceedings of the 38th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, POPL 2011, Austin, TX, USA, January 26-28, 2011, pages 317–330. ACM, 2011.
  • [18] Leonardo Mendonça de Moura and Nikolaj Bjørner. Efficient e-matching for SMT solvers. In Automated Deduction - CADE-21, 21st International Conference on Automated Deduction, Bremen, Germany, July 17-20, 2007, Proceedings, volume 4603, pages 183–198. Springer, 2007.
  • [19] Yoli Shavit, Boyan Yordanov, Sara-Jane Dunn, Christoph M Wintersteiger, Tomoki Otani, Youssef Hamadi, Frederick J Livesey, and Hillel Kugler. Automated synthesis and analysis of switching gene regulatory networks. Biosystems, 146:26–34, 2016.
  • [20] Jasmin Fisher, Ali Sinan Köksal, Nir Piterman, and Steven Woodhouse. Synthesising executable gene regulatory networks from single-cell gene expression data. In International Conference on Computer Aided Verification, pages 544–560. Springer, 2015.
  • [21] Luca Cardelli, Milan Češka, Martin Fränzle, Marta Kwiatkowska, Luca Laurenti, Nicola Paoletti, and Max Whitby. Syntax-guided optimal synthesis for chemical reaction networks. In International Conference on Computer Aided Verification, pages 375–395. Springer, 2017.
  • [22] William Wickner and Randy Schekman. Membrane fusion. Nature Structural and Molecular Biology, 15(n7):658, 2008.
  • [23] James E Rothman. The machinery and principles of vesicle transport in the cell. Nature medicine, 8(10):1059–1063, 2002.
  • [24] Geert Van Den Bogaart, Matthew G Holt, Gertrude Bunt, Dietmar Riedel, Fred S Wouters, and Reinhard Jahn. One snare complex is sufficient for membrane fusion. Nature Structural and Molecular Biology, 17(3):358, 2010.
  • [25] Wanjin Hong. Snares and traffic. Biochimica et Biophysica Acta (BBA)-Molecular Cell Research, 1744(2):120–144, 2005.

Appendix 0.A Discussion on the choice of VTS model

The molecules transported by the VTS are themselves its regulators. The molecules in a compartment/vesicle may be active or inactive. The molecules that are responsible for vesicle fusion are called SNARE proteins [8, 22]. Active SNAREs present on vesicles (v-SNAREs) bind with their cognate active SNAREs on the target compartment (t-SNAREs) to enable vesicle fusion. A cell contains multiple kinds of v- and t-SNAREs. Only specific pairs of v- and t-SNAREs can bind to each other and participate in fusion. Fusion-compatible v- and t-SNAREs are determined by biological experiments. Different vesicle-compartment fusions in the cell are brought about by different v- and t-SNARE pairs. A molecule that participates in a given fusion reaction must not interfere with fusion at different compartments or vesicles. Therefore, SNAREs must be kept in an inactive form in the appropriate compartments/vesicles. The activity of molecules is regulated by the other molecules, i.e., the presence and absence of the other molecules in a compartment or vesicle may make the molecule active or inactive. We call this regulation activity functions. In the VTS model, we assume that the system is in steady state and the concentrations of the molecules in the compartments do not change over time. We define SNARE pairing specificity by a fusion pairing relation containing pairs of SNAREs, and molecular regulation by Boolean activity functions. Since the system is in steady state, we expect that any molecule that leaves a compartment must come back via some path on the graph. We call this property of a VTS stability.

Our model is inspired by [9]. On timescales of minutes, the following assumptions reasonably capture the important aspects of the Rothman-Schekman-Sudhof (RSS) model [23] of the vesicle traffic system.

  1. A cell is a set of compartments exchanging vesicles.

  2. Compartments are neither created nor destroyed.

  3. Each compartment is in steady state: gain and loss balance.

  4. Molecules are neither created nor destroyed.

  5. Molecules move via vesicles of uniform size.

  6. Identical vesicles have identical target compartments.

  7. Fusion of vesicles to compartments is driven by specific SNARE pairing.

  8. The activity of a SNARE can be regulated by other molecules present on the same compartment or vesicle.

  9. An active SNARE pair is necessary and sufficient for fusion.
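Assumptions 7 to 9 can be sketched as a simple fusion test. This is a minimal illustration under stated simplifications: each activity function is modelled as a Python predicate over the set of molecules present, molecules without an activity function are treated as always active, and the pairing relation is a set of (v-SNARE, t-SNARE) pairs. The names `active_molecules` and `can_fuse` are hypothetical and do not come from the paper.

```python
def active_molecules(present, activity):
    """Return the molecules in `present` whose activity function holds.

    present:  set of molecule names on one compartment or vesicle.
    activity: dict mapping a molecule name to a predicate over the set
              of present molecules. Molecules with no entry are treated
              as always active (a simplifying assumption).
    """
    present = frozenset(present)
    return {
        m for m in present
        if activity.get(m, lambda _present: True)(present)
    }


def can_fuse(vesicle, compartment, pairing, activity):
    """Fusion happens iff some active v-SNARE on the vesicle is paired
    with some active t-SNARE on the compartment (assumption 9)."""
    v_active = active_molecules(vesicle, activity)
    t_active = active_molecules(compartment, activity)
    return any((v, t) in pairing for v in v_active for t in t_active)
```

With a pairing relation `{("R1", "Q1")}` and an activity function that switches `R1` off whenever a hypothetical inhibitor `X` is present, a vesicle carrying only `R1` can fuse with a compartment carrying `Q1`, while a vesicle carrying both `R1` and `X` cannot.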

SNARE proteins are the agents of vesicle fusion in eukaryotic cells. When SNAREs on vesicles (v-SNAREs) encounter their cognate SNAREs on target compartments (t-SNAREs), they form SNARE complexes [8], and a single SNARE complex releases enough energy to enable membrane fusion [24]. SNAREs are identified by the presence of a conserved stretch of 60-70 amino acids called the SNARE motif. Based on amino acid sequence, SNARE motifs fall into four classes: Qa, Qb, Qc, and R [8]. Across all intracellular vesicle fusion reactions, the associated SNARE complexes contain one of each of the four kinds of SNARE motifs. The v-SNARE contributes a single SNARE motif, usually an R-SNARE (although exceptions are known: Sec22b and Ykt6 are both R-SNAREs that form parts of t-SNAREs [25]), and the remaining three SNARE motifs are contributed by the t-SNARE. In the cell, different vesicle fusion reactions are associated with distinct v- and t-SNARE pairs.

The paper [9] considers the three Q-SNAREs as a single molecule; we extend this model by treating each molecule of the complex as distinct. In contrast to [9], we allow the distribution of Q- and R-SNARE types across the whole system to be uneven. In our model, fusion is driven by an active combination of three Q-SNARE molecules and one R-SNARE molecule. We have relaxed the pairing matrix constraint to comply with this fact. For reasons of biological efficiency and optimality, we do not allow self-edges in the VTS.
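The two structural conditions in this paragraph can be written down directly. The sketch below assumes that "three Q-SNAREs and one R-SNARE" means exactly one motif from each of the classes Qa, Qb, Qc, and R, and that the class of each molecule is supplied explicitly rather than parsed from its name. The function names are illustrative only.

```python
from collections import Counter

# The four SNARE motif classes named in the text.
REQUIRED_CLASSES = Counter(["Qa", "Qb", "Qc", "R"])


def is_fusion_complex(motif_classes):
    """True iff the given motif classes are exactly one Qa, one Qb,
    one Qc, and one R (the assumed reading of '3 Q + 1 R')."""
    return Counter(motif_classes) == REQUIRED_CLASSES


def has_self_edge(edges):
    """True iff some vesicle edge starts and ends at the same
    compartment, which the model disallows.

    edges: iterable of (source, target, molecules) triples.
    """
    return any(src == dst for src, dst, _mols in edges)
```

A candidate graph would then be rejected if `has_self_edge` returns `True`, and a candidate SNARE combination would count as fusion-capable only if `is_fusion_complex` holds for the classes of its active members.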

Appendix 0.B The Natural VTSs

Here we present the two VTSs collected from the literature.

0.B.1 Mammalian VTS

[Figure 2 (graph not reproduced). Compartment labels shown in the figure: GA: Q̂a4 Q̂a6 Q̂b4 Q̂b6 Q̂c4 R̂6; IC: Q̂a6 Q̂b6 R̂1; ER: Q̂a1 Q̂b1 R̂1; PM: Q̂a5 Q̂a7 Q̂bc2 Q̂bc7; EE: Q̂a2 Q̂b2/3 Q̂c2/3; LE: Q̂a8 Q̂b8 Q̂c8.]

Figure 2: A found-in-nature VTS. Nodes and edges are labelled with sets of molecules. ^ indicates that the molecule is active.

Figure 2 represents the mammalian SNARE map, created by surveying a wide array of literature. To construct the map, we have assumed that vesicles contain only a single active v-SNARE, and we have attributed t-SNAREs and inactive v-SNAREs that travel between compartments to one of the known vesicles that go between the same source and target compartments. To identify the active SNARE complex involved in any particular vesicle fusion, we used two criteria. First, the SNARE complex is formed in vivo; in most papers, this is determined by immunoprecipitation of the SNARE complex from the relevant cell fraction. Second, blocking SNARE complex formation (for example, using antibodies against these SNAREs, or using cytosolic forms of these SNAREs) blocks the specific transport step. Note that these vesicles have been collected from multiple cell types, and any given cell type is likely to contain only a subset of the vesicles in the map.

In this figure, the rectangles represent compartments, and the identity of each compartment is written within it: ER = endoplasmic reticulum, ERGIC = ER-Golgi intermediate compartment, RE = recycling endosome, EE = early endosome, LE = late endosome, LYS = lysosome, PM = plasma membrane. The arrows represent vesicle edges.

0.B.2 Yeast VTS

[Figure 3 (graph not reproduced). Compartment labels shown in the figure: GOLGI: Q̂a2 Q̂b2 Q̂c2 Q̂a5 Q̂b5 Q̂c5; PRE-VAC: Q̂a3 Q̂b2 R̂3 R̂4; VAC: Q̂a4 Q̂b2 R̂3 R̂4; PM: Q̂a1 Q̂bc1; PM: Q̂a6 Q̂b6 R̂6 R̂4.]

Figure 3: Yeast VTS

In Figure 3, we present the yeast VTS, which we have borrowed from [14] and adapted by separating the v- and t-SNAREs. It is clearly an incomplete description of the VTS; for example, the inactive molecules were not reported in the reference. We are currently searching for more literature that can help us complete the known information about the VTS.