Detailed information about interactions between molecules in biological processes can be an important ingredient for methods that aim to predict cellular behavior in response to perturbations and for deriving mechanistic models of disease processes Khatri et al. (2012). Typical network models in the biological literature are based on small-scale targeted molecular experiments; while they have led to a significant corpus of molecular biology knowledge, they are often incomplete and/or not quantitatively predictive. New modeling opportunities arise from modern perturbation screening and profiling techniques, which have been used to generate rich biological datasets. These have greatly improved the potential for using machine learning to build much more comprehensive predictive interaction models, according to what one might call the perturbation-interaction-response paradigm.
Standard computational methods, however, are typically insufficient for accurate and efficient network inference. Static models, e.g., those using partial correlation, maximum entropy, or mutual information, do not use time-resolved information Locasale and Wolf-Yadlin (2009); Chan et al. (2017); Zou and Conzen (2005); Hill et al. (2017). Deep learning models, although usually providing good predictive ability, often lack the interpretability needed to help understand the system Zhou and Troyanskaya (2015); Montavon et al. (2018). Recent work on interpretable neural differential equations by Rackauckas et al. (2020) combines the idea of neural ODEs Chen et al. (2018) with physics-based models, providing a potential bridge between deep learning and biological network inference for scientific machine learning.
We have previously developed a hybrid approach, CellBox, that combines explicit ODE models of cellular interactions with an automatic differentiation framework implemented in TensorFlow (Yuan et al. (2020)). While successful for a specific cancer cell line, CellBox had several key limitations: (1) it was tested in only one specific system; (2) the model was trained only on steady-state data; (3) numerical ODE solving was subject to potential truncation errors due to the use of non-stiff solvers on stiff biological systems; and (4) it used an arbitrarily fixed time step, because the computational graph in TensorFlow is static.
Here, we re-implemented the CellBox algorithm in Julia with the aim of moving beyond these limitations. We replaced backpropagation through time with adjoint sensitivity methods Fröhlich et al. (2017) and replaced Heun's ODE solver with fixed time steps with high-order ODE solvers with adaptive time steps. The new implementation therefore allows adjoint-based optimization over full trajectory simulations. Using simulated data from both abstract and biology-inspired networks, we demonstrate that the new implementation can accurately predict the dynamic time course of cell response to unseen perturbations. Importantly, this method can efficiently infer molecular interaction parameters with high accuracy for various network structures. Taken together, these results strongly suggest the broad applicability of CellBox to a diversity of systems.
2.1 CellBox model
$$\frac{dx(t)}{dt} = \epsilon \odot \tanh\big(W x(t) + u\big) - \alpha \odot x(t) \qquad (1)$$

The $n$-dimensional vector $x(t)$ represents the change in molecular or phenotypic measurement, e.g., log ratios of molecular concentrations after and before perturbation $u$, and $n$ is the number of molecule types. The initial condition is $x(0) = 0$. The matrix $W \in \mathbb{R}^{n \times n}$ quantifies the directional interactions among molecules, where each entry $w_{ij}$ denotes the interaction from network node $j$ to node $i$. The scaling factor $\alpha$ quantifies the strength of the natural decay term $-\alpha \odot x(t)$, which captures the tendency of an entity to return to its steady-state level before perturbation. A hyperbolic tangent envelope function, scaled by $\epsilon$, introduces a saturation effect in the interaction term. $\odot$ denotes element-wise multiplication.
By integrating Eq. 1 using an ODE solver, we can predict the time evolution of $x(t)$ after a given perturbation $u$. CellBox learns these system parameters from experimental data: given input data, CellBox optimizes $W$, $\alpha$, and $\epsilon$ by minimizing the mismatch between model predictions and experimental measurements. Here, instead of experimental data, we provide CellBox with time-series pseudo-experimental data generated from a given network model assumed to be the ground truth. Multiple time points are sampled and noise is added for different perturbation conditions, which collectively form the training data. Once trained, the models can predict the responses of the system to unseen perturbations.
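As an illustration, the forward simulation of Eq. 1 under a perturbation can be sketched in a few lines. The paper's implementation is in Julia (DifferentialEquations.jl), so this Python/SciPy version is only a minimal stand-in: SciPy's adaptive RK45 takes the place of Tsit5(), and the network values below are illustrative placeholders, not learned parameters.

```python
# Minimal sketch of forward-simulating Eq. 1 under a perturbation u.
# SciPy's adaptive RK45 stands in for Julia's Tsit5(); W, alpha, eps, and u
# are illustrative placeholders, not values from the paper.
import numpy as np
from scipy.integrate import solve_ivp

n = 3                                        # number of molecule types
W = np.array([[0.0, -1.0, 0.0],              # interaction matrix (w_ij: j -> i)
              [1.0,  0.0, 0.0],
              [0.0,  1.0, 0.0]])
alpha = np.ones(n)                           # decay strengths
eps = np.ones(n)                             # interaction strengths
u = np.array([1.0, 0.0, 0.0])                # perturb node 0 only

def rhs(t, x):
    """Right-hand side of Eq. 1: eps * tanh(W x + u) - alpha * x."""
    return eps * np.tanh(W @ x + u) - alpha * x

t_eval = np.linspace(0.0, 10.0, 5)           # sample 5 time points
sol = solve_ivp(rhs, (0.0, 10.0), y0=np.zeros(n),
                method="RK45", rtol=1e-8, atol=1e-8, t_eval=t_eval)
print(sol.y.shape)                           # (n, 5) response trajectory
```

Sampling such trajectories for many perturbation vectors, and adding noise, yields the pseudo-experimental training data described above.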
2.2 Adjoint sensitivity methods in dynamic system optimization
The dimension $n$ of $x$ can be on the order of hundreds to thousands (human cells have about 20,000 genes in total), so models can have on the order of $n^2$ interaction parameters, leading to computational challenges for the parameter inference algorithm. Recently developed neural ODEs Chen et al. (2018); Rubanova et al. (2019) have shown promise in efficiently learning complex dynamic models from time-series data. One can view Eq. 1 as a neural network with a single hidden layer whose input and output dimensions are both equal to $n$. We have re-implemented CellBox to learn the model parameters in the framework of neural ODEs, denoting by $\hat{x}(t)$ the model prediction obtained by solving Eq. 1 with the current parameters $W$, $\alpha$, and $\epsilon$.
We use the mean absolute error (MAE) as the loss function, i.e.,

$$L(\hat{x}, x) = \frac{1}{N}\sum_{i=1}^{N} \left|\hat{x}_i - x_i\right|$$
where $\hat{x}_i$ and $x_i$ correspond to simulated and measured molecular profiles, respectively. In this work, we use the Tsitouras 5/4 Runge-Kutta method as the ODE solver, implemented as Tsit5() in the Julia package DifferentialEquations.jl developed by Rackauckas and Nie (2017). The first-order optimizer Adam, proposed in Kingma and Ba (2014) and implemented in Flux.jl by Innes (2018), is adopted with a default learning rate of 0.001, together with learning rate annealing. We employ weight decay to encourage sparsity in the interaction matrix (which is expected given our understanding of interactions in biological networks), and the value of the weight decay is treated as a hyper-parameter. To accelerate training, we use a specific mini-batching scheme: we sample time points from each perturbation condition as a single batch and iteratively loop over all perturbation conditions. Unlike in regular deep learning tasks, in addition to tracking the loss function, we also monitor the convergence of the model parameters by inspecting the Pearson correlation between the learned interaction weights and the ground truth network connections, reported as inference accuracy.
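The optimization loop described above can be sketched as follows. This is a toy PyTorch stand-in for the Flux.jl/DifferentialEquations.jl pipeline: a fixed-step RK4 replaces the adjoint Tsit5() solve, an L1 penalty on $W$ plays the role of the sparsity-encouraging weight decay, and the network size, ground-truth values, and learning rate are illustrative choices, not the paper's settings.

```python
# Sketch of the training loop: Adam + MAE loss, an L1 penalty on W as the
# sparsity-encouraging weight decay, and one perturbation condition per
# mini-batch. Fixed-step RK4 stands in for the adjoint Tsit5() solve.
import torch

n, steps, dt, lam = 3, 30, 0.2, 1e-4
torch.manual_seed(0)
alpha, eps = torch.ones(n), torch.ones(n)

def simulate(W, u):
    """Integrate Eq. 1 with fixed-step RK4; returns a (steps, n) trajectory."""
    def f(x):
        return eps * torch.tanh(W @ x + u) - alpha * x
    x, traj = torch.zeros(n), []
    for _ in range(steps):
        k1 = f(x); k2 = f(x + dt / 2 * k1)
        k3 = f(x + dt / 2 * k2); k4 = f(x + dt * k3)
        x = x + dt / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
        traj.append(x)
    return torch.stack(traj)

# Pseudo-experimental data from a hypothetical ground-truth network.
W_true = torch.tensor([[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
perts = [torch.eye(n)[i] for i in range(n)]      # single-node perturbations
with torch.no_grad():
    targets = [simulate(W_true, u) for u in perts]

W = torch.zeros(n, n, requires_grad=True)
opt = torch.optim.Adam([W], lr=0.05)             # toy rate; the paper uses 0.001
for epoch in range(200):
    for u, y in zip(perts, targets):             # one condition per mini-batch
        opt.zero_grad()
        loss = (simulate(W, u) - y).abs().mean() + lam * W.abs().sum()
        loss.backward()
        opt.step()
```

Alongside the loss, one would also track inference accuracy here, e.g., the Pearson correlation between the flattened entries of the learned `W` and `W_true`.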
3.1 Abstract synthetic systems
We first conduct proof-of-concept experiments by applying the CellBox algorithm to several typical abstract networks presented in Pratapa et al. (2020), including linear (LI), cyclic (CY), and bifurcation (BI) networks. We artificially introduce perturbations into the systems and integrate the ODEs to simulate pseudo-experimental response trajectories, from which samples are drawn at multiple time points and used in training. To mimic realistic experimental setups, each synthetic perturbation condition is constrained such that no more than 2 nodes are perturbed simultaneously and at most 5 time points are sampled. We discuss the effect of the number of perturbed nodes and the number of time points on model performance in Section 4.
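These constraints on the synthetic conditions can be encoded directly when generating the perturbation set. The following is a minimal sketch; the node count, number of conditions, and perturbation magnitudes are illustrative placeholders, not the paper's settings.

```python
# Sketch of sampling synthetic perturbation conditions under the constraints
# stated above: at most 2 nodes perturbed per condition and at most 5 sampled
# time points. All sizes and magnitudes are illustrative placeholders.
import numpy as np

rng = np.random.default_rng(0)
n_nodes, n_conditions = 7, 10

conditions = []
for _ in range(n_conditions):
    k = int(rng.integers(1, 3))                         # perturb 1 or 2 nodes
    nodes = rng.choice(n_nodes, size=k, replace=False)
    u = np.zeros(n_nodes)
    u[nodes] = rng.uniform(0.5, 1.0, size=k)            # perturbation strengths
    t_sample = np.sort(rng.uniform(0.0, 10.0, size=5))  # 5 sampled time points
    conditions.append((u, t_sample))
```

Each `(u, t_sample)` pair then defines one simulated experiment: the system is integrated under perturbation `u` and observed at the times in `t_sample`.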
We evaluated both the predictive accuracy of cell responses to these perturbations and the accuracy of network inference. The model could be trained with as few as 10 perturbation conditions (Table 1). Our results indicated that, on these simulated biological systems, the CellBox method could train predictive models of system responses to external perturbations with low average MAE, and could capture the system structures by inferring the network interaction parameters with high average Pearson correlation. Overall, while different systems resulted in different efficiency of parameter inference, CellBox could be effectively trained for all four system types tested when provided with sufficient perturbation conditions. These results suggest the potential applicability of such methods to diverse biological systems.
3.2 Biology-inspired synthetic systems
We tested CellBox on synthetic analogs of four real biological systems adapted from Pratapa et al. (2020): mammalian cortical area development (mCAD), ventral spinal cord (VSC) development, hematopoietic stem cell (HSC) differentiation, and gonadal sex determination (GSD). These networks were human-curated from the research literature and have been repeatedly validated by experiments. We then used these biology-inspired synthetic system models for simulation and model training in the same way as for the abstract systems described above. Similar to the abstract systems, these models achieved comparable levels of prediction error (MAE) and inference accuracy (Pearson correlation) (Table 2), and accurately predicted post-perturbational responses in each of the four systems described in Pratapa et al. (2020).
4 Conclusion and discussion
Constructing data-driven models of molecular interactions and cellular behavior in biological systems is useful but challenging. Here we adapted the CellBox method Yuan et al. (2020); Shen et al. (2020), a hybrid of explicit dynamic models and machine learning optimization, using adjoint sensitivity methods in Julia. We tested performance on a diverse range of synthetic network systems and demonstrated that, by simulating perturbation-response data and using such data to construct interaction models, CellBox can be effectively trained to predict responses and reconstruct the ground truth interaction network. The overall performance across diverse systems, including several typical abstract networks and multiple biology-inspired networks, suggests that CellBox can potentially be applied to a wide variety of biological systems.
Another improvement over the first version of CellBox is the use of time-series data: the previous models used only single endpoint measurements, assuming the system had reached equilibrium. Time-series data are clearly more informative for understanding and simulating a system, as explored in Nyman et al. (2020). To explore this potential, we used simulated time-series data for training, and the models were then challenged to predict the full time course of the dynamic response to perturbations. This encouraged us to redesign the optimization procedure, e.g., by incorporating neural ODEs and adjoint sensitivity methods from Chen et al. (2018); Fröhlich et al. (2017), and to stress-test training efficiency and robustness, for which more careful fine-tuning was used to ensure stable simulation and modeling.
The data generation procedure used here does not systematically explore the minimal or optimal data requirements for successful training. Some of the models above can be effectively trained with even less data, e.g., measurements at only a single time point; it is therefore not surprising that some biological datasets that are sparse in the time dimension are still informative. We also tested scenarios where each synthetic condition has more than 2 nodes perturbed or more than 5 time points measured. As expected, more information benefits model training and results in better performance even with a smaller sample size. Taken together, these observations might help define the scope of perturbation experiments needed to accurately simulate and predict the responses of a particular system. In future work, exploring how experiment size and quality associate with successful training would be of tremendous help for experimental design and validation.
- Chan et al. (2017). Gene regulatory network inference from single-cell data using multivariate information measures. Cell Systems 5(3), pp. 251–267.e3.
- Chen et al. (2018). Neural ordinary differential equations. NeurIPS.
- Fröhlich et al. (2017). Scalable parameter estimation for genome-scale biochemical reaction networks. PLOS Computational Biology 13(1), pp. 1–18.
- Hill et al. (2017). Context specificity in causal signaling networks revealed by phosphoprotein profiling. Cell Systems 4(1), pp. 73–83.
- Innes (2018). Flux: elegant machine learning with Julia. Journal of Open Source Software 3(25), pp. 602.
- Khatri et al. (2012). Ten years of pathway analysis: current approaches and outstanding challenges. PLoS Computational Biology 8(2), pp. e1002375.
- Kingma and Ba (2014). Adam: a method for stochastic optimization. arXiv preprint arXiv:1412.6980.
- Locasale and Wolf-Yadlin (2009). Maximum entropy reconstructions of dynamic signaling networks from quantitative proteomics data. PLoS One 4(8), pp. e6522.
- Montavon et al. (2018). Methods for interpreting and understanding deep neural networks. Digital Signal Processing 73, pp. 1–15.
- Nyman et al. (2020). Perturbation biology links temporal protein changes to drug responses in a melanoma cell line. PLOS Computational Biology 16(7), pp. 1–26.
- Pratapa et al. (2020). Benchmarking algorithms for gene regulatory network inference from single-cell transcriptomic data. Nature Methods 17(2), pp. 147–154.
- Rackauckas et al. (2020). Universal differential equations for scientific machine learning. arXiv preprint.
- Rackauckas and Nie (2017). DifferentialEquations.jl – a performant and feature-rich ecosystem for solving differential equations in Julia. Journal of Open Research Software 5(1).
- Rubanova et al. (2019). Latent ODEs for irregularly-sampled time series. arXiv preprint arXiv:1907.03907.
- Shen et al. (2020). Constructing data-driven network models of cell dynamics with perturbation experiments. Learning Meaningful Representations of Life workshop at NeurIPS 2020.
- Yuan et al. (2020). CellBox: interpretable machine learning for perturbation biology with application to the design of cancer combination therapy. Cell Systems.
- Zhou and Troyanskaya (2015). Predicting effects of noncoding variants with deep learning-based sequence model. Nature Methods 12(10), pp. 931–934.
- Zou and Conzen (2005). A new dynamic Bayesian network (DBN) approach for identifying gene regulatory networks from time course microarray data. Bioinformatics 21(1).