
Biophysical models of cis-regulation as interpretable neural networks

The adoption of deep learning techniques in genomics has been hindered by the difficulty of mechanistically interpreting the models that these techniques produce. In recent years, a variety of post-hoc attribution methods have been proposed to address this neural network interpretability problem in the context of gene regulation. Here we describe a complementary approach. Our strategy is based on the observation that two large classes of biophysical models of cis-regulatory mechanisms can be expressed as deep neural networks in which nodes and weights have explicit physicochemical interpretations. We also demonstrate how such biophysical networks can be rapidly inferred, using modern deep learning frameworks, from the data produced by certain types of massively parallel reporter assays (MPRAs). These results suggest a scalable strategy for using MPRAs to systematically characterize the biophysical basis of gene regulation in a wide range of biological contexts. They also highlight gene regulation as a promising venue for developing scientifically interpretable approaches to deep learning.
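To make the idea of a biophysical model written as a network concrete, the following is a minimal sketch, not the authors' implementation. It assumes a toy equilibrium occupancy model in which a transcription factor (TF) and RNA polymerase (RNAP) each bind one site, energies are in units of kT, and the transcription rate is proportional to the probability that RNAP is bound. All names (`theta_tf`, `theta_rnap`, `g_int`, `t_sat`) and parameter values are hypothetical. Each binding energy is a linear layer acting on a one-hot encoded sequence, and the distribution over binding states is a softmax, so every weight can be read as an energy and every hidden node as a state probability.

```python
import math

BASES = "ACGT"

def one_hot(seq):
    """Encode a DNA sequence as a list of 4-element one-hot rows."""
    return [[1.0 if b == base else 0.0 for base in BASES] for b in seq]

def binding_energy(x, theta, offset):
    """Linear layer: energy (in kT) = offset + sum of per-position weights."""
    return offset + sum(
        t * xi
        for row_t, row_x in zip(theta, x)
        for t, xi in zip(row_t, row_x)
    )

def softmax(energies):
    """Boltzmann distribution over states: p_s is proportional to exp(-E_s)."""
    m = min(energies)  # shift by the lowest energy for numerical stability
    weights = [math.exp(-(e - m)) for e in energies]
    z = sum(weights)   # partition function (after the shift)
    return [w / z for w in weights]

def transcription_rate(seq_tf, seq_rnap, theta_tf, theta_rnap,
                       g_tf0, g_rnap0, g_int, t_sat):
    """Toy four-state model: rate = t_sat * P(RNAP bound)."""
    g_tf = binding_energy(one_hot(seq_tf), theta_tf, g_tf0)
    g_rnap = binding_energy(one_hot(seq_rnap), theta_rnap, g_rnap0)
    # States: empty, TF only, RNAP only, TF and RNAP together.
    state_energies = [0.0, g_tf, g_rnap, g_tf + g_rnap + g_int]
    p = softmax(state_energies)
    return t_sat * (p[2] + p[3])

if __name__ == "__main__":
    # Hypothetical energy matrices for two 3-bp sites (rows: positions, columns: A, C, G, T).
    theta_tf = [[-1.0, 0.5, 0.5, 0.0], [0.3, -0.8, 0.2, 0.3], [0.0, 0.4, -1.2, 0.8]]
    theta_rnap = [[0.2, 0.1, 0.6, -0.9], [0.5, 0.5, 0.0, -1.0], [-0.7, 0.3, 0.3, 0.1]]
    rate = transcription_rate("ACG", "TTA", theta_tf, theta_rnap,
                              g_tf0=0.0, g_rnap0=1.0, g_int=-2.0, t_sat=1.0)
    print(f"predicted rate: {rate:.3f}")
```

In this sketch a negative `g_int` represents an attractive TF-RNAP interaction, so it raises the predicted rate relative to `g_int = 0`. In a deep learning framework the same structure could be fit to MPRA data by gradient descent on the energy parameters.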


