Interpretable Structured Learning with Sparse Gated Sequence Encoder for Protein-Protein Interaction Prediction

10/16/2020
by Kishan KC, et al.

Predicting protein-protein interactions (PPIs) by learning informative representations from amino acid sequences is a challenging yet important problem in biology. Although various deep learning models with Siamese architectures have been proposed to model PPIs from sequences, these methods become computationally expensive for a large number of PPIs because sequences are encoded pair by pair. These methods are also difficult to interpret because the mapping from a protein sequence to its learned representation is non-intuitive. To address these challenges, we present a novel deep framework to model and predict PPIs from sequence alone. Our model incorporates a bidirectional gated recurrent unit to learn sequence representations from the contextualized and sequential information in the sequences. We further employ sparse regularization to model long-range dependencies between amino acids and to select important amino acids (protein motifs), which enhances interpretability. In addition, the novel design of the encoding process makes our model computationally efficient and scalable to an increasing number of interactions. Experimental results on up-to-date interaction datasets demonstrate that our model achieves superior performance compared to other state-of-the-art methods. Literature-based case studies illustrate the ability of our model to provide biological insights that help interpret its predictions.
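
The ideas named in the abstract (a bidirectional GRU encoder, a sparsity penalty that highlights individual residues, and encoding each protein once rather than once per pair) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the one-hot input, the per-residue sigmoid gate, the L1 penalty, the dot-product pair score, and every name below (`encode`, `l1_penalty`, `score_pairs`, `w_gate`) are assumptions chosen for illustration, and the weights are random rather than trained.

```python
import numpy as np

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues

def one_hot(seq):
    """Map an amino acid string to a (length, 20) one-hot matrix."""
    idx = [AMINO_ACIDS.index(a) for a in seq]
    return np.eye(len(AMINO_ACIDS))[idx]

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def init_gru(rng, n_in, n_hid):
    """Random weights for one GRU direction (biases omitted for brevity)."""
    return {k: rng.normal(0.0, 0.1, (n_in + n_hid, n_hid)) for k in ("z", "r", "h")}

def gru_pass(x, p, n_hid):
    """Run a GRU over x of shape (length, n_in); return states (length, n_hid)."""
    h = np.zeros(n_hid)
    states = []
    for x_t in x:
        xh = np.concatenate([x_t, h])
        z = sigmoid(xh @ p["z"])                      # update gate
        r = sigmoid(xh @ p["r"])                      # reset gate
        cand = np.tanh(np.concatenate([x_t, r * h]) @ p["h"])
        h = (1.0 - z) * h + z * cand
        states.append(h)
    return np.stack(states)

def encode(seq, params):
    """Return (protein vector, per-residue gates) for one sequence."""
    x = one_hot(seq)
    n_hid = params["w_gate"].shape[0] // 2
    fwd = gru_pass(x, params["fwd"], n_hid)
    bwd = gru_pass(x[::-1], params["bwd"], n_hid)[::-1]  # right-to-left pass
    states = np.concatenate([fwd, bwd], axis=1)          # (length, 2 * n_hid)
    gates = sigmoid(states @ params["w_gate"])           # assumed residue gate
    vec = (gates[:, None] * states).sum(axis=0) / (gates.sum() + 1e-8)
    return vec, gates

def l1_penalty(gates):
    """Assumed sparsity term: added to a training loss, it pushes gates toward 0."""
    return np.abs(gates).sum()

def score_pairs(proteins, pairs, params):
    """Encode each protein once, then score every pair from cached vectors."""
    cache = {name: encode(seq, params)[0] for name, seq in proteins.items()}
    return {(a, b): sigmoid(cache[a] @ cache[b]) for a, b in pairs}

# Toy usage with made-up sequences and untrained weights.
rng = np.random.default_rng(0)
n_hid = 8
params = {
    "fwd": init_gru(rng, 20, n_hid),
    "bwd": init_gru(rng, 20, n_hid),
    "w_gate": rng.normal(0.0, 0.1, 2 * n_hid),
}
proteins = {"P1": "MKTAYIAKQR", "P2": "GSHMLEDPVA", "P3": "MKVLAAGIVG"}
pairs = [("P1", "P2"), ("P1", "P3"), ("P2", "P3")]
scores = score_pairs(proteins, pairs, params)
vec, gates = encode(proteins["P1"], params)
```

In this sketch, three pairs require only three encoder passes because the vectors are cached per protein, whereas encoding inside each pair would take six. After training with the sparsity term, residues whose gate stays high would be the candidates to inspect as motifs; with the random weights used here the gate values carry no biological meaning.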

research
12/09/2017

Variational auto-encoding of protein sequences

Proteins are responsible for the most diverse set of functions in biolog...
research
06/17/2019

rna2rna: Predicting lncRNA-microRNA-mRNA Interactions from Sequence with Integration of Interactome and Biological Annotation Data

Long non-coding RNA, microRNA, and messenger RNA enable key regulations ...
research
11/12/2021

Using Deep Learning Sequence Models to Identify SARS-CoV-2 Divergence

SARS-CoV-2 is an upper respiratory system RNA virus that has caused over...
research
07/18/2020

Deep Learning of High-Order Interactions for Protein Interface Prediction

Protein interactions are important in a broad range of biological proces...
research
07/27/2018

Identifying Protein-Protein Interaction using Tree LSTM and Structured Attention

Identifying interactions between proteins is important to understand und...
research
06/20/2018

DeepAffinity: Interpretable Deep Learning of Compound-Protein Affinity through Unified Recurrent and Convolutional Neural Networks

Motivation: Drug discovery demands rapid quantification of compound-prot...
research
05/27/2022

Tranception: protein fitness prediction with autoregressive transformers and inference-time retrieval

The ability to accurately model the fitness landscape of protein sequenc...
