Machine Learning for Classification of Protein Helix Capping Motifs

05/01/2019
by Sean Mullane, et al.

The biological function of a protein stems from its 3-dimensional structure, which is thermodynamically determined by the energetics of interatomic forces between its amino acid building blocks; the order of these amino acids, known as the sequence, defines a protein. Given the costs (time, money, human resources) of determining protein structures by experimental means such as X-ray crystallography, can we describe and compare protein 3D structures more robustly and efficiently, so as to gain meaningful biological insights? We begin with a relatively simple problem, limiting ourselves to protein secondary structural elements. Historically, many computational methods have been devised to classify the amino acid residues in a protein chain into one of several discrete secondary structures. The best characterized of these are the geometrically regular α-helix and β-sheet; irregular structural patterns, such as 'turns' and 'loops', are less well understood. Here, we present a study of Deep Learning techniques to classify the loop-like end cap structures that delimit α-helices. Previous work used highly empirical and heuristic methods to classify helix capping motifs manually. Instead, we use structural data directly, including (i) backbone torsion angles computed from 3D structures, (ii) macromolecular feature sets (e.g., physicochemical properties), and (iii) helix cap classification data (from CAPS-DB), as the ground truth to train a bidirectional long short-term memory (BiLSTM) model to classify helix cap residues. We tried different network architectures and scanned hyperparameters in order to train and assess several models; we also trained a Support Vector Classifier (SVC) to use as a baseline. Ultimately, we achieved 85
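
The abstract mentions backbone torsion angles computed from 3D structures but does not say how they were obtained. As a minimal sketch of one common way to compute such an angle, the function below returns the signed dihedral defined by four atom positions, using the standard atan2 formulation. The function name `dihedral_degrees` and its helpers are hypothetical and chosen for illustration; they are not taken from the paper. For a protein backbone, φ of residue i is conventionally the dihedral over C(i-1), N(i), CA(i), C(i), and ψ is the dihedral over N(i), CA(i), C(i), N(i+1).

```python
import math


def _sub(a, b):
    """Component-wise difference a - b of two 3D points."""
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    """Dot product of two 3D vectors."""
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    """Cross product a x b of two 3D vectors."""
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def dihedral_degrees(p0, p1, p2, p3):
    """Signed dihedral angle in degrees for four points (hypothetical helper).

    The angle is measured about the p1 -> p2 axis, between the plane
    containing p0, p1, p2 and the plane containing p1, p2, p3.
    """
    b1 = _sub(p1, p0)
    b2 = _sub(p2, p1)
    b3 = _sub(p3, p2)
    n1 = _cross(b1, b2)  # normal of the first plane
    n2 = _cross(b2, b3)  # normal of the second plane
    b2_len = math.sqrt(_dot(b2, b2))
    y = b2_len * _dot(b1, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))


# Illustrative coordinates only (not real protein data): the fourth point is
# rotated a quarter turn about the z axis relative to the first.
angle = dihedral_degrees((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1))
```

Because atan2 returns values in (-180°, 180°], the result is discontinuous at ±180°. A model that consumes these angles as features may work better with their sine and cosine than with the raw values; whether the authors applied such an encoding is not stated in the abstract.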

