MCP: a Multi-Component learning machine to Predict protein secondary structure

06/17/2018
by Leila Khalatbari, et al.

The gene, or DNA sequence, in every cell does not control genetic properties on its own; rather, this is done through translation of DNA into protein and the subsequent formation of a specific 3D structure. The biological function of a protein is tightly connected to its specific 3D structure. Prediction of the protein secondary structure is a crucial intermediate step towards elucidating its 3D structure and function. Traditional experimental methods for determining protein structure are expensive and time-consuming. Therefore, various machine learning approaches have been proposed to predict the protein secondary structure. Nevertheless, the average accuracy of the suggested solutions has hardly exceeded 80%, owing to challenges such as the high complexity of the sequence-structure relation, noise in input protein data, class imbalance, and the high dimensionality of the encoding schemes that represent the protein sequence. In this paper, we propose an accurate multi-component prediction machine to overcome the challenges of protein structure prediction. We devise a multi-component design to address the high complexity of the sequence-structure relation. Furthermore, we use a compound string dissimilarity measure to directly interpret protein sequence content and avoid information loss. To improve accuracy, we employ two different classifiers, a support vector machine and a fuzzy nearest neighbor classifier, and aggregate their classification outcomes to infer the final protein secondary structures. We conduct comprehensive experiments to compare our model with current state-of-the-art approaches. The experimental results demonstrate that, given a set of input sequences, our multi-component framework can accurately predict the protein structure. Nevertheless, the effectiveness of our unified model can be further enhanced through framework configuration.
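
The abstract does not give implementation details, so the following is only a minimal sketch of the general idea of scoring sequence windows with a string dissimilarity, computing fuzzy nearest neighbor class memberships, and aggregating them with a second classifier's scores. Everything specific here is an assumption made for illustration: the Hamming distance stands in for the paper's compound string dissimilarity measure, the training windows are toy data, `svm_scores` is a hypothetical placeholder for the output of a support vector machine, and simple averaging stands in for whatever aggregation rule the authors use.

```python
from collections import defaultdict


def hamming(a, b):
    """Stand-in dissimilarity (assumption): mismatches between equal-length windows."""
    if len(a) != len(b):
        raise ValueError("windows must have equal length")
    return sum(x != y for x, y in zip(a, b))


def fuzzy_knn(query, train, k=3, m=2.0, dissim=hamming):
    """Fuzzy k-NN with crisp training labels; returns class memberships summing to 1.

    Each of the k nearest neighbors votes with weight 1 / d^(2/(m-1)), where m > 1
    is the fuzzifier. A small constant avoids division by zero for exact matches.
    """
    neighbors = sorted(train, key=lambda item: dissim(query, item[0]))[:k]
    weights = defaultdict(float)
    for window, label in neighbors:
        d = dissim(query, window)
        weights[label] += 1.0 / (d ** (2.0 / (m - 1.0)) + 1e-9)
    total = sum(weights.values())
    return {label: w / total for label, w in weights.items()}


def aggregate(*memberships):
    """Average per-class scores from several classifiers and return the top class."""
    classes = set().union(*memberships)
    avg = {
        c: sum(mb.get(c, 0.0) for mb in memberships) / len(memberships)
        for c in classes
    }
    return max(avg, key=avg.get), avg


# Toy windows labeled helix (H), strand (E), or coil (C); illustrative only.
train = [
    ("AELLK", "H"), ("AEALK", "H"),
    ("VTVKV", "E"), ("VYVTV", "E"),
    ("GPGSG", "C"), ("GNGSP", "C"),
]

fuzzy_scores = fuzzy_knn("AELMK", train, k=3)

# Hypothetical per-class scores standing in for a trained SVM's output.
svm_scores = {"H": 0.6, "E": 0.3, "C": 0.1}

label, combined = aggregate(fuzzy_scores, svm_scores)
print(label, combined)
```

In this sketch the two classifiers only need to agree on the set of class labels; a class missing from one score dictionary is treated as having score zero for that classifier.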
