Leveraging Sequence Embedding and Convolutional Neural Network for Protein Function Prediction

12/01/2021
by Wei-Cheng Tseng, et al.

Accurate prediction of protein functions and properties is essential in the biotechnology industry, for example in drug development and artificial protein synthesis. The main challenges of protein function prediction are the large label space and the lack of labeled training data. Our method leverages unsupervised sequence embedding and deep convolutional neural networks to overcome these challenges. In contrast, most existing methods remove rare protein functions to reduce the label space. Furthermore, some existing methods require additional biological information (e.g., the 3-dimensional structure of the proteins), which is difficult to determine in biochemical experiments. Our proposed method significantly outperforms the other methods on the publicly available benchmark using only protein sequences as input, which speeds up the process of identifying protein functions.
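To make the general idea concrete, the following is a minimal sketch of a sequence-only, multi-label predictor: residues are embedded, a 1D convolution scans the sequence, and each function label gets an independent sigmoid score. It is not the authors' architecture. The randomly initialised `table` merely stands in for a pretrained unsupervised embedding, and all names and sizes (`predict_functions`, 8 embedding dimensions, 16 filters, 5 labels) are illustrative assumptions.

```python
import math
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def random_matrix(rows, cols, rng):
    """Gaussian-initialised matrix; a placeholder for learned weights."""
    return [[rng.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]


def embed(seq, table):
    """Look up one embedding vector per residue -> list of length len(seq)."""
    return [table[AMINO_ACIDS.index(aa)] for aa in seq]


def conv_max_pool(x, kernels):
    """ReLU 1D convolution over positions, then global max pooling.

    Returns one non-negative feature per filter, independent of sequence length.
    """
    features = []
    for kernel in kernels:  # kernel has shape (width, embed_dim)
        width = len(kernel)
        best = 0.0  # starting at 0 applies the ReLU floor to the pooled value
        for start in range(len(x) - width + 1):
            act = sum(
                x[start + j][d] * kernel[j][d]
                for j in range(width)
                for d in range(len(kernel[j]))
            )
            best = max(best, act)
        features.append(best)
    return features


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_functions(seq, table, kernels, w_out):
    """Return one independent probability per function label (multi-label)."""
    features = conv_max_pool(embed(seq, table), kernels)
    return [sigmoid(sum(f * w for f, w in zip(features, row))) for row in w_out]


rng = random.Random(0)
embed_dim, n_filters, width, n_labels = 8, 16, 3, 5  # illustrative sizes only
table = random_matrix(len(AMINO_ACIDS), embed_dim, rng)  # stand-in for a pretrained embedding
kernels = [random_matrix(width, embed_dim, rng) for _ in range(n_filters)]
w_out = random_matrix(n_labels, n_filters, rng)

probs = predict_functions("MKTAYIAKQR", table, kernels, w_out)
```

Using a sigmoid per label rather than a softmax lets one protein carry several functions at once, which is one common way to handle a large label space without discarding rare labels.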


