An Efficient Consolidation of Word Embedding and Deep Learning Techniques for Classifying Anticancer Peptides: FastText+BiLSTM

09/21/2023
by Onur Karakaya, et al.

Anticancer peptides (ACPs) are a group of peptides that exhibit antineoplastic properties. The use of ACPs in cancer prevention can offer a viable substitute for conventional cancer therapeutics, as they possess a higher degree of selectivity and safety. Recent scientific advancements have generated interest in peptide-based therapies, which offer the advantage of efficiently treating intended cells without negatively impacting normal cells. However, as the number of peptide sequences continues to increase rapidly, developing a reliable and precise prediction model becomes a challenging task. In this work, our motivation is to advance an efficient model for categorizing anticancer peptides by combining word embedding and deep learning models. First, Word2Vec and FastText are evaluated as word embedding techniques for extracting features from peptide sequences. Then, the outputs of the word embedding models are fed into the deep learning approaches CNN, LSTM, and BiLSTM. To demonstrate the contribution of the proposed framework, extensive experiments are carried out on datasets widely used in the literature, ACPs250 and Independent. Experimental results show that the proposed model improves classification accuracy compared with state-of-the-art studies. The proposed combination, FastText+BiLSTM, achieves 92.50% accuracy on the ACPs250 dataset and 96.15% accuracy on the Independent dataset, setting a new state of the art.
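To make the word-embedding step concrete, the following is a minimal sketch of how a peptide sequence might be turned into "words" and FastText-style subword units before any embedding is trained. The abstract does not state how sequences are tokenized, so the overlapping k-mer split, the value k = 3, the function names `kmers` and `char_ngrams`, and the example peptide are all assumptions made for illustration, not the authors' method. The only part taken from FastText itself is the idea of wrapping each word in `<` and `>` boundary markers and enumerating its character n-grams.

```python
from typing import List


def kmers(sequence: str, k: int = 3) -> List[str]:
    """Split a peptide sequence into overlapping k-mers ("words").

    Assumption: k-mer tokenization is used here only for illustration;
    the abstract does not specify a tokenization scheme.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    sequence = sequence.strip().upper()
    if len(sequence) < k:
        # Sequences shorter than k are kept as a single token.
        return [sequence] if sequence else []
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]


def char_ngrams(token: str, min_n: int = 3, max_n: int = 6) -> List[str]:
    """List FastText-style character n-grams of one token.

    The token is wrapped in '<' and '>' so that prefixes and suffixes
    yield n-grams distinct from those in the middle of the token.
    """
    wrapped = f"<{token}>"
    grams: List[str] = []
    for n in range(min_n, max_n + 1):
        for i in range(len(wrapped) - n + 1):
            grams.append(wrapped[i:i + n])
    return grams


if __name__ == "__main__":
    peptide = "ACDEFG"  # made-up sequence, not taken from ACPs250
    words = kmers(peptide, k=3)
    print(words)
    print(char_ngrams(words[0], min_n=2, max_n=3))
```

In a full pipeline, each word's vector would be built from the vectors of its n-grams, and the resulting sequence of word vectors would be passed to a recurrent classifier such as a BiLSTM; those stages are omitted here because their configuration is not given in the abstract.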


research
09/21/2023

Word Embedding with Neural Probabilistic Prior

To improve word representation learning, we propose a probabilistic prio...
research
12/28/2019

Learning Numeral Embeddings

Word embedding is an essential building block for deep learning methods ...
research
08/21/2018

Gaussian Word Embedding with a Wasserstein Distance Loss

Comparing with word embedding that based on the point representation, di...
research
01/06/2020

Macromolecule Classification Based on the Amino-acid Sequence

Deep learning is playing a vital role in every field which involves data...
research
08/01/2021

Realised Volatility Forecasting: Machine Learning via Financial Word Embedding

We develop FinText, a novel, state-of-the-art, financial word embedding ...
research
10/18/2018

LeukoNet: DCT-based CNN architecture for the classification of normal versus Leukemic blasts in B-ALL Cancer

Acute lymphoblastic leukemia (ALL) constitutes approximately 25 pediatri...
research
07/07/2022

Word Embedding for Social Sciences: An Interdisciplinary Survey

To extract essential information from complex data, computer scientists ...
