Prediction of Hemolysis Tendency of Peptides using a Reliable Evaluation Method

12/11/2020
by   Ali Raza, et al.
0

There are numerous peptides discovered through past decades, which exhibit antimicrobial and anti-cancerous tendencies. Due to these reasons, peptides are supposed to be sound therapeutic candidates. Some peptides can pose low metabolic stability, high toxicity and high hemolity of peptides. This highlights the importance for evaluating hemolytic tendencies and toxicity of peptides, before using them for therapeutics. Traditional methods for evaluation of toxicity of peptides can be time-consuming and costly. In this study, we have extracted peptides data (Hemo-DB) from Database of Antimicrobial Activity and Structure of Peptides (DBAASP) based on certain hemolity criteria and we present a machine learning based method for prediction of hemolytic tendencies of peptides (i.e. Hemolytic or Non-Hemolytic). Our model offers significant improvement on hemolity prediction benchmarks. we also propose a reliable clustering-based train-tests splitting method which ensures that no peptide in train set is more than 40 this train-test split, we can get reliable estimated of expected model performance on unseen data distribution or newly discovered peptides. Our model tests 0.9986 AUC-ROC (Area Under Receiver Operating Curve) and 97.79 on test set of Hemo-DB using traditional random train-test splitting method. Moreover, our model tests AUC-ROC of 0.997 and Accuracy of 97.58 clustering-based train-test data split. Furthermore, we check our model on an unseen data distribution (at Hemo-PI 3) and we recorded 0.8726 AUC-ROC and 79.5 be screened, which may further in therapeutics and get reliable predictions for unseen amino acids distribution of peptides and newly discovered peptides.

READ FULL TEXT
research
06/23/2022

A Diagnostic Approach to Assess the Quality of Data Splitting in Machine Learning

In machine learning, a routine practice is to split the data into a trai...
research
10/12/2020

How Important is the Train-Validation Split in Meta-Learning?

Meta-learning aims to perform fast adaptation on a new task through lear...
research
04/19/2022

Investigation of a Data Split Strategy Involving the Time Axis in Adverse Event Prediction Using Machine Learning

Adverse events are a serious issue in drug development and many predicti...
research
10/01/2020

Tabular GANs for uneven distribution

GANs are well known for success in the realistic image generation. Howev...
research
01/05/2022

Deep Learning-Based Sparse Whole-Slide Image Analysis for the Diagnosis of Gastric Intestinal Metaplasia

In recent years, deep learning has successfully been applied to automate...
research
01/29/2021

NLPBK at VLSP-2020 shared task: Compose transformer pretrained models for Reliable Intelligence Identification on Social network

This paper describes our method for tuning a transformer-based pretraine...

Please sign up or login with your details

Forgot password? Click here to reset