ASAP-SML: An Antibody Sequence Analysis Pipeline Using Statistical Testing and Machine Learning

03/08/2020
by   Xinmeng Li, et al.

Antibodies can bind individual antigens potently and specifically and, in some cases, disrupt their functions. The key challenge in generating antibody-based inhibitors is the lack of fundamental information relating antibody sequences to their unique properties as inhibitors. We develop a pipeline, Antibody Sequence Analysis Pipeline using Statistical testing and Machine Learning (ASAP-SML), to identify features that distinguish one set of antibody sequences from those in a reference set. The pipeline extracts feature fingerprints from sequences. The fingerprints represent germline, CDR canonical structure, isoelectric point, and frequent positional motifs. Machine learning and statistical significance testing techniques are applied to the antibody sequences and the extracted feature fingerprints to identify distinguishing feature values and combinations thereof. To demonstrate the pipeline, we applied it to sets of antibody sequences known to bind or inhibit the activities of matrix metalloproteinases (MMPs), a family of zinc-dependent enzymes that promote cancer progression and undesired inflammation under pathological conditions, and compared them against reference datasets of sequences that do not bind or inhibit MMPs. ASAP-SML identifies features and combinations of feature values found in the MMP-targeting sets that are distinct from those in the reference sets.
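As a rough illustration of testing whether a feature value distinguishes a target set from a reference set, the sketch below checks for a positional motif in each sequence and computes a one-sided Fisher's exact test on the resulting 2x2 counts. The abstract does not say which statistical test ASAP-SML uses, so Fisher's exact test is an assumption chosen for illustration. The helper names (`has_motif`, `fisher_exact_greater`), the toy sequences, and the motif are all hypothetical and not taken from the paper.

```python
from math import comb


def has_motif(seq: str, motif: str, pos: int) -> bool:
    """Return True if `motif` occurs in `seq` starting at 0-based index `pos`."""
    return seq[pos:pos + len(motif)] == motif


def fisher_exact_greater(a: int, b: int, c: int, d: int) -> float:
    """One-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]].

    a: target sequences with the feature,    b: target sequences without it
    c: reference sequences with the feature, d: reference sequences without it
    The p-value is the hypergeometric probability of seeing at least `a`
    feature-bearing sequences in the target set by chance.
    """
    n_target = a + b
    n_feature = a + c
    n_total = a + b + c + d
    k_max = min(n_target, n_feature)
    favourable = sum(
        comb(n_feature, k) * comb(n_total - n_feature, n_target - k)
        for k in range(a, k_max + 1)
    )
    return favourable / comb(n_total, n_target)


# Toy data, invented for illustration only (not real antibody sequences).
target_set = ["ARDYW", "ARDYF", "ARDFW", "GRDYW"]
reference_set = ["AKGYW", "TRSYW", "ARDYW", "GKSFW"]

# Hypothetical positional motif: "RD" starting at index 1.
a = sum(has_motif(s, "RD", 1) for s in target_set)
b = len(target_set) - a
c = sum(has_motif(s, "RD", 1) for s in reference_set)
d = len(reference_set) - c

p_value = fisher_exact_greater(a, b, c, d)
print(f"counts: target {a}/{a + b}, reference {c}/{c + d}, p = {p_value:.4f}")
```

In practice many feature values would be tested at once, so a multiple-testing correction would normally be applied to the resulting p-values. That step is omitted here to keep the sketch short.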

