pyLEMMINGS: Large Margin Multiple Instance Classification and Ranking for Bioinformatics Applications

11/14/2017
by Amina Asif, et al.

Motivation: A major challenge in developing machine-learning-based methods in computational biology is that data may not be accurately labeled, because experimentally annotating properties of proteins and DNA sequences takes considerable time and resources. Standard supervised learning algorithms assume accurate instance-level labeling of training data. Multiple instance learning is a paradigm for handling such labeling ambiguities. However, the widely used large-margin classification methods for multiple instance learning are heuristic in nature and have high computational requirements. In this paper, we present large-margin algorithms for multiple instance classification and ranking based on stochastic sub-gradient optimization, and provide them in a software suite called pyLEMMINGS.

Results: We have tested pyLEMMINGS on a number of bioinformatics problems as well as benchmark datasets. pyLEMMINGS successfully identified functionally important segments of proteins: binding sites in calmodulin-binding proteins, prion-forming regions, and amyloid cores. pyLEMMINGS achieves state-of-the-art performance in all these tasks, demonstrating the value of multiple instance learning. Furthermore, our method runs more than 100 times faster than heuristic solutions while also improving accuracy on benchmark datasets.

Availability and Implementation: The pyLEMMINGS Python package is available for download at: http://faculty.pieas.edu.pk/fayyaz/software.html#pylemmings.
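
The abstract does not spell out the optimization itself, so the following is only a minimal sketch of what a stochastic sub-gradient, large-margin multiple instance classifier can look like. It is not the pyLEMMINGS API or the authors' exact algorithm. The function names (`dot`, `bag_score`, `train_mil_sgd`), the choice to score a bag by its highest-scoring instance, the hinge loss on that score, and the Pegasos-style step size `1 / (lam * t)` are all assumptions made for illustration.

```python
import random


def dot(w, x):
    """Plain dot product of two equal-length lists."""
    return sum(wi * xi for wi, xi in zip(w, x))


def bag_score(w, bag):
    """Score a bag as the maximum linear score over its instances
    (an assumed, commonly used multiple instance formulation)."""
    return max(dot(w, x) for x in bag)


def train_mil_sgd(bags, labels, lam=0.01, epochs=50, seed=0):
    """Hypothetical stochastic sub-gradient trainer for a linear MIL model.

    bags:   list of bags, each a list of feature vectors (lists of floats)
    labels: list of +1 / -1 bag labels
    lam:    L2 regularization strength (assumed hyperparameter)
    """
    rng = random.Random(seed)
    w = [0.0] * len(bags[0][0])
    t = 0
    for _ in range(epochs):
        order = list(range(len(bags)))
        rng.shuffle(order)
        for i in order:
            t += 1
            eta = 1.0 / (lam * t)  # Pegasos-style decaying step size
            bag, y = bags[i], labels[i]
            # The "witness" is the instance that currently determines the bag score.
            witness = max(bag, key=lambda x: dot(w, x))
            violated = y * dot(w, witness) < 1.0  # hinge loss is active
            # Sub-gradient step on (lam / 2) * ||w||^2 + hinge(y * bag score).
            w = [(1.0 - eta * lam) * wi for wi in w]
            if violated:
                w = [wi + eta * y * xi for wi, xi in zip(w, witness)]
    return w


if __name__ == "__main__":
    # Toy data: the last feature is a constant bias term.
    toy_bags = [
        [[2.0, 1.0], [-2.0, 1.0]],   # positive bag: one positive instance
        [[-1.0, 1.0], [-2.0, 1.0]],  # negative bag: no positive instance
    ]
    toy_labels = [1, -1]
    weights = train_mil_sgd(toy_bags, toy_labels)
    for bag in toy_bags:
        print(bag_score(weights, bag))
```

Each update touches a single bag and a single witness instance, which illustrates why a stochastic sub-gradient approach can be much cheaper per step than heuristics that repeatedly solve a full optimization problem. Because the witness choice makes the objective non-convex for positive bags, a sketch like this can be sensitive to initialization and data ordering.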

research
08/17/2022

Semi-supervised Learning with Deterministic Labeling and Large Margin Projection

The centrality and diversity of the labeled data are very influential to...
research
05/15/2019

Passage Ranking with Weak Supervision

In this paper, we propose a weak supervision framework for neural rankin...
research
06/17/2022

Large-Margin Representation Learning for Texture Classification

This paper presents a novel approach combining convolutional layers (CLs...
research
03/02/2017

A Generic Online Parallel Learning Framework for Large Margin Models

To speed up the training process, many existing systems use parallel tec...
research
05/06/2019

An embarrassingly simple approach to neural multiple instance classification

Multiple Instance Learning (MIL) is a weak supervision learning paradigm...
research
10/10/2011

Large-Margin Learning of Submodular Summarization Methods

In this paper, we present a supervised learning approach to training sub...
research
04/26/2022

Differentiable Zooming for Multiple Instance Learning on Whole-Slide Images

Multiple Instance Learning (MIL) methods have become increasingly popula...
