A Noisy-Label-Learning Formulation for Immune Repertoire Classification and Disease-Associated Immune Receptor Sequence Identification

07/29/2023
by Mingcai Chen, et al.

Immune repertoire classification, a typical multiple instance learning (MIL) problem, is a frontier research topic in computational biology that can make transformative contributions to new vaccines and immune therapies. However, traditional instance-space MIL, which directly assigns bag-level labels to instances, suffers from a massive number of noisy labels and an extremely low witness rate. In this work, we propose a noisy-label-learning formulation of the immune repertoire classification task. To remedy the inaccurate supervision that repertoire-level labels provide to a sequence-level classifier, we design a robust training strategy: the initial labels are smoothed asymmetrically and are progressively corrected using the model's predictions throughout training. Furthermore, two models with the same architecture but different parameter initializations are co-trained to mitigate the known "confirmation bias" problem of self-training-like schemes. As a result, we obtain accurate sequence-level classification and, subsequently, repertoire-level classification. Experiments on the Cytomegalovirus (CMV) and Cancer datasets demonstrate our method's effectiveness and superior performance on both sequence-level and repertoire-level tasks.
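
The training strategy described in the abstract can be illustrated with a minimal sketch. The abstract gives no formulas, so everything below is an assumption made for illustration only: the function names, the smoothing amounts `eps_pos` and `eps_neg`, the choice to soften only labels inherited from positive repertoires, the use of the co-trained peer model's prediction for correction, and the top-k mean used to turn sequence-level scores into a repertoire-level score. It is not the authors' implementation.

```python
from typing import List


def initial_soft_label(bag_label: int, eps_pos: float = 0.4, eps_neg: float = 0.0) -> float:
    """Instance-space MIL start: a sequence inherits its repertoire (bag) label.

    Assumed asymmetry: labels inherited from positive repertoires are softened
    strongly (most such sequences are not disease-associated when the witness
    rate is low), while labels from negative repertoires are left nearly hard.
    """
    if bag_label == 1:
        return 1.0 - eps_pos
    return eps_neg


def correct_label(current: float, peer_pred: float, alpha: float, bag_label: int) -> float:
    """Move a soft label toward a model prediction by a step alpha in [0, 1].

    Assumptions: the prediction comes from the co-trained peer model, and
    sequences from negative repertoires keep their label unchanged.
    """
    if bag_label == 0:
        return current
    return (1.0 - alpha) * current + alpha * peer_pred


def repertoire_score(seq_probs: List[float], k: int = 2) -> float:
    """Aggregate sequence-level probabilities into one repertoire-level score
    as the mean of the k highest values (a hypothetical aggregation rule)."""
    if not seq_probs:
        raise ValueError("seq_probs must not be empty")
    top = sorted(seq_probs, reverse=True)[:k]
    return sum(top) / len(top)


if __name__ == "__main__":
    label = initial_soft_label(bag_label=1)
    # One correction step using a (made-up) peer prediction of 0.1.
    label = correct_label(label, peer_pred=0.1, alpha=0.5, bag_label=1)
    print(label)
    print(repertoire_score([0.1, 0.9, 0.7], k=2))
```

In this sketch, increasing `alpha` over the course of training would shift trust from the inherited repertoire labels to the model predictions, which is one plausible reading of "progressively corrected".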

research
06/26/2022

Multiple Instance Learning with Mixed Supervision in Gleason Grading

With the development of computational pathology, deep learning methods f...
research
05/26/2023

Disambiguated Attention Embedding for Multi-Instance Partial-Label Learning

In many real-world tasks, the concerned objects can be represented as a ...
research
10/04/2019

SELF: Learning to Filter Noisy Labels with Self-Ensembling

Deep neural networks (DNNs) have been shown to over-fit a dataset when b...
research
06/09/2020

Dual-stream Maximum Self-attention Multi-instance Learning

Multi-instance learning (MIL) is a form of weakly supervised learning wh...
research
08/16/2021

Weakly Supervised Classification Using Group-Level Labels

In many applications, finding adequate labeled data to train predictive ...
research
02/25/2023

Inaccurate Label Distribution Learning

Label distribution learning (LDL) trains a model to predict the relevanc...
research
05/22/2021

Two-stage Training for Learning from Label Proportions

Learning from label proportions (LLP) aims at learning an instance-level...
