Select-ProtoNet: Learning to Select for Few-Shot Disease Subtype Prediction

by ZiYi Yang, et al.

Machine learning has made great progress in computer vision and many other fields thanks to large amounts of high-quality training samples, but it performs less well on genomic data analysis, where datasets are notoriously small. In this work, we focus on the few-shot disease subtype prediction problem: identifying subgroups of similar patients, by training on small data, so as to guide treatment decisions for a specific individual. In practice, doctors and clinicians often address this problem by studying several interrelated clinical variables simultaneously. We attempt to simulate this clinical perspective and introduce meta-learning techniques to develop a new model that can extract common experience or knowledge from interrelated clinical tasks and transfer it to help address new tasks. Our model is built upon a carefully designed meta-learner, the Prototypical Network, a simple yet effective meta-learning method for few-shot image classification. Observing that gene expression data are markedly higher in dimensionality and noise than image data, we propose an extension of the Prototypical Network that appends two modules to address these issues. Concretely, we append a feature selection layer that automatically filters out disease-irrelevant genes, and we incorporate a sample reweighting strategy that adaptively removes noisy data, while the extended model remains capable of learning from a limited number of training examples and generalizing well. Experiments on simulated and real gene expression data substantiate the superiority of the proposed method for predicting disease subtypes and identifying potential disease-related genes.
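To make the two added modules concrete, the following is a minimal NumPy sketch of a prototypical-network-style classifier with a per-gene gate and per-sample weights. It is an illustration under stated assumptions, not the authors' implementation: the function names `weighted_prototypes` and `classify`, the fixed `gate` vector, and the fixed `sample_weights` are all hypothetical, and in a real model both the gate and the weights would be learned rather than supplied by hand.

```python
import numpy as np


def weighted_prototypes(support_x, support_y, gate, sample_weights, n_classes):
    """Return one prototype per class from gated, reweighted support samples.

    gate           : (n_features,) multipliers; 0 drops a gene (assumed, not learned here)
    sample_weights : (n_support,) non-negative weights; 0 ignores a noisy sample
    """
    gated = support_x * gate  # element-wise feature gating
    protos = np.zeros((n_classes, support_x.shape[1]))
    for c in range(n_classes):
        mask = support_y == c
        w = sample_weights[mask]
        # Weighted mean of the gated samples belonging to class c
        protos[c] = (w[:, None] * gated[mask]).sum(axis=0) / w.sum()
    return protos


def classify(query_x, protos, gate):
    """Softmax over negative squared Euclidean distances to the prototypes."""
    gated = query_x * gate
    dists = ((gated[:, None, :] - protos[None, :, :]) ** 2).sum(axis=2)
    logits = -dists
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    probs = np.exp(logits)
    return probs / probs.sum(axis=1, keepdims=True)


# Toy data: only the first two "genes" carry signal; the last sample is an outlier.
support_x = np.array([
    [0, 0, 9, 9, 9],
    [0, 0, -9, 9, 0],
    [5, 5, 9, -9, 0],
    [5, 5, 0, 0, 9],
    [100, 100, 0, 0, 0],  # noisy sample, given weight 0 below
], dtype=float)
support_y = np.array([0, 0, 1, 1, 1])
gate = np.array([1.0, 1.0, 0.0, 0.0, 0.0])
sample_weights = np.array([1.0, 1.0, 1.0, 1.0, 0.0])

protos = weighted_prototypes(support_x, support_y, gate, sample_weights, n_classes=2)
query_x = np.array([[1, 1, 50, 50, 50], [4, 4, -50, -50, -50]], dtype=float)
pred = classify(query_x, protos, gate).argmax(axis=1)
```

In this toy setting the gate zeroes out the three uninformative features and the zero weight excludes the outlier, so the class prototypes depend only on the two informative features of the clean samples.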



