ProtoKD: Learning from Extremely Scarce Data for Parasite Ova Recognition

09/18/2023
by   Shubham Trehan, et al.

Developing reliable computational frameworks for early parasite detection, particularly at the ova (or egg) stage, is crucial for advancing healthcare and effectively managing potential public health crises. While deep learning has significantly assisted human workers in various tasks, its application in diagnostics has been constrained by the need for extensive datasets. The ability to learn from an extremely scarce training dataset, i.e., when fewer than 5 examples per class are present, is essential for scaling deep learning models in biomedical applications where large-scale data collection and annotation can be expensive or impossible (as in the case of novel or unknown infectious agents). In this study, we introduce ProtoKD, one of the first approaches to tackle the problem of multi-class parasitic ova recognition using extremely scarce data. By combining the principles of prototypical networks and self-distillation, ProtoKD learns robust representations from only one sample per class. Furthermore, we establish a new benchmark to drive research in this critical direction and validate that the proposed ProtoKD framework achieves state-of-the-art performance. Additionally, we evaluate the framework's generalizability to other downstream tasks by assessing its performance on a large-scale taxonomic profiling task based on metagenomes sequenced from real-world clinical data.
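The two ingredients named in the abstract can be illustrated with a small, generic sketch. This is not the authors' implementation. It only shows the textbook form of a prototypical-network classifier (class prototypes as mean embeddings, softmax over negative squared distances) and a KL-divergence distillation term between a teacher and a student prediction. The class labels, the 2-D embeddings, and all function names are hypothetical and chosen purely for illustration.

```python
import math


def prototypes(support):
    """Map each label to the mean of its support embeddings (its prototype)."""
    protos = {}
    for label, vecs in support.items():
        dim = len(vecs[0])
        protos[label] = [sum(v[i] for v in vecs) / len(vecs) for i in range(dim)]
    return protos


def class_probs(query, protos):
    """Softmax over negative squared Euclidean distances to each prototype."""
    labels = sorted(protos)
    logits = [-sum((q - p) ** 2 for q, p in zip(query, protos[l])) for l in labels]
    top = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - top) for z in logits]
    total = sum(exps)
    return {l: e / total for l, e in zip(labels, exps)}


def distillation_loss(teacher, student):
    """KL(teacher || student) over a shared label set."""
    return sum(t * math.log(t / student[l]) for l, t in teacher.items() if t > 0)


# One-shot setting: a single (made-up) embedding per hypothetical class.
support = {"ascaris": [[1.0, 0.0]], "hookworm": [[0.0, 1.0]]}
protos = prototypes(support)

# Assumed embeddings of the same query image from a teacher and a student network.
teacher_probs = class_probs([0.9, 0.2], protos)
student_probs = class_probs([0.8, 0.3], protos)

predicted = max(teacher_probs, key=teacher_probs.get)
loss = distillation_loss(teacher_probs, student_probs)
```

With one sample per class, each prototype is simply that sample's embedding, which is why the quality of the learned representation matters so much in this regime. In self-distillation the teacher and student share an architecture; how ProtoKD actually couples the two terms is described in the full paper, not here.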


