Fix-A-Step: Effective Semi-supervised Learning from Uncurated Unlabeled Sets

08/25/2022
by Zhe Huang, et al.

Semi-supervised learning (SSL) promises higher accuracy than training classifiers on small labeled datasets alone, by also training on many unlabeled images. In realistic applications like medical imaging, unlabeled sets will be collected for expediency and thus be uncurated: they may differ from the labeled set in which classes are represented or in class frequencies. Unfortunately, modern deep SSL often makes accuracy worse when given uncurated unlabeled sets. Recent remedies suggest filtering approaches that detect out-of-distribution unlabeled examples and then discard or downweight them. Instead, we view all unlabeled examples as potentially helpful. We introduce a procedure called Fix-A-Step that can improve heldout accuracy of common deep SSL methods despite lack of curation. The key innovations are augmentations of the labeled set inspired by all unlabeled data, and a modification of gradient descent updates that prevents optimization of the multi-task SSL loss from hurting labeled-set accuracy. Though our method is simpler than alternatives, we show consistent accuracy gains on CIFAR-10 and CIFAR-100 benchmarks across all tested levels of artificial contamination for the unlabeled sets. We further suggest a real medical benchmark for SSL: recognizing the view type of ultrasound images of the heart. Our method can learn from 353,500 truly uncurated unlabeled images to deliver gains that generalize across hospitals.
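The abstract does not spell out how the gradient descent update is modified. As a rough illustration only, the sketch below assumes one plausible rule: include the unlabeled-loss gradient in a step only when it points in a direction that agrees with the labeled-loss gradient (positive inner product), and otherwise step on the labeled loss alone. The function name `guarded_step`, its parameters, and the agreement test are all hypothetical choices for this sketch, not details confirmed by the text above.

```python
from typing import List, Tuple


def dot(a: List[float], b: List[float]) -> float:
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def guarded_step(
    params: List[float],
    grad_labeled: List[float],
    grad_unlabeled: List[float],
    lr: float = 0.1,
    unlabeled_weight: float = 1.0,
) -> Tuple[List[float], bool]:
    """One gradient descent step on a two-term (labeled + unlabeled) loss.

    ASSUMPTION (not from the abstract): the unlabeled gradient is used
    only when its inner product with the labeled gradient is positive,
    so that following it should not increase the labeled loss to first order.
    Returns the updated parameters and whether the unlabeled term was used.
    """
    use_unlabeled = dot(grad_labeled, grad_unlabeled) > 0.0
    new_params = []
    for p, g_l, g_u in zip(params, grad_labeled, grad_unlabeled):
        # Fall back to the labeled-only direction when the two gradients conflict.
        step = g_l + unlabeled_weight * g_u if use_unlabeled else g_l
        new_params.append(p - lr * step)
    return new_params, use_unlabeled


# Conflicting gradients: the unlabeled term is skipped for this step.
conflict_params, conflict_used = guarded_step(
    [1.0, 1.0], grad_labeled=[1.0, 0.0], grad_unlabeled=[-1.0, 0.5]
)

# Agreeing gradients: both terms contribute to the step.
agree_params, agree_used = guarded_step(
    [1.0, 1.0], grad_labeled=[1.0, 0.0], grad_unlabeled=[0.5, 0.5]
)
```

In a real deep SSL setup the two gradients would come from separate backward passes over the labeled and unlabeled loss terms; plain Python lists are used here only to keep the sketch self-contained.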


research
08/27/2023

Semi-Supervised Learning in the Few-Shot Zero-Shot Scenario

Semi-Supervised Learning (SSL) leverages both labeled and unlabeled data...
research
07/18/2023

Accuracy versus time frontiers of semi-supervised and self-supervised learning on medical images

For many applications of classifiers to medical images, a trustworthy la...
research
07/30/2021

A New Semi-supervised Learning Benchmark for Classifying View and Diagnosing Aortic Stenosis from Echocardiograms

Semi-supervised image classification has shown substantial progress in l...
research
09/22/2018

Semi-Supervised Sequence Modeling with Cross-View Training

Unsupervised representation learning algorithms such as word2vec and ELM...
research
04/06/2019

Split Batch Normalization: Improving Semi-Supervised Learning under Domain Shift

Recent work has shown that using unlabeled data in semi-supervised learn...
research
01/01/2023

Trojaning semi-supervised learning model via poisoning wild images on the web

Wild images on the web are vulnerable to backdoor (also called trojan) p...
research
04/14/2021

A Semi-Supervised Classification Method of Apicomplexan Parasites and Host Cell Using Contrastive Learning Strategy

A common shortfall of supervised learning for medical imaging is the gre...
