Exploration and Exploitation of Unlabeled Data for Open-Set Semi-Supervised Learning

06/30/2023
by Ganlong Zhao, et al.

In this paper, we address a complex but practical scenario in semi-supervised learning (SSL) named open-set SSL, where unlabeled data contain both in-distribution (ID) and out-of-distribution (OOD) samples. Unlike previous methods, which consider only ID samples useful and aim to filter out OOD samples completely during training, we argue that exploring and exploiting both ID and OOD samples can benefit SSL. To support this claim, i) we propose a prototype-based clustering and identification algorithm that explores the inherent similarities and differences among samples at the feature level and effectively clusters them around several predefined ID and OOD prototypes, thereby enhancing feature learning and facilitating ID/OOD identification; ii) we propose an importance-based sampling method that exploits the difference in importance of each ID and OOD sample to SSL, thereby reducing sampling bias and improving training. Our proposed method achieves state-of-the-art performance on several challenging benchmarks, and improves upon existing SSL methods even when ID samples are entirely absent from the unlabeled data.
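
The two ideas in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the similarity measure, how prototypes are obtained, or how importance is computed, so this example assumes cosine similarity, fixed prototype vectors supplied by the caller, and caller-supplied importance weights. The function names `assign_to_prototypes` and `importance_sample` are hypothetical.

```python
import math
import random


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def assign_to_prototypes(features, id_protos, ood_protos):
    """Assign each feature vector to its most similar prototype.

    Assumption: ID prototypes come first, so an index below len(id_protos)
    marks the sample as ID and any other index marks it as OOD.
    Returns a list of (prototype_index, is_id, similarity) tuples.
    """
    protos = list(id_protos) + list(ood_protos)
    assignments = []
    for f in features:
        sims = [cosine(f, p) for p in protos]
        best = max(range(len(protos)), key=sims.__getitem__)
        assignments.append((best, best < len(id_protos), sims[best]))
    return assignments


def importance_sample(weights, k, rng):
    """Draw k sample indices with replacement, proportional to the given weights.

    Assumption: the weights are a stand-in for the paper's importance scores,
    whose definition is not given in the abstract.
    """
    return rng.choices(range(len(weights)), weights=weights, k=k)
```

For instance, calling `assign_to_prototypes` with two ID prototypes and one OOD prototype labels each unlabeled feature as ID or OOD by its nearest prototype, and `importance_sample(weights, k, random.Random(0))` then draws a reproducible training batch in which higher-weight samples appear more often.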

research
07/22/2020

Multi-Task Curriculum Framework for Open-Set Semi-Supervised Learning

Semi-supervised learning (SSL) has been proposed to leverage unlabeled d...
research
08/12/2021

Trash to Treasure: Harvesting OOD Data with Cross-Modal Matching for Open-Set Semi-Supervised Learning

Open-set semi-supervised learning (open-set SSL) investigates a challeng...
research
08/14/2019

Unsupervised Out-of-Distribution Detection by Maximum Classifier Discrepancy

Since deep learning models have been implemented in many commercial appl...
research
05/29/2023

Out-of-Distributed Semantic Pruning for Robust Semi-Supervised Learning

Recent advances in robust semi-supervised learning (SSL) typically filte...
research
11/27/2020

They are Not Completely Useless: Towards Recycling Transferable Unlabeled Data for Class-Mismatched Semi-Supervised Learning

Semi-Supervised Learning (SSL) with mismatched classes deals with the pr...
research
08/26/2021

Semantically Coherent Out-of-Distribution Detection

Current out-of-distribution (OOD) detection benchmarks are commonly buil...
research
01/19/2021

On The Consistency Training for Open-Set Semi-Supervised Learning

Conventional semi-supervised learning (SSL) methods, e.g., MixMatch, ach...
