USB: A Unified Semi-supervised Learning Benchmark

08/12/2022
by Yidong Wang, et al.

Semi-supervised learning (SSL) improves model generalization by leveraging massive unlabeled data to augment limited labeled samples. However, popular SSL evaluation protocols are currently often constrained to computer vision (CV) tasks. In addition, previous work typically trains deep neural networks from scratch, which is time-consuming and environmentally unfriendly. To address these issues, we construct a Unified SSL Benchmark (USB) by selecting 15 diverse, challenging, and comprehensive tasks from CV, natural language processing (NLP), and audio processing (Audio), on which we systematically evaluate dominant SSL methods. We also open-source a modular and extensible codebase for fair evaluation of these SSL methods. We further provide pre-trained versions of state-of-the-art neural models for CV tasks to make further tuning affordable. USB enables the evaluation of a single SSL algorithm on more tasks from multiple domains at lower cost. Specifically, on a single NVIDIA V100, only 37 GPU days are required to evaluate FixMatch on the 15 tasks in USB, whereas the typical protocol needs 335 GPU days on 5 CV tasks (279 GPU days on the 4 CV datasets excluding ImageNet).
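
The cost figures quoted in the abstract can be put side by side with a few lines of arithmetic. The sketch below is only an illustration: the constant and function names are hypothetical, and the per-task averages assume the GPU days are spread evenly across tasks, which the abstract does not claim.

```python
# GPU-day figures as quoted in the abstract (FixMatch, single NVIDIA V100).
USB_GPU_DAYS = 37          # 15 tasks in USB
TYPICAL_GPU_DAYS = 335     # 5 CV tasks under the typical protocol
TYPICAL_NO_IMAGENET = 279  # the 4 CV datasets other than ImageNet
USB_TASKS = 15
TYPICAL_TASKS = 5


def cost_summary():
    """Derive simple comparisons from the quoted GPU-day totals.

    The per-task values are plain averages (an assumption made here
    for illustration, not a breakdown reported in the abstract).
    """
    return {
        # How many times cheaper the USB evaluation is overall.
        "cost_ratio": TYPICAL_GPU_DAYS / USB_GPU_DAYS,
        # GPU days attributable to ImageNet in the typical protocol.
        "imagenet_gpu_days": TYPICAL_GPU_DAYS - TYPICAL_NO_IMAGENET,
        # Average GPU days per task under each protocol.
        "usb_days_per_task": USB_GPU_DAYS / USB_TASKS,
        "typical_days_per_task": TYPICAL_GPU_DAYS / TYPICAL_TASKS,
    }


if __name__ == "__main__":
    for name, value in cost_summary().items():
        print(f"{name}: {value:.2f}")
```

Under these figures, the typical protocol costs roughly nine times as many GPU days as USB while covering a third as many tasks.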
