LabelBench: A Comprehensive Framework for Benchmarking Label-Efficient Learning

06/16/2023
by Jifan Zhang, et al.

Labeled data are critical to modern machine learning applications, but obtaining labels can be expensive. To mitigate this cost, machine learning methods such as transfer learning, semi-supervised learning, and active learning aim to be label-efficient, that is, to achieve high predictive performance from relatively few labeled examples. While obtaining the best label efficiency in practice often requires combining these techniques, existing benchmarks and evaluation frameworks do not capture a concerted combination of all of them. This paper addresses this deficiency by introducing LabelBench, a new computationally efficient framework for the joint evaluation of multiple label-efficient learning techniques. As an application of LabelBench, we introduce a novel benchmark of state-of-the-art active learning methods in combination with semi-supervised learning for fine-tuning pretrained vision transformers. Our benchmark demonstrates better label efficiency than previously reported in active learning. LabelBench's modular codebase is open-sourced so that the broader community can contribute label-efficient learning methods and benchmarks. The repository can be found at: https://github.com/EfficientTraining/LabelBench.

research
10/16/2022

Semantic Segmentation with Active Semi-Supervised Representation Learning

Obtaining human per-pixel labels for semantic segmentation is incredibly...
research
06/15/2023

Re-Benchmarking Pool-Based Active Learning for Binary Classification

Active learning is a paradigm that significantly enhances the performanc...
research
12/02/2019

Combining MixMatch and Active Learning for Better Accuracy with Fewer Labels

We propose using active learning based techniques to further improve the...
research
03/29/2020

A Comprehensive Benchmark Framework for Active Learning Methods in Entity Matching

Entity Matching (EM) is a core data cleaning task, aiming to identify di...
research
03/28/2023

Automated wildlife image classification: An active learning tool for ecological applications

Wildlife camera trap images are being used extensively to investigate an...
research
12/26/2022

Online Active Learning for Soft Sensor Development using Semi-Supervised Autoencoders

Data-driven soft sensors are extensively used in industrial and chemical...
research
09/21/2022

Is More Data Better? Re-thinking the Importance of Efficiency in Abusive Language Detection with Transformers-Based Active Learning

Annotating abusive language is expensive, logistically complex and creat...
