DABS: A Domain-Agnostic Benchmark for Self-Supervised Learning

11/23/2021
by Alex Tamkin, et al.

Self-supervised learning algorithms, including BERT and SimCLR, have enabled significant strides in fields like natural language processing, computer vision, and speech processing. However, these algorithms are domain-specific, meaning that new self-supervised learning algorithms must be developed for each new setting, including myriad healthcare, scientific, and multimodal domains. To catalyze progress toward domain-agnostic methods, we introduce DABS: a Domain-Agnostic Benchmark for Self-supervised learning. To perform well on DABS, an algorithm must do well across seven diverse domains on which it is evaluated: natural images, multichannel sensor data, English text, speech recordings, multilingual text, chest x-rays, and images with text descriptions. Each domain contains an unlabeled dataset for pretraining; the model is then scored based on its downstream performance on a set of labeled tasks in the domain. We also present e-Mix and ShED: two baseline domain-agnostic algorithms; their relatively modest performance demonstrates that significant progress is needed before self-supervised learning is an out-of-the-box solution for arbitrary domains. Code for benchmark datasets and baseline algorithms is available at https://github.com/alextamkin/dabs.
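
The evaluation protocol described above (pretrain on a domain's unlabeled data, then score on that domain's labeled tasks) can be sketched roughly as follows. This is a minimal illustration, not the DABS codebase: the function names `run_benchmark` and `domain_means`, the `pretrain` and `evaluate` callables, the dictionary layout, and the choice to average task scores within a domain are all assumptions made for this example.

```python
from statistics import mean
from typing import Any, Callable, Dict

# The seven domains named in the abstract.
DOMAINS = [
    "natural images",
    "multichannel sensor data",
    "English text",
    "speech recordings",
    "multilingual text",
    "chest x-rays",
    "images with text descriptions",
]


def run_benchmark(
    domains: Dict[str, Dict[str, Any]],
    pretrain: Callable[[Any], Any],
    evaluate: Callable[[Any, Any], float],
) -> Dict[str, Dict[str, float]]:
    """Hypothetical driver: one pretraining run per domain, then downstream scoring.

    `domains` maps a domain name to {"unlabeled": ..., "tasks": {task_name: labeled_data}}.
    The same `pretrain` callable is applied to every domain (the domain-agnostic part).
    """
    results: Dict[str, Dict[str, float]] = {}
    for name, data in domains.items():
        model = pretrain(data["unlabeled"])  # no labels are used at this stage
        results[name] = {
            task: evaluate(model, labeled)
            for task, labeled in data["tasks"].items()
        }
    return results


def domain_means(results: Dict[str, Dict[str, float]]) -> Dict[str, float]:
    """Assumed summary: average the downstream task scores within each domain."""
    return {name: mean(list(scores.values())) for name, scores in results.items()}
```

The point of the sketch is that a single algorithm is reused unchanged for every domain, so any domain-specific logic would have to live outside `pretrain`.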


