Evaluating Self-Supervised Pretraining Without Using Labels

09/16/2020
by Colorado Reed, et al.

A common practice in unsupervised representation learning is to use labeled data to evaluate the learned representations, often the very labels that belong to the "unlabeled" training dataset. This supervised evaluation is then used to guide the training process, e.g., to select augmentation policies. However, supervised evaluation may not be possible when labeled data is difficult to obtain (as in medical imaging) or ambiguous to label (as in fashion categorization). This raises two questions: is it possible to evaluate unsupervised models without using labeled data, and can such an evaluation be used to make decisions about the training process, such as which augmentation policies to use? In this work, we show that performance on the simple self-supervised evaluation task of image rotation prediction is highly correlated with supervised performance on standard visual recognition tasks and datasets (rank correlation > 0.94). We establish this correlation across hundreds of augmentation policies and training schedules, and show how this evaluation criterion can be used to automatically select augmentation policies without using labels. Despite not using any labeled data, the selected policies perform comparably to policies chosen using supervised downstream tasks. More broadly, this work explores the idea of using unsupervised evaluation criteria to help both researchers and practitioners make decisions when training without labeled data.
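The rank correlation mentioned in the abstract can be illustrated with a small sketch. The Python below is not the authors' code: the function names (`average_ranks`, `spearman`, `select_policy`), the policy names, and all accuracy values are hypothetical and made up for illustration. It assumes each candidate augmentation policy has already been scored by rotation-prediction accuracy (no labels needed) and, for comparison only, by a supervised downstream accuracy. It then computes the Spearman rank correlation between the two scores and picks the policy with the best label-free score.

```python
from typing import Dict, List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j over the run of values tied with the one at position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman rank correlation: the Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


def select_policy(rotation_acc: Dict[str, float]) -> str:
    """Pick the policy with the highest label-free (rotation) score."""
    return max(rotation_acc, key=rotation_acc.get)


# Hypothetical scores for four candidate augmentation policies.
rotation_acc = {"policy_a": 0.61, "policy_b": 0.74, "policy_c": 0.70, "policy_d": 0.55}
supervised_acc = {"policy_a": 0.52, "policy_b": 0.68, "policy_c": 0.69, "policy_d": 0.47}

policies = sorted(rotation_acc)
rho = spearman(
    [rotation_acc[p] for p in policies],
    [supervised_acc[p] for p in policies],
)
print("rank correlation:", rho)
print("selected without labels:", select_policy(rotation_acc))
```

In this toy data the two scores disagree only on the order of the top two policies, so the label-free choice is close to, but not identical with, the supervised one. That is the situation the abstract describes as performing "comparably".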
