BYOL works even without batch statistics

10/20/2020
by Pierre H. Richemond et al.

Bootstrap Your Own Latent (BYOL) is a self-supervised learning approach for image representation. From an augmented view of an image, BYOL trains an online network to predict a target network representation of a different augmented view of the same image. Unlike contrastive methods, BYOL does not explicitly use a repulsion term built from negative pairs in its training objective. Yet, it avoids collapse to a trivial, constant representation. Thus, it has recently been hypothesized that batch normalization (BN) is critical to prevent collapse in BYOL. Indeed, BN flows gradients across batch elements, and could leak information about negative views in the batch, which could act as an implicit negative (contrastive) term. However, we experimentally show that replacing BN with a batch-independent normalization scheme (namely, a combination of group normalization and weight standardization) achieves performance comparable to vanilla BYOL (73.9% vs. 74.3% top-1 accuracy under the linear evaluation protocol on ImageNet with ResNet-50). Our finding disproves the hypothesis that the use of batch statistics is a crucial ingredient for BYOL to learn useful representations.
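
To make the setup described above concrete, here is a minimal sketch of the BYOL objective in PyTorch. The module names and layer sizes are illustrative only; the paper uses a ResNet-50 backbone with MLP projector and predictor heads, and symmetrizes the loss over the two views.

```python
# Minimal sketch of the BYOL objective: an online network (plus predictor)
# is trained to match a slowly moving target network, with no negative pairs.
import copy
import torch
import torch.nn.functional as F

def byol_loss(online_pred, target_proj):
    """Negative cosine similarity between the online prediction and the
    target projection; note the absence of any repulsion term over negatives."""
    online_pred = F.normalize(online_pred, dim=-1)
    target_proj = F.normalize(target_proj, dim=-1)
    return 2 - 2 * (online_pred * target_proj).sum(dim=-1).mean()

# Toy encoder and predictor; sizes are illustrative, not the paper's.
online_net = torch.nn.Sequential(torch.nn.Linear(512, 256), torch.nn.ReLU(),
                                 torch.nn.Linear(256, 128))
predictor = torch.nn.Sequential(torch.nn.Linear(128, 64), torch.nn.ReLU(),
                                torch.nn.Linear(64, 128))
target_net = copy.deepcopy(online_net)  # target starts as a copy of the online net
for p in target_net.parameters():
    p.requires_grad = False  # the target receives no gradients

# Two augmented views of the same batch of images (stand-in features here).
view1, view2 = torch.randn(32, 512), torch.randn(32, 512)
with torch.no_grad():
    target_proj = target_net(view2)  # stop-gradient on the target branch
loss = byol_loss(predictor(online_net(view1)), target_proj)
loss.backward()

# After each optimizer step, the target tracks the online network as an
# exponential moving average (tau close to 1).
tau = 0.996
with torch.no_grad():
    for pt, po in zip(target_net.parameters(), online_net.parameters()):
        pt.mul_(tau).add_((1 - tau) * po)
```

The only asymmetries preventing collapse here are the predictor and the stop-gradient EMA target; there is no contrastive term, which is what makes the role of batch normalization worth isolating.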
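
The batch-independent scheme the paper substitutes for BN combines group normalization with weight standardization. Below is a sketch of one common formulation in PyTorch; WSConv2d is a hypothetical helper name introduced here for illustration, and the epsilon and group count are assumptions, not the paper's exact hyperparameters.

```python
# Weight standardization + GroupNorm: neither operation uses batch statistics.
import torch
import torch.nn as nn
import torch.nn.functional as F

class WSConv2d(nn.Conv2d):
    """Conv2d whose filters are standardized (zero mean, unit variance per
    output channel) before every forward pass."""
    def forward(self, x):
        w = self.weight
        mean = w.mean(dim=(1, 2, 3), keepdim=True)
        std = w.std(dim=(1, 2, 3), keepdim=True) + 1e-5
        return F.conv2d(x, (w - mean) / std, self.bias, self.stride,
                        self.padding, self.dilation, self.groups)

# Where a ResNet block would use `nn.Conv2d(...)` followed by
# `nn.BatchNorm2d(64)`, the batch-independent variant uses:
conv = WSConv2d(3, 64, kernel_size=3, padding=1, bias=False)
norm = nn.GroupNorm(num_groups=32, num_channels=64)  # statistics per sample

x = torch.randn(8, 3, 32, 32)
y = norm(conv(x))  # each image is normalized alone; nothing crosses the batch
```

Because GroupNorm computes its statistics over channel groups within each example, no information flows between batch elements, which is precisely the property the paper's experiment isolates.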
