Understanding Self-supervised Learning with Dual Deep Networks

10/01/2020
by Yuandong Tian, et al.

We propose a novel theoretical framework for understanding self-supervised learning methods that employ dual pairs of deep ReLU networks (e.g., SimCLR, BYOL). First, we prove that in each SGD update of SimCLR, the weights at each layer are updated by a covariance operator that specifically amplifies initial random selectivities that vary across data samples but survive averaging over data augmentations; we show that this leads to the emergence of hierarchical features if the input data are generated from a hierarchical latent tree model. Within the same framework, we also show analytically that BYOL works due to an implicit contrastive term that acts as an approximate covariance operator. This term arises from the interplay between the zero-mean operation of BatchNorm and the extra predictor in the online network. Extensive ablation studies justify our theoretical findings.
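The core quantity in the abstract, the covariance of augmentation-averaged features across samples, can be illustrated with a toy NumPy sketch. This is not the paper's derivation; all shapes, constants, and the synthetic data below are illustrative assumptions. The idea: a feature direction that varies across samples but is stable under augmentation keeps a large cross-sample covariance after averaging over augmentations, while augmentation-dominated directions shrink.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup (illustrative, not from the paper):
# n samples, m augmentations per sample, d-dimensional activations.
n, m, d = 100, 20, 5

# Direction 0: varies across samples, stable under augmentation.
# Directions 1..4: dominated by per-augmentation noise.
sample_signal = rng.normal(size=(n, 1))             # survives augmentation averaging
feats = np.zeros((n, m, d))
feats[:, :, 0] = sample_signal + 0.1 * rng.normal(size=(n, m))
feats[:, :, 1:] = rng.normal(size=(n, m, d - 1))    # pure augmentation noise

# Average each sample's features over its augmentations.
f_bar = feats.mean(axis=1)                          # shape (n, d)

# Cross-sample covariance of the augmentation-averaged features:
# the quantity the covariance operator is said to amplify.
cov = np.cov(f_bar, rowvar=False)                   # shape (d, d)

# Direction 0 dominates the diagonal; noise directions shrink by ~1/m.
print(np.round(np.diag(cov), 2))
```

Under this toy model, the variance along direction 0 stays near 1 while the noise directions drop to roughly 1/m, mirroring the claim that selectivities surviving augmentation averaging are the ones amplified.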


