Understanding the Role of Nonlinearity in Training Dynamics of Contrastive Learning

06/02/2022
by   Yuandong Tian, et al.

While the empirical success of self-supervised learning (SSL) relies heavily on deep nonlinear models, many theoretical works proposed to understand SSL still focus on linear ones. In this paper, we study the role of nonlinearity in the training dynamics of contrastive learning (CL) on one- and two-layer nonlinear networks with homogeneous activation h(x) = h'(x)x. We theoretically demonstrate that (1) the presence of nonlinearity leads to many local optima even in the one-layer setting, each corresponding to certain patterns in the data distribution, while with linear activation only one major pattern can be learned; and (2) nonlinearity leads to weights that specialize into diverse patterns, a behavior that linear activation is provably incapable of. These findings suggest that models with many parameters can be regarded as a brute-force way to find the local optima induced by nonlinearity, a possible underlying reason why empirical observations such as the lottery ticket hypothesis hold. In addition, in the two-layer setting we also discover global modulation: local patterns that are discriminative from the perspective of global-level patterns are prioritized during learning, further characterizing the learning process. Simulations verify our theoretical findings.
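The homogeneity condition h(x) = h'(x)x in the abstract is satisfied by ReLU-family activations but not by saturating ones such as tanh. A minimal numerical sketch (not the authors' code; function names are illustrative) checking the identity:

```python
import numpy as np

# Homogeneous activations satisfy h(x) = h'(x) * x pointwise.
# ReLU and leaky ReLU are standard examples; tanh is not homogeneous.

def relu(x):
    return np.maximum(x, 0.0)

def relu_grad(x):
    return (x > 0).astype(float)

def leaky_relu(x, alpha=0.1):
    return np.where(x > 0, x, alpha * x)

def leaky_relu_grad(x, alpha=0.1):
    return np.where(x > 0, 1.0, alpha)

x = np.linspace(-3.0, 3.0, 7)

# The identity holds exactly for homogeneous activations:
assert np.allclose(relu(x), relu_grad(x) * x)
assert np.allclose(leaky_relu(x), leaky_relu_grad(x) * x)

# ...but fails for a non-homogeneous activation like tanh:
tanh_grad = 1.0 - np.tanh(x) ** 2
print(np.allclose(np.tanh(x), tanh_grad * x))  # False
```

The identity lets gradient expressions absorb the activation derivative into the input, which is what makes the training dynamics tractable in the paper's analysis.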

Related research:

- A Theoretical Study of Inductive Biases in Contrastive Learning (11/27/2022)
- Representation Learning Dynamics of Self-Supervised Models (09/05/2023)
- On the Learning Dynamics of Two-layer Nonlinear Convolutional Neural Networks (05/24/2019)
- Contrastive Learning Inverts the Data Generating Process (02/17/2021)
- Understanding Multi-phase Optimization Dynamics and Rich Nonlinear Behaviors of ReLU Networks (05/21/2023)
- Understanding Self-supervised Learning with Dual Deep Networks (10/01/2020)
