Towards Resolving the Implicit Bias of Gradient Descent for Matrix Factorization: Greedy Low-Rank Learning

12/17/2020
by Zhiyuan Li et al.

Matrix factorization is a simple and natural test-bed for investigating the implicit regularization of gradient descent. Gunasekar et al. (2018) conjectured that Gradient Flow with infinitesimal initialization converges to the solution that minimizes the nuclear norm, but a series of recent papers argued that the language of norm minimization is not sufficient to fully characterize the implicit regularization. In this work, we provide theoretical and empirical evidence that for depth-2 matrix factorization, gradient flow with infinitesimal initialization is mathematically equivalent to a simple heuristic rank minimization algorithm, Greedy Low-Rank Learning, under some reasonable assumptions. This generalizes the rank minimization view from previous works to a much broader setting and enables us to construct counter-examples that refute the conjecture of Gunasekar et al. (2018). We also extend the results to the case of depth ≥ 3, and we show that the benefit of greater depth is that the above convergence depends much more weakly on the initialization magnitude, so that this rank minimization is more likely to take effect for initializations of practical scale.
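To make the greedy procedure concrete, the following is a minimal illustrative sketch of a Greedy Low-Rank Learning style algorithm for the symmetric factorization objective L(U) = ||M - UU^T||_F^2 / 2: the rank is grown one unit at a time, each new direction is the top eigenvector of the residual at the current solution, and gradient descent is then run at the fixed rank. The function name, hyperparameters, and stopping rule here are illustrative choices, not the paper's exact formulation.

```python
import numpy as np

def greedy_low_rank_learning(M, max_rank, lr=0.01, inner_steps=2000, tol=1e-8):
    """Illustrative sketch of greedy low-rank learning on
    L(U) = ||M - U U^T||_F^2 / 2 for a symmetric PSD target M.
    Rank is increased one unit per phase; each new rank-one direction
    is the top eigenvector of the residual M - U U^T."""
    n = M.shape[0]
    U = np.zeros((n, 0))
    for r in range(1, max_rank + 1):
        # Residual at the current solution; its top eigenvector is the
        # locally most loss-reducing rank-one direction to add.
        R = M - U @ U.T
        w, V = np.linalg.eigh((R + R.T) / 2)
        if w[-1] <= tol:
            break  # no positive residual direction left: stop early
        v = V[:, -1:]
        U = np.hstack([U, 1e-3 * v])  # tiny rank-one perturbation (mimics
                                      # infinitesimal initialization)
        for _ in range(inner_steps):  # gradient descent at fixed rank r
            G = (U @ U.T - M) @ U
            U = U - lr * G
    return U @ U.T

# Usage: greedily recover a rank-2 PSD matrix.
rng = np.random.default_rng(0)
A = rng.standard_normal((6, 2))
M = A @ A.T
X = greedy_low_rank_learning(M, max_rank=4)
```

In this toy setting the greedy procedure recovers the low-rank target without ever needing the full rank budget, which is the rank-minimization behavior the paper attributes to gradient flow from small initialization.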


Related research

- Implicit Regularization in Matrix Factorization (05/25/2017): We study implicit regularization when optimizing an underdetermined quad...
- Implicit Greedy Rank Learning in Autoencoders via Overparameterized Linear Networks (07/02/2021): Deep linear networks trained with gradient descent yield low rank soluti...
- Over-Parametrized Matrix Factorization in the Presence of Spurious Stationary Points (12/25/2021): Motivated by the emerging role of interpolating machines in signal proce...
- Implicit Regularization in Matrix Sensing via Mirror Descent (05/28/2021): We study discrete-time mirror descent applied to the unregularized empir...
- Implicit Regularization in Deep Tensor Factorization (05/04/2021): Attempts of studying implicit regularization associated to gradient desc...
- Deep Linear Networks Dynamics: Low-Rank Biases Induced by Initialization Scale and L2 Regularization (06/30/2021): For deep linear networks (DLN), various hyperparameters alter the dynami...
- Implicit Regularization Towards Rank Minimization in ReLU Networks (01/30/2022): We study the conjectured relationship between the implicit regularizatio...
