Implicit Balancing and Regularization: Generalization and Convergence Guarantees for Overparameterized Asymmetric Matrix Sensing

03/24/2023
by Mahdi Soltanolkotabi, et al.

Recently, there has been significant progress in understanding the convergence and generalization properties of gradient-based methods for training overparameterized learning models. However, many aspects remain largely mysterious, including the role of small random initialization and how the various parameters of the model are coupled during gradient-based updates to facilitate good generalization. A series of recent papers has begun to study this role for non-convex formulations of symmetric positive semidefinite (PSD) matrix sensing problems, which involve reconstructing a low-rank PSD matrix from a few linear measurements. The underlying symmetry and positive semidefiniteness are crucial to existing convergence and generalization guarantees for this problem. In this paper, we study a general overparameterized low-rank matrix sensing problem where one wishes to reconstruct an asymmetric rectangular low-rank matrix from a few linear measurements. We prove that an overparameterized model trained via factorized gradient descent converges to the low-rank matrix generating the measurements. We show that in this setting, factorized gradient descent enjoys two implicit properties: (1) a coupling property, where the factors remain coupled in various ways throughout the gradient update trajectory, and (2) an algorithmic regularization property, where the iterates show a propensity towards low-rank models despite the overparameterized nature of the factorized model. These two implicit properties in turn allow us to show that the gradient descent trajectory from small random initialization moves towards solutions that are both globally optimal and generalize well.
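As a rough illustration of the setting described in the abstract, the sketch below runs factorized gradient descent on a synthetic asymmetric matrix sensing problem. It is a minimal sketch under stated assumptions, not the authors' implementation: the Gaussian measurement model, the loss scaling 1/(2m), the step size, the initialization scale, and the helper names (`make_problem`, `factorized_gd`, `relative_error`, `numerical_rank`) are all chosen here for illustration and do not come from the paper.

```python
import numpy as np

def make_problem(n1, n2, r, m, seed=0):
    """Synthetic instance (assumed setup): rank-r target, Gaussian measurements."""
    rng = np.random.default_rng(seed)
    P, _ = np.linalg.qr(rng.standard_normal((n1, r)))   # orthonormal left factor
    Q, _ = np.linalg.qr(rng.standard_normal((n2, r)))   # orthonormal right factor
    X_star = P @ np.diag(np.linspace(1.0, 0.5, r)) @ Q.T
    A = rng.standard_normal((m, n1 * n2))               # row i is vec(A_i)
    y = A @ X_star.ravel()                              # y_i = <A_i, X_star>
    return A, y, X_star

def factorized_gd(A, y, n1, n2, k, alpha=1e-3, step=0.1, iters=2000, seed=1):
    """Gradient descent on f(U, V) = 1/(2m) * sum_i (<A_i, U V^T> - y_i)^2.

    k may exceed the true rank (overparameterization); alpha is the scale
    of the small random initialization.
    """
    rng = np.random.default_rng(seed)
    m = A.shape[0]
    U = alpha * rng.standard_normal((n1, k))
    V = alpha * rng.standard_normal((n2, k))
    for _ in range(iters):
        resid = A @ (U @ V.T).ravel() - y
        G = (A.T @ resid).reshape(n1, n2) / m           # gradient w.r.t. X = U V^T
        # Simultaneous update of both factors using the old U and V.
        U, V = U - step * (G @ V), V - step * (G.T @ U)
    return U, V

def relative_error(U, V, X_star):
    """Frobenius-norm error of U V^T relative to the target."""
    return np.linalg.norm(U @ V.T - X_star) / np.linalg.norm(X_star)

def numerical_rank(X, tol=0.05):
    """Number of singular values above tol (tol is an arbitrary choice)."""
    return int(np.sum(np.linalg.svd(X, compute_uv=False) > tol))

if __name__ == "__main__":
    # Fewer measurements (300) than matrix entries (400); k = 6 > r = 2.
    n1, n2, r, k, m = 20, 20, 2, 6, 300
    A, y, X_star = make_problem(n1, n2, r, m)
    U, V = factorized_gd(A, y, n1, n2, k, alpha=1e-4)
    print("relative error:", relative_error(U, V, X_star))
    print("numerical rank of U V^T:", numerical_rank(U @ V.T))
```

Printing the relative error and the numerical rank of the product gives a quick way to observe the two behaviours the abstract mentions: whether the iterates approach the measured matrix, and whether the product stays close to low rank even though `k` exceeds the true rank. Whether recovery succeeds for a given number of measurements `m` depends on the instance; the values above are arbitrary.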

research
01/13/2021

Beyond Procrustes: Balancing-Free Gradient Descent for Asymmetric Low-Rank Matrix Sensing

Low-rank matrix estimation plays a central role in various applications ...
research
09/04/2023

Asymmetric matrix sensing by gradient descent with small random initialization

We study matrix sensing, which is the problem of reconstructing a low-ra...
research
03/22/2023

A General Algorithm for Solving Rank-one Matrix Sensing

Matrix sensing has many real-world applications in science and engineeri...
research
02/02/2023

The Power of Preconditioning in Overparameterized Low-Rank Matrix Sensing

We propose , a preconditioned gradient descent method to tackle the low-...
research
05/28/2021

Implicit Regularization in Matrix Sensing via Mirror Descent

We study discrete-time mirror descent applied to the unregularized empir...
research
06/16/2022

Gradient Descent for Low-Rank Functions

Several recent empirical studies demonstrate that important machine lear...
