Preconditioned Gradient Descent for Overparameterized Nonconvex Burer–Monteiro Factorization with Global Optimality Certification

06/07/2022
by Gavin Zhang, et al.

We consider using gradient descent to minimize the nonconvex function f(X)=ϕ(XX^T) over an n×r factor matrix X, in which ϕ is an underlying smooth convex cost function defined over n×n matrices. While only a second-order stationary point X can be provably found in reasonable time, if X is additionally rank deficient, then its rank deficiency certifies it as being globally optimal. This way of certifying global optimality necessarily requires the search rank r of the current iterate X to be overparameterized with respect to the rank r^⋆ of the global minimizer X^⋆. Unfortunately, overparameterization significantly slows down the convergence of gradient descent, from a linear rate with r=r^⋆ to a sublinear rate when r>r^⋆, even when ϕ is strongly convex. In this paper, we propose an inexpensive preconditioner that restores the convergence rate of gradient descent to linear in the overparameterized case, while also making it agnostic to possible ill-conditioning in the global minimizer X^⋆.
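
The slowdown and its repair can be illustrated with a small numerical sketch. The abstract does not state the form of the preconditioner, so everything below is an assumption made for illustration: the test cost ϕ(M) = ½‖M − M^⋆‖_F² (strongly convex), a right preconditioner of the form (X^T X + ηI)^{-1} with η set to the current residual norm, and all constants (sizes, step size, tolerances). The names `grad_f`, `residual`, and `run` are hypothetical helpers, not the authors' code.

```python
import numpy as np

def grad_f(X, M_star):
    # Gradient of f(X) = 0.5 * ||X X^T - M*||_F^2, i.e. 2 (X X^T - M*) X.
    return 2.0 * (X @ X.T - M_star) @ X

def residual(X, M_star):
    # Frobenius norm of X X^T - M* (the default matrix norm in NumPy).
    return np.linalg.norm(X @ X.T - M_star)

def run(X0, M_star, step, iters, precondition, tol=1e-9):
    X = X0.copy()
    r = X.shape[1]
    for _ in range(iters):
        err = residual(X, M_star)
        if err < tol:  # stop early once the residual is negligible
            break
        G = grad_f(X, M_star)
        if precondition:
            # Assumed preconditioner: right-multiply by (X^T X + eta I)^{-1},
            # with damping eta equal to the current residual norm.
            P = X.T @ X + err * np.eye(r)
            G = np.linalg.solve(P, G.T).T  # P is symmetric, so this is G P^{-1}
        X = X - step * G
    return X

rng = np.random.default_rng(0)
n, r_star, r = 20, 2, 4  # search rank r exceeds the true rank r_star

# Rank-2 ground truth M* = Z Z^T with eigenvalues 1 and 0.25.
Q, _ = np.linalg.qr(rng.standard_normal((n, r_star)))
Z = Q * np.array([1.0, 0.5])
M_star = Z @ Z.T

# Start near the truth, with small noise in all r columns.
X0 = np.hstack([Z, np.zeros((n, r - r_star))])
X0 = X0 + 0.01 * rng.standard_normal((n, r))

X_gd = run(X0, M_star, step=0.1, iters=1000, precondition=False)
X_prec = run(X0, M_star, step=0.1, iters=1000, precondition=True)

err_gd = residual(X_gd, M_star)
err_prec = residual(X_prec, M_star)

# Numerical rank of the preconditioned iterate (threshold is an assumption).
sing_vals = np.linalg.svd(X_prec, compute_uv=False)
rank_prec = int(np.sum(sing_vals > 1e-3))

print("plain GD residual:         ", err_gd)
print("preconditioned GD residual:", err_prec)
print("numerical rank of X:", rank_prec, "with search rank", r)
```

With the same step size and iteration budget, the plain iteration stalls in the two surplus directions, while the preconditioned one reaches the stopping tolerance. The final iterate is numerically rank deficient (rank 2 with r = 4), which is the situation the abstract describes as certifying global optimality. The sketch only reports that rank; it does not implement a certification procedure.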
