Convergence of Alternating Gradient Descent for Matrix Factorization

05/11/2023
by Rachel Ward, et al.

We consider alternating gradient descent (AGD) with fixed step size η > 0, applied to the asymmetric matrix factorization objective. We show that, for a rank-r matrix 𝐀 ∈ ℝ^{m×n}, T = O((σ_1(𝐀)/σ_r(𝐀))^2 log(1/ϵ)) iterations of alternating gradient descent suffice to reach an ϵ-optimal factorization ‖𝐀 − 𝐗_T 𝐘_T^⊺‖_F^2 ≤ ϵ‖𝐀‖_F^2 with high probability, starting from an atypical random initialization. The factors have rank d > r, so that 𝐗_T ∈ ℝ^{m×d} and 𝐘_T ∈ ℝ^{n×d}. Experiments suggest that our proposed initialization is not merely of theoretical benefit, but significantly improves the convergence of gradient descent in practice. Our proof is conceptually simple: a uniform PL inequality and a uniform Lipschitz smoothness constant are guaranteed for a sufficient number of iterations, starting from our random initialization. Our proof method should be useful for extending and simplifying convergence analyses for a broader class of nonconvex low-rank factorization problems.
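To make the setting concrete, here is a minimal numpy sketch of AGD on the factorization objective f(𝐗, 𝐘) = ½‖𝐀 − 𝐗𝐘^⊺‖_F^2 with overparameterized factor rank d > r. The unbalanced initialization (random 𝐗_0, zero 𝐘_0), the scalings, and the step size below are illustrative assumptions, not the paper's exact prescription.

```python
import numpy as np

def agd_factorize(A, d, eta, T, seed=0):
    """Alternating gradient descent on f(X, Y) = 0.5 * ||A - X Y^T||_F^2."""
    rng = np.random.default_rng(seed)
    m, n = A.shape
    # Unbalanced initialization: random X_0, zero Y_0 (assumed scale).
    X = rng.standard_normal((m, d)) / np.sqrt(m)
    Y = np.zeros((n, d))
    for _ in range(T):
        X = X - eta * (X @ Y.T - A) @ Y    # gradient step in X: grad_X = (X Y^T - A) Y
        Y = Y - eta * (X @ Y.T - A).T @ X  # then in Y, using the updated X
    return X, Y

# Usage: factor a random rank-5 matrix, normalized so that sigma_1(A) = 1,
# and report the relative error ||A - X Y^T||_F^2 / ||A||_F^2.
rng = np.random.default_rng(1)
A = rng.standard_normal((100, 5)) @ rng.standard_normal((5, 80))
A = A / np.linalg.norm(A, 2)
X, Y = agd_factorize(A, d=10, eta=0.2, T=2000)
print(np.linalg.norm(A - X @ Y.T) ** 2 / np.linalg.norm(A) ** 2)
```

Note the alternating structure: the Y-step uses the freshly updated X, rather than taking a simultaneous gradient step in both factors.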


Related research

Fast global convergence of gradient descent for low-rank matrix approximation (05/30/2023)
This paper investigates gradient descent for solving low-rank matrix app...

Algorithmic Regularization in Model-free Overparametrized Asymmetric Matrix Factorization (03/06/2022)
We study the asymmetric matrix factorization problem under a natural non...

Adaptive gradient descent without descent (10/21/2019)
We present a strikingly simple proof that two rules are sufficient to au...

Optimal Sample Complexity of Gradient Descent for Amplitude Flow via Non-Lipschitz Matrix Concentration (10/31/2020)
We consider the problem of recovering a real-valued n-dimensional signal...

Expanded Alternating Optimization of Nonconvex Functions with Applications to Matrix Factorization and Penalized Regression (12/12/2014)
We propose a general technique for improving alternating optimization (A...

Randomly Initialized Alternating Least Squares: Fast Convergence for Matrix Sensing (04/25/2022)
We consider the problem of reconstructing rank-one matrices from random ...

How and When Random Feedback Works: A Case Study of Low-Rank Matrix Factorization (11/17/2021)
The success of gradient descent in ML and especially for learning neural...
