Escaping From Saddle Points --- Online Stochastic Gradient for Tensor Decomposition

03/06/2015
by   Rong Ge, et al.

We analyze stochastic gradient descent for optimizing non-convex functions. For many non-convex functions the goal is to find a reasonable local minimum, and the main concern is that gradient updates become trapped at saddle points. In this paper we identify a strict saddle property for non-convex problems that allows for efficient optimization. Using this property, we show that stochastic gradient descent converges to a local minimum in a polynomial number of iterations. To the best of our knowledge, this is the first work that gives global convergence guarantees for stochastic gradient descent on non-convex functions with exponentially many local minima and saddle points. Our analysis can be applied to orthogonal tensor decomposition, which is widely used in learning a rich class of latent variable models. We propose a new optimization formulation for the tensor decomposition problem that has the strict saddle property. As a result, we get the first online algorithm for orthogonal tensor decomposition with a global convergence guarantee.
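To make the saddle-point concern concrete, here is a minimal Python sketch. It is an illustration only, not the algorithm or the analysis from the paper. The toy function f(x, y) = 0.5*x^2 + 0.25*y^4 - 0.5*y^2, the step size, the noise level, and the helper names `grad` and `descend` are all assumptions chosen for this example. The function has a saddle at the origin (one direction of negative curvature) and local minima at (0, 1) and (0, -1).

```python
import random


def grad(x, y):
    """Gradient of the toy function f(x, y) = 0.5*x**2 + 0.25*y**4 - 0.5*y**2."""
    return x, y**3 - y


def descend(steps=2000, lr=0.05, noise=0.0, seed=0):
    """Run gradient descent starting exactly at the saddle point (0, 0).

    `noise` scales zero-mean Gaussian perturbations added to the gradient,
    a crude stand-in for the randomness of a stochastic gradient.
    """
    rng = random.Random(seed)
    x, y = 0.0, 0.0
    for _ in range(steps):
        gx, gy = grad(x, y)
        gx += noise * rng.gauss(0.0, 1.0)
        gy += noise * rng.gauss(0.0, 1.0)
        x -= lr * gx
        y -= lr * gy
    return x, y


plain = descend(noise=0.0)  # exact gradient: the iterate never leaves the saddle
noisy = descend(noise=0.1)  # noisy gradient: the iterate drifts toward a minimum
print("plain:", plain)
print("noisy:", noisy)
```

In this sketch the exact gradient is zero at the origin, so the noise-free run stays there, while the noisy run is pushed along the negative-curvature direction and settles near one of the two minima. Which minimum it reaches depends on the random seed.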


