Fast Global Convergence via Landscape of Empirical Loss

02/13/2018
by   Chao Qu, et al.

While optimizing convex objective (loss) functions has been a powerhouse for machine learning for at least two decades, non-convex loss functions have recently attracted fast-growing interest because of desirable properties such as superior robustness and classification accuracy compared with their convex counterparts. The main obstacle for non-convex estimators is that finding the optimal solution is, in general, intractable. In this paper, we study the computational issues for some non-convex M-estimators. In particular, we show that stochastic variance reduction methods converge to the global optimum at a linear rate by exploiting the statistical property of the population loss. En route, we improve the convergence analysis for the batch gradient method in Mei et al. (2016).
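To make the phrase "stochastic variance reduction" concrete, the following is a minimal sketch of one well-known method of that family, SVRG, applied to a small non-convex M-estimation problem. The abstract does not say which variance-reduced method or which loss the paper analyzes, so everything here is an assumption chosen for illustration: the squared loss on a sigmoid output, the synthetic data, the step size, and all function names (`svrg`, `grad_i`, `empirical_loss`, and so on) are hypothetical and not taken from the paper.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def dot(w, x):
    return sum(wj * xj for wj, xj in zip(w, x))


def loss_i(w, x, y):
    # Illustrative non-convex loss (an assumption): squared error on a sigmoid output.
    return (y - sigmoid(dot(w, x))) ** 2


def grad_i(w, x, y):
    # Gradient of loss_i with respect to w, via the chain rule.
    s = sigmoid(dot(w, x))
    coef = -2.0 * (y - s) * s * (1.0 - s)
    return [coef * xj for xj in x]


def empirical_loss(w, data):
    return sum(loss_i(w, x, y) for x, y in data) / len(data)


def full_grad(w, data):
    g = [0.0] * len(w)
    for x, y in data:
        gi = grad_i(w, x, y)
        for j in range(len(w)):
            g[j] += gi[j] / len(data)
    return g


def svrg(data, w0, step, epochs, inner_steps, rng):
    # SVRG: each epoch computes one full gradient at a snapshot point, then
    # takes cheap stochastic steps whose gradient estimate is corrected by it.
    w_snap = list(w0)
    for _ in range(epochs):
        mu = full_grad(w_snap, data)
        w = list(w_snap)
        for _ in range(inner_steps):
            x, y = data[rng.randrange(len(data))]
            g_cur = grad_i(w, x, y)
            g_snap = grad_i(w_snap, x, y)
            # Variance-reduced estimate: g_cur - g_snap + mu (unbiased for the full gradient).
            w = [wj - step * (gc - gs + m)
                 for wj, gc, gs, m in zip(w, g_cur, g_snap, mu)]
        w_snap = w  # use the last inner iterate as the next snapshot
    return w_snap


# Synthetic data (hypothetical): noisy sigmoid responses around a "true" parameter.
rng = random.Random(0)
w_true = [1.0, -2.0]
data = []
for _ in range(200):
    x = [rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)]
    y = sigmoid(dot(w_true, x)) + 0.05 * rng.gauss(0.0, 1.0)
    data.append((x, y))

w0 = [0.0, 0.0]
w_hat = svrg(data, w0, step=0.2, epochs=30, inner_steps=200, rng=rng)
print("initial loss:", empirical_loss(w0, data))
print("final loss:  ", empirical_loss(w_hat, data))
```

The sketch only illustrates the mechanics of the update. It does not demonstrate the linear rate or the global-convergence guarantee claimed in the abstract, which depend on the statistical conditions established in the full text.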


