DeepAI AI Chat
Log In Sign Up

Why Do Local Methods Solve Nonconvex Problems?

by   Tengyu Ma, et al.

Non-convex optimization is ubiquitous in modern machine learning. Researchers devise non-convex objective functions and optimize them using off-the-shelf optimizers such as stochastic gradient descent and its variants, which leverage the local geometry and update iteratively. Even though solving non-convex functions is NP-hard in the worst case, the optimization quality in practice is often not an issue – optimizers are largely believed to find approximate global minima. Researchers hypothesize a unified explanation for this intriguing phenomenon: most of the local minima of the practically-used objectives are approximately global minima. We rigorously formalize it for concrete instances of machine learning problems.


page 1

page 2

page 3

page 4


Tackling benign nonconvexity with smoothing and stochastic gradients

Non-convex optimization problems are ubiquitous in machine learning, esp...

Maximin Optimization for Binary Regression

We consider regression problems with binary weights. Such optimization p...

Fighting the curse of dimensionality: A machine learning approach to finding global optima

Finding global optima in high-dimensional optimization problems is extre...

Natasha: Faster Non-Convex Stochastic Optimization Via Strongly Non-Convex Parameter

Given a nonconvex function f(x) that is an average of n smooth functions...

Every Local Minimum is a Global Minimum of an Induced Model

For non-convex optimization in machine learning, this paper proves that ...

Git Re-Basin: Merging Models modulo Permutation Symmetries

The success of deep learning is thanks to our ability to solve certain m...

Task Embedded Coordinate Update: A Realizable Framework for Multivariate Non-convex Optimization

We in this paper propose a realizable framework TECU, which embeds task-...