A SMART Stochastic Algorithm for Nonconvex Optimization with Applications to Robust Machine Learning

10/04/2016
by   Aleksandr Aravkin, et al.
0

In this paper, we show how to transform any optimization problem that arises from fitting a machine learning model into one that (1) detects and removes contaminated data from the training set while (2) simultaneously fitting the trimmed model on the uncontaminated data that remains. To solve the resulting nonconvex optimization problem, we introduce a fast stochastic proximal-gradient algorithm that incorporates prior knowledge through nonsmooth regularization. For datasets of size n, our approach requires O(n^2/3/ε) gradient evaluations to reach ε-accuracy and, when a certain error bound holds, the complexity improves to O(κ n^2/3(1/ε)). These rates are n^1/3 times better than those achieved by typical, full gradient methods.

READ FULL TEXT
research
12/22/2021

Accelerated Proximal Alternating Gradient-Descent-Ascent for Nonconvex Minimax Machine Learning

Alternating gradient-descent-ascent (AltGDA) is an optimization algorith...
research
01/16/2023

Faster Gradient-Free Algorithms for Nonsmooth Nonconvex Stochastic Optimization

We consider the optimization problem of the form min_x ∈ℝ^d f(x) ≜𝔼_ξ [F...
research
08/20/2020

An Optimal Hybrid Variance-Reduced Algorithm for Stochastic Composite Nonconvex Optimization

In this note we propose a new variant of the hybrid variance-reduced pro...
research
07/03/2014

Global convergence of splitting methods for nonconvex composite optimization

We consider the problem of minimizing the sum of a smooth function h wit...
research
10/29/2020

A Single-Loop Smoothed Gradient Descent-Ascent Algorithm for Nonconvex-Concave Min-Max Problems

Nonconvex-concave min-max problem arises in many machine learning applic...
research
01/24/2018

Training Set Debugging Using Trusted Items

Training set bugs are flaws in the data that adversely affect machine le...
research
10/03/2020

Expectigrad: Fast Stochastic Optimization with Robust Convergence Properties

Many popular adaptive gradient methods such as Adam and RMSProp rely on ...

Please sign up or login with your details

Forgot password? Click here to reset