Adam-family Methods for Nonsmooth Optimization with Convergence Guarantees

05/06/2023
by Nachuan Xiao, et al.

In this paper, we present a comprehensive study of the convergence properties of Adam-family methods for nonsmooth optimization, especially in the training of nonsmooth neural networks. We introduce a novel framework that adopts a two-timescale updating scheme and prove its convergence under mild assumptions. The proposed framework encompasses various popular Adam-family methods, thereby providing convergence guarantees for these methods when training nonsmooth neural networks. Furthermore, we develop stochastic subgradient methods that incorporate gradient clipping for training nonsmooth neural networks under heavy-tailed noise. Within our framework, we show that these methods converge even when the evaluation noise is only assumed to be integrable. Extensive numerical experiments demonstrate the efficiency and robustness of the proposed methods.
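The abstract does not spell out the update rules, but the sketch below illustrates what a two-timescale Adam-family step with optional gradient clipping can look like. It is a minimal illustration, not the paper's algorithm: the step sizes `alpha` and `beta`, the clipping threshold `tau`, and the moment-decay constants are assumed placeholder names, and the stochastic subgradient `g` is supplied by the caller.

```python
import numpy as np

def clip_subgradient(g, tau):
    """Clip a stochastic subgradient to norm at most tau (useful under heavy-tailed noise)."""
    norm = np.linalg.norm(g)
    return g if norm <= tau else (tau / norm) * g

def two_timescale_adam_step(x, m, v, g, alpha, beta, eps=1e-8, tau=None):
    """
    One illustrative Adam-family step with a two-timescale updating scheme:
    the moment estimate m is refreshed with the larger step size beta (fast
    timescale), while the iterate x moves with the smaller step size alpha
    (slow timescale). Parameter names are assumptions for illustration only.
    """
    if tau is not None:                      # optional clipping for heavy-tailed noise
        g = clip_subgradient(g, tau)
    m = (1.0 - beta) * m + beta * g          # fast timescale: tracks the subgradient
    v = 0.999 * v + 0.001 * g ** 2           # Adam-style second-moment estimate
    x = x - alpha * m / (np.sqrt(v) + eps)   # slow timescale: parameter update
    return x, m, v
```

In a training loop one would typically choose schedules with `alpha` decaying faster than `beta` (for example, `alpha` proportional to 1/k and `beta` to 1/k^(2/3)), so that the moment estimate evolves on a faster timescale than the iterates; the precise step-size conditions are part of the paper's assumptions and are not reproduced here.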


Related research

Convergence Guarantees for Stochastic Subgradient Methods in Nonsmooth Nonconvex Optimization (07/19/2023)
In this paper, we investigate the convergence properties of the stochast...

Majorization Minimization Methods to Distributed Pose Graph Optimization with Convergence Guarantees (03/11/2020)
In this paper, we consider the problem of distributed pose graph optimiz...

NAMSG: An Efficient Method For Training Neural Networks (05/04/2019)
We introduce NAMSG, an adaptive first-order algorithm for training neura...

Convergence Analysis for Training Stochastic Neural Networks via Stochastic Gradient Descent (12/17/2022)
In this paper, we carry out numerical analysis to prove convergence of a...

AdaFamily: A family of Adam-like adaptive gradient methods (03/03/2022)
We propose AdaFamily, a novel method for training deep neural networks. ...

A Bregman Learning Framework for Sparse Neural Networks (05/10/2021)
We propose a learning framework based on stochastic Bregman iterations t...

Generalized Proximal Methods for Pose Graph Optimization (12/04/2020)
In this paper, we generalize proximal methods that were originally desig...
