ZO-AdaMM: Zeroth-Order Adaptive Momentum Method for Black-Box Optimization

10/15/2019
by Xiangyi Chen, et al.

The adaptive momentum method (AdaMM), which uses past gradients to update descent directions and learning rates simultaneously, has become one of the most popular first-order optimization methods for solving machine learning problems. However, AdaMM is not suited for black-box optimization problems, where explicit gradient forms are difficult or infeasible to obtain. In this paper, we propose a zeroth-order AdaMM (ZO-AdaMM) algorithm that generalizes AdaMM to the gradient-free regime. We show that the convergence rate of ZO-AdaMM for both convex and nonconvex optimization is roughly a factor of O(√d) worse than that of the first-order AdaMM algorithm, where d is the problem size. In particular, we provide a deep understanding of why the Mahalanobis distance matters in the convergence of ZO-AdaMM and other AdaMM-type methods. As a byproduct, our analysis takes a first step toward understanding adaptive learning rate methods for nonconvex constrained optimization. Furthermore, we demonstrate two applications: designing per-image and universal adversarial attacks against black-box neural networks. We perform extensive experiments on ImageNet and empirically show that ZO-AdaMM converges much faster to a solution of high accuracy than six state-of-the-art ZO optimization methods.
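As a rough illustration of the idea, the sketch below combines a two-point Gaussian-smoothing zeroth-order gradient estimator with an AMSGrad-style adaptive momentum update (the AdaMM variant analyzed in this line of work). The smoothing radius, number of random directions, step size, and quadratic test function are illustrative assumptions for this sketch, not the paper's exact settings.

```python
import numpy as np

def zo_gradient(f, x, mu=1e-3, num_dirs=10):
    """Two-point Gaussian-smoothing gradient estimate:
    g ≈ (1/q) * sum_i [f(x + mu*u_i) - f(x)] / mu * u_i, with u_i ~ N(0, I).
    Uses only function evaluations, never an explicit gradient."""
    d = x.size
    fx = f(x)
    g = np.zeros(d)
    for _ in range(num_dirs):
        u = np.random.randn(d)
        g += (f(x + mu * u) - fx) / mu * u
    return g / num_dirs

def zo_adamm(f, x0, steps=500, lr=0.01, beta1=0.9, beta2=0.999, eps=1e-8):
    """ZO-AdaMM sketch: zeroth-order gradient estimates fed into an
    AMSGrad-style update, where v_hat is the running elementwise max of v."""
    x = x0.copy()
    m = np.zeros_like(x)
    v = np.zeros_like(x)
    v_hat = np.zeros_like(x)
    for _ in range(steps):
        g = zo_gradient(f, x)
        m = beta1 * m + (1 - beta1) * g        # momentum (descent direction)
        v = beta2 * v + (1 - beta2) * g**2     # second-moment estimate
        v_hat = np.maximum(v_hat, v)           # AMSGrad correction
        x -= lr * m / (np.sqrt(v_hat) + eps)   # coordinate-wise scaled step
    return x

# Illustrative use: minimize a quadratic without access to its gradient.
d = 20
A = np.diag(np.linspace(1.0, 10.0, d))
f = lambda x: 0.5 * x @ A @ x
x_final = zo_adamm(f, np.random.randn(d))
print(f"f(x) after ZO-AdaMM: {f(x_final):.4e}")
```

The variance of the zeroth-order estimate grows with the dimension d, which is the intuition behind the O(√d) gap, relative to first-order AdaMM, in the convergence rate quoted above.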



Related research

- Zeroth-Order Stochastic Alternating Direction Method of Multipliers for Nonconvex Nonsmooth Optimization (05/29/2019)
- On the Convergence of A Class of Adam-Type Algorithms for Non-Convex Optimization (08/08/2018)
- Nostalgic Adam: Weighing more of the past gradients when designing the adaptive learning rate (05/19/2018)
- A Nonstochastic Control Approach to Optimization (01/19/2023)
- A Multistep Frank-Wolfe Method (10/14/2022)
- Sparse Perturbations for Improved Convergence in Stochastic Zeroth-Order Optimization (06/02/2020)
- On the Convergence of Prior-Guided Zeroth-Order Optimization Algorithms (07/21/2021)
