Accelerated Zeroth-Order Momentum Methods from Mini to Minimax Optimization

08/18/2020
by Feihu Huang, et al.

In this paper, we propose a new accelerated zeroth-order momentum (Acc-ZOM) method for solving non-convex stochastic mini-optimization problems. We prove that the Acc-ZOM method achieves a lower query complexity of $O(d^{3/4}\epsilon^{-3})$ for finding an $\epsilon$-stationary point, which improves the best known result by a factor of $O(d^{1/4})$, where $d$ denotes the parameter dimension. Unlike existing zeroth-order stochastic algorithms, which require large batches, Acc-ZOM does not. Further, we extend the Acc-ZOM method to non-convex stochastic minimax-optimization problems and propose an accelerated zeroth-order momentum descent ascent (Acc-ZOMDA) method. We prove that the Acc-ZOMDA method reaches the best known query complexity of $\tilde{O}(\kappa_y^3(d_1+d_2)^{3/2}\epsilon^{-3})$ for finding an $\epsilon$-stationary point, where $d_1$ and $d_2$ denote the dimensions of the mini and max optimization parameters, respectively, and $\kappa_y$ is the condition number. In particular, our theoretical result does not rely on the large batches required by existing methods. Moreover, we propose a momentum-based accelerated framework for minimax-optimization problems. We also present an accelerated momentum descent ascent (Acc-MDA) method for solving white-box minimax problems, and prove that it achieves the best known gradient complexity of $\tilde{O}(\kappa_y^3\epsilon^{-3})$ without large batches. Extensive experiments on black-box adversarial attacks on deep neural networks (DNNs) and on poisoning attacks demonstrate the efficiency of our algorithms.
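
To make the zeroth-order setting concrete, here is a minimal Python sketch of a descent loop that only queries function values and smooths the estimated gradient with momentum. It is an illustration under stated assumptions, not the Acc-ZOM algorithm itself: the function names (`zo_gradient`, `zo_momentum_descent`), the coordinate-wise central-difference estimator, the plain exponential-moving-average momentum, and all hyperparameter values are hypothetical choices made for this example, and the objective is deterministic rather than stochastic.

```python
def zo_gradient(f, x, mu=1e-4):
    """Estimate the gradient of f at x from function values only.

    Uses coordinate-wise central differences, which costs 2 * len(x)
    function queries per call. (Assumed estimator for illustration.)
    """
    grad = []
    for i in range(len(x)):
        x_plus = list(x)
        x_minus = list(x)
        x_plus[i] += mu
        x_minus[i] -= mu
        grad.append((f(x_plus) - f(x_minus)) / (2.0 * mu))
    return grad


def zo_momentum_descent(f, x0, steps=300, lr=0.1, beta=0.9, mu=1e-4):
    """Minimize f with zeroth-order gradient estimates and momentum.

    v is an exponential moving average of the estimated gradients;
    this is plain momentum, not a variance-reduced variant.
    """
    x = list(x0)
    v = [0.0] * len(x)
    for _ in range(steps):
        g = zo_gradient(f, x, mu)
        v = [beta * vi + (1.0 - beta) * gi for vi, gi in zip(v, g)]
        x = [xi - lr * vi for xi, vi in zip(x, v)]
    return x


def quadratic(x):
    """Toy black-box objective with its minimum at (1, 1, ..., 1)."""
    return sum((xi - 1.0) ** 2 for xi in x)


x_final = zo_momentum_descent(quadratic, [0.0, 0.0, 0.0])
```

On this toy quadratic the iterates approach the minimizer at (1, 1, 1). Note that the coordinate-wise estimator spends 2d queries per step, which is why the dependence on the dimension d matters in query-complexity bounds such as those quoted above.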

10/13/2020

Gradient Descent Ascent for Min-Max Problems on Riemannian Manifold

In the paper, we study a class of useful non-convex minimax optimization...
06/30/2021

AdaGDA: Faster Adaptive Gradient Descent Ascent Methods for Minimax Optimization

In the paper, we propose a class of faster adaptive gradient descent asc...
06/23/2021

Bregman Gradient Policy Optimization

In this paper, we design a novel Bregman gradient policy optimization fr...
05/15/2018

On the Application of Danskin's Theorem to Derivative-Free Minimax Optimization

Motivated by Danskin's theorem, gradient-based methods have been applied...
07/26/2021

Enhanced Bilevel Optimization via Bregman Distance

Bilevel optimization has been widely applied in many machine learning probl...
06/28/2018

Direct Acceleration of SAGA using Sampled Negative Momentum

Variance reduction is a simple and effective technique that accelerates ...
02/12/2018

Katyusha X: Practical Momentum Method for Stochastic Sum-of-Nonconvex Optimization

The problem of minimizing sum-of-nonconvex functions (i.e., convex funct...