Stochastic Zeroth-order Optimization via Variance Reduction method

by Liu Liu, et al.
University of California-Davis
The University of Sydney

Derivative-free optimization has become an important technique in machine learning for optimizing black-box models. To perform updates without explicitly computing gradients, most current approaches iteratively sample a random search direction from a Gaussian distribution and compute an estimate of the gradient along that direction. However, due to the variance introduced by the random search direction, the convergence rates and query complexities of existing methods suffer a factor of d, where d is the problem dimension. In this paper, we introduce a novel Stochastic Zeroth-order method with Variance Reduction under Gaussian smoothing (SZVR-G) and establish its complexity for optimizing non-convex problems. By reducing variance over both the sample space and the search space, our algorithm achieves complexity that is sublinear in d and strictly better than that of current approaches, in both the smooth and non-smooth cases. Moreover, we extend the proposed method to a mini-batch version. Our experimental results demonstrate the superior performance of the proposed method over existing derivative-free optimization techniques. Furthermore, we successfully apply our method to conduct universal black-box attacks on deep neural networks and present some interesting results.
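To make the setting concrete, here is a minimal sketch of the Gaussian-smoothing gradient estimator that methods of this family build on: sample directions u from a standard Gaussian and average the finite-difference quotients (f(x + μu) − f(x))/μ · u. This is only the baseline estimator whose variance motivates the paper, not the SZVR-G algorithm itself; the function names, step size, and smoothing parameter below are illustrative choices.

```python
import numpy as np

def zo_gradient(f, x, mu=1e-4, n_dirs=20, rng=None):
    """Gaussian-smoothing zeroth-order gradient estimate.

    Averages (f(x + mu*u) - f(x)) / mu * u over n_dirs random
    Gaussian directions u; uses only function evaluations of f.
    """
    rng = np.random.default_rng() if rng is None else rng
    fx = f(x)
    g = np.zeros_like(x)
    for _ in range(n_dirs):
        u = rng.standard_normal(x.shape)
        g += (f(x + mu * u) - fx) / mu * u
    return g / n_dirs

# Demo: derivative-free gradient descent on a simple quadratic
# (an illustrative objective, not from the paper).
f = lambda x: np.sum(x ** 2)
x = np.ones(5)
rng = np.random.default_rng(0)
for _ in range(200):
    x -= 0.05 * zo_gradient(f, x, rng=rng)
```

Note that each estimate costs n_dirs + 1 function queries, and its variance grows with the dimension d; this dependence is the source of the factor-of-d overhead that the variance-reduction scheme in the abstract targets.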




Zeroth-Order Hybrid Gradient Descent: Towards A Principled Black-Box Optimization Framework

In this work, we focus on the study of stochastic zeroth-order (ZO) opti...

Sparse Perturbations for Improved Convergence in Stochastic Zeroth-Order Optimization

Interest in stochastic zeroth-order (SZO) methods has recently been revi...

Zeroth-Order Stochastic Variance Reduction for Nonconvex Optimization

As application demands for zeroth-order (gradient-free) optimization acc...

Stochastically Controlled Stochastic Gradient for the Convex and Non-convex Composition problem

In this paper, we consider the convex and non-convex composition problem...

AdaDGS: An adaptive black-box optimization method with a nonlocal directional Gaussian smoothing gradient

The local gradient points to the direction of the steepest slope in an i...

On the Application of Danskin's Theorem to Derivative-Free Minimax Optimization

Motivated by Danskin's theorem, gradient-based methods have been applied...

A new derivative-free optimization method: Gaussian Crunching Search

Optimization methods are essential in solving complex problems across va...
