Distributed Zero-Order Optimization under Adversarial Noise

by Arya Akhavan et al.

We study the problem of distributed zero-order optimization for a class of strongly convex functions formed as the average of local objectives, each associated with a node in a prescribed network of connections. We propose a distributed zero-order projected gradient descent algorithm to solve this problem, in which information is exchanged only between neighbouring nodes. A key feature of the algorithm is that it queries only function values, subject to a general noise model that requires neither zero-mean nor independent errors. We derive upper bounds on the average cumulative regret and the optimization error of the algorithm, which highlight the roles played by a network connectivity parameter, the number of variables, the noise level, the strong convexity parameter of the global objective, and certain smoothness properties of the local objectives. When specialized to the standard non-distributed setting, our bound improves on the state of the art, thanks to the novel gradient estimation procedure proposed here. We also comment on lower bounds and observe that the dependence of our bound on certain function parameters is nearly optimal.
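To make the setting concrete, the following is a minimal sketch of a distributed zero-order projected gradient descent loop of the kind described above. It is not the paper's exact algorithm: the two-point sphere-sampling gradient estimator, the gossip matrix `W`, the ball constraint, and the `1/t` step size are illustrative assumptions chosen for a simple runnable example.

```python
import numpy as np

def zo_gradient(f, x, h, rng):
    # Two-point zero-order gradient estimate along a random direction
    # drawn uniformly from the unit sphere (illustrative choice).
    d = x.size
    u = rng.standard_normal(d)
    u /= np.linalg.norm(u)
    return d * (f(x + h * u) - f(x - h * u)) / (2 * h) * u

def project_ball(x, radius=10.0):
    # Euclidean projection onto a ball of the given radius,
    # standing in for a generic convex constraint set.
    n = np.linalg.norm(x)
    return x if n <= radius else x * (radius / n)

def distributed_zo_pgd(local_fs, W, x0, steps=2000, h=1e-4, seed=0):
    """Distributed zero-order projected gradient descent (sketch).

    local_fs : list of local objectives, one per node
    W        : doubly stochastic gossip matrix (n x n); W[i, j] > 0
               only if nodes i and j are neighbours in the network
    """
    rng = np.random.default_rng(seed)
    n = len(local_fs)
    X = np.tile(x0, (n, 1))          # one local iterate per node
    for t in range(1, steps + 1):
        eta = 1.0 / t                # step size suited to strong convexity
        X = W @ X                    # consensus: average with neighbours
        for i in range(n):
            # each node queries only its own local function values
            g = zo_gradient(local_fs[i], X[i], h, rng)
            X[i] = project_ball(X[i] - eta * g)
    return X
```

With quadratic local objectives `f_i(x) = ||x - c_i||^2`, the global objective is minimized at the mean of the `c_i`, and all local iterates contract toward that point while the gossip step keeps them in approximate consensus.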

