EE-Grad: Exploration and Exploitation for Cost-Efficient Mini-Batch SGD

05/19/2017
by Mehmet A. Donmez et al.

We present a generic framework for trading off fidelity and cost in computing stochastic gradients when the costs of acquiring stochastic gradients of different quality are not known a priori. We consider a mini-batch oracle that distributes a limited query budget over a number of stochastic gradients and aggregates them to estimate the true gradient. Since the optimal mini-batch size depends on the unknown cost-fidelity function, we propose an algorithm, EE-Grad, that sequentially explores the performance of mini-batch oracles and exploits the accumulated knowledge to identify the one that achieves the best cost-efficiency. We provide performance guarantees for EE-Grad relative to the optimal mini-batch oracle and illustrate these results for strongly convex objectives. A simple numerical example corroborates our theoretical findings.
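The abstract describes EE-Grad only at a high level, and this page carries no pseudocode. As a rough illustration of the exploration-exploitation idea, here is a minimal sketch that treats each candidate mini-batch size as an arm of a UCB-style bandit on a toy strongly convex objective. The cost model query_cost, the fidelity-based reward, and all other names are illustrative assumptions of this sketch, not the paper's actual EE-Grad algorithm or its guarantees.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy strongly convex objective f(w) = 0.5 * ||w||^2, so the true gradient is w.
# A stochastic gradient built from a mini-batch of size m has noise variance
# shrinking like 1/m, but each batch incurs a cost c(m) unknown a priori.
SIGMA = 1.0

def stochastic_gradient(w, m):
    return w + rng.normal(scale=SIGMA / np.sqrt(m), size=w.shape)

def query_cost(m):
    # Hidden cost-fidelity trade-off; illustrative choice for this sketch only.
    return 1.0 + 0.1 * m

batch_sizes = [1, 4, 16, 64]             # candidate mini-batch oracles
counts = np.zeros(len(batch_sizes))
efficiency = np.zeros(len(batch_sizes))  # running cost-efficiency estimates

w, eta = np.ones(10), 0.1
for t in range(1, 501):
    # Exploration/exploitation over oracles via a UCB-style index: prefer the
    # oracle with the best estimated cost-efficiency, plus a bonus that forces
    # occasional exploration of under-sampled ones.
    bonus = np.sqrt(2.0 * np.log(t) / np.maximum(counts, 1.0))
    index = np.where(counts == 0, np.inf, efficiency + bonus)
    k = int(np.argmax(index))
    m = batch_sizes[k]

    g = stochastic_gradient(w, m)

    # In this toy the true gradient (= w) is known, so fidelity can be scored
    # directly as inverse squared gradient error per unit cost; a real
    # deployment would need an observable proxy for fidelity.
    reward = 1.0 / ((np.sum((g - w) ** 2) + 1e-8) * query_cost(m))
    counts[k] += 1
    efficiency[k] += (reward - efficiency[k]) / counts[k]

    w = w - eta * g  # SGD step using the chosen oracle's gradient
```

Run over many iterations, the bandit should concentrate its queries on the batch size whose variance reduction best justifies its cost, which is the qualitative behavior the abstract attributes to EE-Grad.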


research
05/03/2020

Adaptive Learning of the Optimal Mini-Batch Size of SGD

Recent advances in the theoretical understanding of SGD (Qian et al., 201...
research
03/12/2018

Flipout: Efficient Pseudo-Independent Weight Perturbations on Mini-Batches

Stochastic neural net weights are used in a variety of contexts, includi...
research
11/06/2017

AdaBatch: Efficient Gradient Aggregation Rules for Sequential and Parallel Stochastic Gradient Methods

We study a new aggregation operator for gradients coming from a mini-bat...
research
10/31/2018

MaSS: an Accelerated Stochastic Method for Over-parametrized Learning

In this paper we introduce MaSS (Momentum-added Stochastic Solver), an a...
research
05/05/2020

Dynamically Adjusting Transformer Batch Size by Monitoring Gradient Direction Change

The choice of hyper-parameters affects the performance of neural models....
research
04/02/2023

Mini-batch k-means terminates within O(d/ε) iterations

We answer the question: "Does local progress (on batches) imply global p...
research
03/09/2020

Amortized variance reduction for doubly stochastic objectives

Approximate inference in complex probabilistic models such as deep Gauss...
