
AdaBatch: Adaptive Batch Sizes for Training Deep Neural Networks

12/06/2017
by   Aditya Devarakonda, et al.

Training deep neural networks with Stochastic Gradient Descent, or its variants, requires careful choice of both the learning rate and the batch size. While smaller batch sizes generally converge in fewer training epochs, larger batch sizes offer more parallelism and hence better computational efficiency. We have developed a new training approach that, rather than statically choosing a single batch size for all epochs, adaptively increases the batch size during the training process. Our method delivers the convergence rate of small batch sizes while achieving performance similar to large batch sizes. We analyze our approach using the standard AlexNet, ResNet, and VGG networks operating on the popular CIFAR-10, CIFAR-100, and ImageNet datasets. Our results demonstrate that learning with adaptive batch sizes can improve performance by factors of up to 6.25 on 4 NVIDIA Tesla P100 GPUs while changing accuracy by less than 1% relative to training with fixed batch sizes.
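The core idea described above — growing the batch size on a schedule during training rather than fixing it — can be sketched as a simple schedule function. This is an illustrative sketch, not the paper's exact recipe: the doubling interval, batch-size cap, and proportional learning-rate scaling below are assumptions chosen for the example.

```python
def adabatch_schedule(epoch, base_batch=128, base_lr=0.01,
                      double_every=20, max_batch=2048):
    """Illustrative adaptive batch-size schedule.

    Doubles the batch size every `double_every` epochs, capped at
    `max_batch`, and scales the learning rate by the same factor so
    the effective per-sample step stays roughly constant. All
    parameter values here are hypothetical defaults, not the
    paper's settings.
    """
    # Maximum number of doublings allowed by the cap.
    max_doublings = (max_batch // base_batch).bit_length() - 1
    doublings = min(epoch // double_every, max_doublings)
    batch = base_batch * (2 ** doublings)
    lr = base_lr * (2 ** doublings)
    return batch, lr
```

A training loop would call this at the start of each epoch and rebuild its data loader whenever the returned batch size changes; larger late-epoch batches are where the parallel-hardware speedup comes from.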


Code Repositories

AdaBatch
