Yet Another Accelerated SGD: ResNet-50 Training on ImageNet in 74.7 seconds

03/29/2019
by Masafumi Yamazaki, et al.

There is strong demand for algorithms that can execute machine learning as fast as possible, and the speed of deep learning has increased 30-fold in the past two years alone. Distributed deep learning with large mini-batches is a key technology for meeting this demand, but it is a great challenge because it is difficult to achieve high scalability on large clusters without compromising accuracy. In this paper, we introduce the optimization methods we applied to this challenge. Applying these methods, we achieved a training time of 74.7 seconds using 2,048 GPUs on the ABCI cluster. The training throughput is over 1.73 million images/sec and the top-1 validation accuracy is 75.08%.
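
To put the headline figures in perspective, the short Python sketch below divides the reported aggregate throughput by the GPU count, which works out to roughly 845 images/sec per GPU. It also estimates how many images would be processed in 74.7 seconds if that rate were sustained for the whole run. The constant-rate assumption and all variable and function names are illustrative and are not taken from the paper; the abstract only states that throughput is "over" 1.73 million images/sec, so both results are rough figures.

```python
# Back-of-the-envelope arithmetic on the figures quoted in the abstract.
# Assumption (not from the paper): the aggregate throughput is constant
# over the whole run and is split evenly across all GPUs.

TOTAL_IMAGES_PER_SEC = 1.73e6   # reported aggregate throughput ("over" this value)
NUM_GPUS = 2048                 # reported GPU count
TRAINING_SECONDS = 74.7         # reported training time


def per_gpu_throughput(total_images_per_sec: float, num_gpus: int) -> float:
    """Average images/sec handled by each GPU under an even split."""
    return total_images_per_sec / num_gpus


per_gpu = per_gpu_throughput(TOTAL_IMAGES_PER_SEC, NUM_GPUS)

# Images processed in the run if the reported rate held throughout (assumption).
images_at_constant_rate = TOTAL_IMAGES_PER_SEC * TRAINING_SECONDS

print(f"per-GPU throughput: {per_gpu:.1f} images/sec")
print(f"images at constant rate: {images_at_constant_rate:,.0f}")
```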

research
11/13/2018

ImageNet/ResNet-50 Training in 224 Seconds

Scaling the distributed deep learning to a massive GPU cluster level is ...
research
07/30/2018

Highly Scalable Deep Learning Training System with Mixed-Precision: Training ImageNet in Four Minutes

Synchronized stochastic gradient descent (SGD) optimizers with data para...
research
11/16/2018

Image Classification at Supercomputer Scale

Deep learning is extremely computationally intensive, and hardware vendo...
research
12/30/2020

Crossover-SGD: A gossip-based communication in distributed deep learning for alleviating large mini-batch problem and enhancing scalability

Distributed deep learning is an effective way to reduce the training tim...
research
06/08/2017

Accurate, Large Minibatch SGD: Training ImageNet in 1 Hour

Deep learning thrives with large neural networks and large datasets. How...
research
11/12/2017

Scale out for large minibatch SGD: Residual network training on ImageNet-1K with improved accuracy and reduced time to train

For the past 5 years, the ILSVRC competition and the ImageNet dataset ha...
research
10/17/2017

DASHMM Accelerated Adaptive Fast Multipole Poisson-Boltzmann Solver on Distributed Memory Architecture

We present an updated version of the AFMPB package for fast calculation ...