Image Classification at Supercomputer Scale

11/16/2018
by Chris Ying, et al.

Deep learning is extremely computationally intensive, and hardware vendors have responded by building faster accelerators in large clusters. Training deep learning models at petaFLOPS scale requires overcoming both algorithmic and systems software challenges. In this paper, we discuss three systems-related optimizations: (1) distributed batch normalization to control per-replica batch sizes, (2) input pipeline optimizations to sustain model throughput, and (3) 2-D torus all-reduce to speed up gradient summation. We combine these optimizations to train ResNet-50 on ImageNet to 76.3% accuracy on a 1024-chip TPU v3 Pod with a training throughput of over 1.05 million images/second and no accuracy drop.
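The abstract only names optimizations (1) and (3); the NumPy sketch below is a minimal model of the arithmetic they perform. All function names, array shapes, and the group size here are illustrative assumptions, not the paper's TPU implementation, which runs these reductions over the Pod's torus interconnect rather than in host memory.

```python
import numpy as np


def distributed_batch_norm(acts, group_size, eps=1e-3):
    """Normalize activations with statistics pooled across a group of
    replicas instead of a single replica.

    acts: (num_replicas, per_replica_batch, features). Pooling over
    `group_size` replicas decouples the batch-norm batch size from the
    (small) per-replica batch size, as in optimization (1).
    """
    n, b, f = acts.shape
    assert n % group_size == 0, "replicas must divide evenly into groups"
    grouped = acts.reshape(n // group_size, group_size * b, f)
    mean = grouped.mean(axis=1, keepdims=True)
    var = grouped.var(axis=1, keepdims=True)
    normed = (grouped - mean) / np.sqrt(var + eps)
    return normed.reshape(n, b, f)


def torus_allreduce_2d(grads):
    """Sum per-replica gradients laid out on an (R, C) torus.

    grads: (R, C, params). Phase 1 reduces along each column, phase 2
    along each row; on hardware each phase is an independent 1-D ring
    reduction. Only the arithmetic is modeled here.
    """
    # Phase 1: every replica in a column ends up with its column's sum.
    col_sum = grads.sum(axis=0, keepdims=True)       # (1, C, params)
    partial = np.broadcast_to(col_sum, grads.shape)  # (R, C, params)
    # Phase 2: combine the C column sums within each row.
    total = partial.sum(axis=1, keepdims=True)       # (R, 1, params)
    return np.broadcast_to(total, grads.shape)       # full sum everywhere


if __name__ == "__main__":
    rng = np.random.default_rng(0)

    grads = rng.standard_normal((4, 8, 10))  # 32 replicas on a 4 x 8 torus
    reduced = torus_allreduce_2d(grads)
    assert np.allclose(reduced[0, 0], grads.sum(axis=(0, 1)))

    acts = rng.standard_normal((32, 16, 64))  # per-replica batch of 16
    out = distributed_batch_norm(acts, group_size=4)  # BN batch = 4 * 16
    print(out.shape)  # (32, 16, 64)
```

The point of the two-phase reduction is that it replaces one long ring over every chip with many short, concurrent rings over each row and column of the torus, which is where the speedup in optimization (3) comes from.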
