Scheduling Optimization Techniques for Neural Network Training

10/03/2021
by Hyungjun Oh, et al.

Neural network training requires a large amount of computation, so GPUs are often used to accelerate it. Although GPUs improve performance, they are underutilized during training. This paper proposes out-of-order (ooo) backprop, an effective scheduling technique for neural network training. By exploiting the dependencies of gradient computations, ooo backprop allows their execution to be reordered to make the most of the GPU resources. We show that GPU utilization in single-GPU, data-parallel, and pipeline-parallel training can all be improved by applying ooo backprop and prioritizing critical operations. We propose three scheduling algorithms based on ooo backprop. For single-GPU training, we schedule with multi-stream out-of-order computation to mask the kernel launch overhead. In data-parallel training, we reorder the gradient computations to maximize the overlap of computation and parameter communication. In pipeline-parallel training, we prioritize critical gradient computations to reduce pipeline stalls. We evaluate our optimizations with twelve neural networks, including a lightweight computer vision model (MobileNet) and large NLP models (BERT and GPT-3), with up to forty-eight V100 GPUs. Our scheduling algorithms effectively improve the performance of single-GPU training as well as data- and pipeline-parallel training. Compared to the respective state-of-the-art training systems, the throughput is substantially improved for single-GPU, data-parallel, and pipeline-parallel training.
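The dependency structure that makes such reordering legal can be illustrated with a toy sketch. It is not the paper's scheduler. It assumes a plain sequential network in which the input gradient of layer i (`dX`) and the weight gradient of layer i (`dW`) both depend only on the input gradient of layer i+1, so no other gradient waits on a `dW`. The names `backward_deps`, `conventional_order`, `critical_first_order`, and `is_valid` are hypothetical and chosen only for this illustration.

```python
from typing import Dict, List, Tuple

# An operation is ("dX", i) for the input gradient of layer i,
# or ("dW", i) for the weight gradient of layer i (layers are 1..n).
Op = Tuple[str, int]


def backward_deps(num_layers: int) -> Dict[Op, List[Op]]:
    """Dependencies among gradient computations of a sequential network.

    Assumption: both dX_i and dW_i need only the gradient flowing in
    from the layer above, i.e. dX_{i+1}. The last layer gets it from
    the loss, so it has no dependency inside this graph.
    """
    deps: Dict[Op, List[Op]] = {}
    for i in range(num_layers, 0, -1):
        upstream: List[Op] = [("dX", i + 1)] if i < num_layers else []
        deps[("dX", i)] = list(upstream)
        deps[("dW", i)] = list(upstream)
    return deps


def conventional_order(num_layers: int) -> List[Op]:
    """Layer-by-layer backprop: dW_i and dX_i are issued together."""
    order: List[Op] = []
    for i in range(num_layers, 0, -1):
        order += [("dW", i), ("dX", i)]
    return order


def critical_first_order(num_layers: int) -> List[Op]:
    """One possible reordering: run the dX chain first, defer every dW."""
    layers = range(num_layers, 0, -1)
    return [("dX", i) for i in layers] + [("dW", i) for i in layers]


def is_valid(order: List[Op], deps: Dict[Op, List[Op]]) -> bool:
    """True if every op appears once and only after all its dependencies."""
    done = set()
    for op in order:
        if op in done or any(d not in done for d in deps[op]):
            return False
        done.add(op)
    return len(done) == len(deps)


if __name__ == "__main__":
    deps = backward_deps(4)
    print(is_valid(conventional_order(4), deps))
    print(is_valid(critical_first_order(4), deps))
```

Both orders satisfy the same dependencies. This freedom to move the weight-gradient computations is what a scheduler can use to fill idle GPU time or to overlap computation with communication.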

