Brief Announcement: On the Limits of Parallelizing Convolutional Neural Networks on GPUs

05/28/2020
by   Behnam Pourghassemi, et al.
0

GPUs are currently the platform of choice for training neural networks. However, training a deep neural network (DNN) is a time-consuming process even on GPUs because of the massive number of parameters that have to be learned. As a result, accelerating DNN training has been an area of significant research in the last couple of years. While earlier networks such as AlexNet had a linear dependency between layers and operations, state-of-the-art networks such as ResNet, PathNet, and GoogleNet have a non-linear structure that exhibits a higher level of inter-operation parallelism. However, popular deep learning (DL) frameworks such as TensorFlow and PyTorch launch the majority of neural network operations, especially convolutions, serially on GPUs and do not exploit this inter-op parallelism. In this brief announcement, we make a case for the need and potential benefit of exploiting this rich parallelism in state-of-the-art non-linear networks for reducing the training time. We identify the challenges and limitations in enabling concurrent layer execution on GPU backends (such as cuDNN) of DL frameworks and propose potential solutions.

READ FULL TEXT

page 1

page 2

page 3

research
09/08/2018

Efficient and Robust Parallel DNN Training through Model Parallelism on Multi-GPU Platform

The training process of Deep Neural Network (DNN) is compute-intensive, ...
research
07/14/2020

Layer-Parallel Training with GPU Concurrency of Deep Residual Neural Networks via Nonlinear Multigrid

A Multigrid Full Approximation Storage algorithm for solving Deep Residu...
research
01/15/2018

Leapfrogging for parallelism in deep neural networks

We present a technique, which we term leapfrogging, to parallelize back-...
research
03/15/2023

MCR-DL: Mix-and-Match Communication Runtime for Deep Learning

In recent years, the training requirements of many state-of-the-art Deep...
research
02/21/2022

Enabling On-Device Smartphone GPU based Training: Lessons Learned

Deep Learning (DL) has shown impressive performance in many mobile appli...
research
04/07/2016

Optimizing Performance of Recurrent Neural Networks on GPUs

As recurrent neural networks become larger and deeper, training times fo...

Please sign up or login with your details

Forgot password? Click here to reset