Layer-Parallel Training with GPU Concurrency of Deep Residual Neural Networks via Nonlinear Multigrid

07/14/2020
by Andrew C. Kirby, et al.

A Multigrid Full Approximation Storage (FAS) algorithm for training deep residual networks is developed, enabling layer-parallel training and concurrent computational-kernel execution on GPUs. The method demonstrates a 10.2x speedup over traditional layer-wise model-parallelism techniques using the same number of compute units.
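The idea behind layer-parallel training is to view a ResNet's forward pass as the forward-Euler discretization of an ODE and then solve for all layer states at once by iterative relaxation, rather than propagating sequentially through depth. The sketch below (an illustrative toy, not the paper's implementation; the scalar dynamics, function names, and Jacobi-style sweep are assumptions) contrasts the sequential baseline with a relaxation in which every layer update is independent within a sweep and could run concurrently; FAS multigrid accelerates the convergence of exactly this kind of iteration.

```python
import numpy as np

def residual_block(u, w, h=1.0):
    # One ResNet block seen as a forward-Euler step of u' = f(u; w):
    # u_{k+1} = u_k + h * tanh(w * u_k). Toy scalar dynamics for illustration.
    return u + h * np.tanh(w * u)

def forward_serial(u0, weights, h=1.0):
    # Baseline: propagate layer by layer, inherently sequential in depth.
    states = [u0]
    for w in weights:
        states.append(residual_block(states[-1], w, h))
    return states

def forward_layer_parallel(u0, weights, sweeps, h=1.0):
    # Jacobi-style relaxation over the whole depth: each sweep recomputes
    # every layer state from the previous sweep's states, so all layer
    # updates within a sweep are independent and could execute concurrently
    # (e.g. on separate GPU streams). This plain relaxation is exact after
    # len(weights) sweeps; multigrid (FAS) exists to converge much faster.
    n = len(weights)
    states = [u0 for _ in range(n + 1)]  # crude initial guess at every layer
    for _ in range(sweeps):
        states = [u0] + [residual_block(states[k], weights[k], h)
                         for k in range(n)]
    return states
```

For example, with four layers the relaxed solution matches the serial forward pass after four sweeps, while fewer sweeps leave the deeper layer states unconverged; a coarse-layer correction would cut the number of sweeps needed.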


Related research

12/11/2018
Layer-Parallel Training of Deep Residual Neural Networks
Residual neural networks (ResNets) are a promising class of deep neural ...

10/17/2022
PARTIME: Scalable and Parallel Processing Over Time with Deep Neural Networks
In this paper, we present PARTIME, a software library written in Python ...

05/28/2020
Brief Announcement: On the Limits of Parallelizing Convolutional Neural Networks on GPUs
GPUs are currently the platform of choice for training neural networks. ...

09/03/2020
Penalty and Augmented Lagrangian Methods for Layer-parallel Training of Residual Networks
Algorithms for training residual networks (ResNets) typically require fo...

06/22/2022
Neural Networks as Paths through the Space of Representations
Deep neural networks implement a sequence of layer-by-layer operations t...

02/10/2023
Exploiting Sparsity in Pruned Neural Networks to Optimize Large Model Training
Parallel training of neural networks at scale is challenging due to sign...

02/01/2022
Deep Layer-wise Networks Have Closed-Form Weights
There is currently a debate within the neuroscience community over the l...
