Compressing DMA Engine: Leveraging Activation Sparsity for Training Deep Neural Networks

05/03/2017
by Minsoo Rhu, et al.

Popular deep learning frameworks require users to fine-tune their memory usage so that the training data of a deep neural network (DNN) fits within the GPU physical memory. Prior work tries to address this restriction by virtualizing the memory usage of DNNs, enabling both CPU and GPU memory to be utilized for memory allocations. Despite its merits, virtualizing memory can incur significant performance overheads when the time needed to copy data back and forth from CPU memory is higher than the latency to perform the computations required for DNN forward and backward propagation. We introduce a high-performance virtualization strategy based on a "compressing DMA engine" (cDMA) that drastically reduces the size of the data structures that are targeted for CPU-side allocations. The cDMA engine offers an average 2.6x (maximum 13.8x) compression ratio by exploiting the sparsity inherent in offloaded data, improving the performance of virtualized DNNs by an average 32% (maximum 61%).
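The compression ratio comes from the fact that ReLU layers leave the offloaded activation maps largely filled with zeros. As a rough software illustration of the idea (not the paper's hardware engine), the NumPy sketch below implements a zero-value-compression-style encoding: a one-bit-per-element mask of nonzero positions plus the densely packed nonzero values. The function names and the synthetic half-sparse input are illustrative assumptions.

```python
import numpy as np

def zvc_compress(activations: np.ndarray):
    """Zero-value compression sketch: a 1-bit-per-element mask of
    nonzero positions plus the densely packed nonzero values."""
    flat = activations.ravel()
    mask = flat != 0                     # boolean nonzero mask
    nonzeros = flat[mask]                # packed nonzero values
    packed_mask = np.packbits(mask)      # 1 bit per element
    return packed_mask, nonzeros, flat.size

def zvc_decompress(packed_mask, nonzeros, n, dtype=np.float32):
    """Rebuild the dense tensor from mask + packed values."""
    mask = np.unpackbits(packed_mask, count=n).astype(bool)
    out = np.zeros(n, dtype=dtype)
    out[mask] = nonzeros
    return out

# ReLU zeroes out roughly half of a Gaussian input (synthetic example).
acts = np.maximum(np.random.randn(1 << 20).astype(np.float32), 0)
packed_mask, vals, n = zvc_compress(acts)
orig_bytes = acts.nbytes
comp_bytes = packed_mask.nbytes + vals.nbytes
print(f"compression ratio: {orig_bytes / comp_bytes:.2f}x")
assert np.array_equal(zvc_decompress(packed_mask, vals, n), acts)
```

With roughly 50% zeros, this encoding already shrinks a 4-byte-per-element float tensor to a little over half its size; the deeper, sparser activation maps are what push the average ratio reported in the paper toward 2.6x.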


Related research

02/25/2016 - vDNN: Virtualized Deep Neural Networks for Scalable, Memory-Efficient Neural Network Design
The most widely used machine learning frameworks require users to carefu...

10/08/2022 - Demand Layering for Real-Time DNN Inference with Minimized Memory Usage
When executing a deep neural network (DNN), its model parameters are loa...

02/08/2023 - ED-Batch: Efficient Automatic Batching of Dynamic Neural Networks via Learned Finite State Machines
Batching has a fundamental influence on the efficiency of deep neural ne...

07/31/2018 - Cutting Down Training Memory by Re-forwarding
Deep Neural Networks (DNN) require huge GPU memory when training on mode...

04/05/2020 - Reducing Data Motion to Accelerate the Training of Deep Neural Networks
This paper reduces the cost of DNN training by decreasing the amount of...

11/22/2019 - SparseTrain: Leveraging Dynamic Sparsity in Training DNNs on General-Purpose SIMD Processors
Our community has greatly improved the efficiency of deep learning appli...

11/18/2020 - A Novel Memory-Efficient Deep Learning Training Framework via Error-Bounded Lossy Compression
Deep neural networks (DNNs) are becoming increasingly deeper, wider, and...
