EAST: Encoding-Aware Sparse Training for Deep Memory Compression of ConvNets

12/20/2019
by   Matteo Grimaldi, et al.

The implementation of Deep Convolutional Neural Networks (ConvNets) on tiny end-nodes with limited non-volatile memory calls for smart compression strategies capable of shrinking the footprint while preserving predictive accuracy. A number of strategies exist for this purpose, from those that alter the topology of the model or its arithmetic precision, e.g. pruning and quantization, to those that apply model-agnostic compression, e.g. weight encoding. The tighter the memory constraint, the higher the probability that these techniques alone cannot meet the requirement; hence, greater awareness and cooperation across the different optimizations become mandatory. This work addresses the issue by introducing EAST, Encoding-Aware Sparse Training, a novel memory-constrained training procedure that leads quantized ConvNets towards deep memory compression. EAST implements an adaptive group pruning designed to maximize the compression rate of the weight encoding scheme (the LZ4 algorithm in this work). Compared to existing methods, EAST meets the memory constraint with lower sparsity, hence ensuring higher accuracy. Experiments on a state-of-the-art ConvNet (ResNet-9) deployed on a low-power microcontroller (ARM Cortex-M4) validate the proposal.
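The interplay the abstract describes — structured (group) pruning applied so that the pruned, quantized weight buffer encodes well — can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the `group_prune` and `compressed_size` helpers are hypothetical, 8-bit quantization uses a simple fixed scale, and `zlib` stands in for the LZ4 encoder the paper uses (both exploit runs of repeated bytes, which consecutive zeroed groups produce).

```python
import zlib
import numpy as np

def group_prune(weights, group_size, sparsity):
    """Zero out the fraction `sparsity` of contiguous weight groups
    with the smallest L1 norm (structured pruning at group granularity)."""
    flat = weights.flatten()
    pad = (-flat.size) % group_size                      # pad to a whole number of groups
    flat = np.concatenate([flat, np.zeros(pad, dtype=flat.dtype)])
    groups = flat.reshape(-1, group_size)
    norms = np.abs(groups).sum(axis=1)                   # group saliency: L1 norm
    prune_idx = np.argsort(norms)[:int(sparsity * len(groups))]
    groups[prune_idx] = 0
    return groups.flatten()[:weights.size].reshape(weights.shape)

def compressed_size(weights):
    """Bytes occupied by the quantized weight buffer after encoding
    (zlib here as a stand-in for LZ4)."""
    q = np.clip(np.round(weights * 127), -128, 127).astype(np.int8)  # naive 8-bit quantization
    return len(zlib.compress(q.tobytes(), level=9))

rng = np.random.default_rng(0)
w = rng.standard_normal((64, 64)).astype(np.float32)

# Higher group sparsity -> longer zero runs -> smaller encoded footprint.
for s in (0.0, 0.5, 0.8):
    print(f"sparsity {s:.1f}: {compressed_size(group_prune(w, 8, s))} bytes")
```

A memory-constrained training loop in the spirit of EAST would raise the group sparsity adaptively during training until `compressed_size` fits the device's non-volatile memory budget, stopping at the lowest sparsity that satisfies the constraint.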


