Compact and Computationally Efficient Representation of Deep Neural Networks

05/27/2018
by   Simon Wiedemann, et al.
0

Dot product operations between matrices are at the heart of almost any field in science and technology. In many cases, they are the component that requires the highest computational resources during execution. For instance, deep neural networks such as VGG-16 require up to 15 giga-operations in order to perform the dot products present in a single forward pass, which results in significant energy consumption and thus limits their use in resource-limited environments, e.g., on embedded devices or smartphones. One common approach to reduce the complexity of the inference is to prune and quantize the weight matrices of the neural network and to efficiently represent them using sparse matrix data structures. However, since there is no guarantee that the weight matrices exhibit significant sparsity after quantization, the sparse format may be suboptimal. In this paper we present new efficient data structures for representing matrices with low entropy statistics and show that these formats are especially suitable for representing neural networks. Alike sparse matrix data structures, these formats exploit the statistical properties of the data in order to reduce the size and execution complexity. Moreover, we show that the proposed data structures can not only be regarded as a generalization of sparse formats, but are also more energy and time efficient under practically relevant assumptions. Finally, we test the storage requirements and execution performance of the proposed formats on compressed neural networks and compare them to dense and sparse representations. We experimentally show that we are able to attain up to x15 compression ratios, x1.7 speed ups and x20 energy savings when we lossless convert state-of-the-art networks such as AlexNet, VGG-16, ResNet152 and DenseNet into the new data structures.

READ FULL TEXT

page 9

page 13

research
05/20/2019

Compressed Learning of Deep Neural Networks for OpenCL-Capable Embedded Systems

Deep neural networks (DNNs) have been quite successful in solving many c...
research
06/05/2017

DeepIoT: Compressing Deep Neural Network Structures for Sensing Systems with a Compressor-Critic Framework

Recent advances in deep learning motivate the use of deep neutral networ...
research
02/05/2019

Hierarchical Matrix Operations on GPUs: Matrix-Vector Multiplication and Compression

Hierarchical matrices are space and time efficient representations of de...
research
03/29/2021

Deep Compression for PyTorch Model Deployment on Microcontrollers

Neural network deployment on low-cost embedded systems, hence on microco...
research
02/20/2020

Compressed Data Structures for Binary Relations in Practice

Binary relations are commonly used in Computer Science for modeling data...
research
09/28/2018

Throughput Optimizations for FPGA-based Deep Neural Network Inference

Deep neural networks are an extremely successful and widely used techniq...
research
12/02/2021

Putting 3D Spatially Sparse Networks on a Diet

3D neural networks have become prevalent for many 3D vision tasks includ...

Please sign up or login with your details

Forgot password? Click here to reset