To Compress, or Not to Compress: Characterizing Deep Learning Model Compression for Embedded Inference

10/21/2018
by   Qing Qin, et al.

The recent advances in deep neural networks (DNNs) make them attractive for embedded systems. However, it can take a long time for DNNs to make an inference on resource-constrained computing devices. Model compression techniques can address the computation issue of deep inference on embedded devices. These techniques are highly attractive, as they do not rely on specialized hardware or on computation offloading, which is often infeasible due to privacy concerns or high latency. However, it remains unclear how model compression techniques perform across a wide range of DNNs. To design efficient embedded deep learning solutions, we need to understand their behaviors. This work develops a quantitative approach to characterize model compression techniques on a representative embedded deep learning architecture, the NVIDIA Jetson TX2. We perform extensive experiments on 11 influential neural network architectures from the image classification and natural language processing domains. We experimentally show how two mainstream compression techniques, data quantization and pruning, perform on these network architectures, and examine the implications of compression for model storage size, inference time, energy consumption, and performance metrics. We demonstrate that there are opportunities to achieve fast deep inference on embedded systems, but one must carefully choose the compression settings. Our results provide insights on when and how to apply model compression techniques, along with guidelines for designing efficient embedded deep learning systems.
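The two compression techniques the abstract names, magnitude-based pruning and linear data quantization, can be illustrated with a minimal NumPy sketch. This is not the paper's implementation; the function names, the symmetric int8 scheme, and the unstructured (per-weight) pruning criterion are illustrative assumptions:

```python
import numpy as np

def magnitude_prune(weights, sparsity=0.5):
    """Unstructured pruning: zero out the smallest-magnitude weights.

    `sparsity` is the fraction of weights to remove (an assumed knob;
    real toolchains expose similar per-layer sparsity targets).
    """
    flat = np.abs(weights).ravel()
    k = int(len(flat) * sparsity)
    if k == 0:
        return weights.copy()
    # k-th smallest absolute value becomes the pruning threshold
    threshold = np.partition(flat, k - 1)[k - 1]
    return np.where(np.abs(weights) <= threshold, 0.0, weights)

def quantize_int8(weights):
    """Symmetric linear quantization of float32 weights to int8.

    One scale per tensor maps the max absolute weight to 127,
    shrinking storage by 4x relative to float32.
    """
    scale = np.abs(weights).max() / 127.0
    q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights for inference or inspection."""
    return q.astype(np.float32) * scale
```

In this sketch the quantization error per weight is bounded by half the scale, which is why accuracy often survives int8 conversion; pruning, by contrast, trades accuracy directly against the chosen sparsity level, matching the paper's point that compression settings must be chosen carefully.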


