AdaSpring: Context-adaptive and Runtime-evolutionary Deep Model Compression for Mobile Applications

01/28/2021
by   Sicong Liu, et al.

Many deep learning (DNN) powered mobile and wearable applications today continuously and unobtrusively sense the ambient surroundings to enhance all aspects of human life. To enable robust and private mobile sensing, DNNs tend to be deployed locally on resource-constrained mobile devices via model compression. Current practice, whether hand-crafted DNN compression techniques (i.e., optimizing DNN-related metrics such as parameter size) or on-demand DNN compression methods (i.e., optimizing hardware-dependent metrics such as latency), cannot run locally online because it requires offline retraining to ensure accuracy. Moreover, none of these efforts consider runtime adaptive compression to handle the dynamic nature of the deployment context of mobile applications. To address these challenges, we present AdaSpring, a context-adaptive and self-evolutionary DNN compression framework that enables runtime adaptive DNN compression locally online. Specifically, it presents ensemble training of a retraining-free and self-evolutionary network that integrates multiple alternative DNN compression configurations (i.e., compressed architectures and weights). It then introduces a runtime search strategy to quickly find the most suitable compression configuration and evolve the corresponding weights. Evaluated on five tasks across three platforms and in a real-world case study, AdaSpring achieves up to 3.1x latency reduction and 4.2x energy efficiency improvement over hand-crafted compression techniques, while incurring <= 6.2 ms of runtime-evolution latency.
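To make the runtime search idea concrete, below is a minimal sketch, not the authors' implementation, of selecting among pre-trained, retraining-free compression variants under a latency budget measured online. All names here (CompressionVariant, runtime_search, measure_latency_ms, measure_energy_mj) are hypothetical illustrations, not AdaSpring's actual API.

```python
# Hypothetical sketch of a runtime search over pre-trained compression variants:
# pick the variant that best trades an accuracy proxy against measured
# latency/energy on the current device, without any on-device retraining.
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class CompressionVariant:
    name: str              # e.g., "channel-prune-0.5+depthwise"
    accuracy_proxy: float  # validation accuracy recorded offline at training time
    params_mb: float       # parameter size of this compressed variant


def runtime_search(
    variants: List[CompressionVariant],
    measure_latency_ms: Callable[[CompressionVariant], float],
    latency_budget_ms: float,
    energy_weight: float = 0.0,
    measure_energy_mj: Callable[[CompressionVariant], float] = lambda v: 0.0,
) -> Optional[CompressionVariant]:
    """Return the best-scoring variant that fits the current latency budget."""
    best, best_score = None, float("-inf")
    for v in variants:
        latency = measure_latency_ms(v)  # hardware-dependent, measured at runtime
        if latency > latency_budget_ms:
            continue  # violates the current deployment context
        score = v.accuracy_proxy - energy_weight * measure_energy_mj(v)
        if score > best_score:
            best, best_score = v, score
    # Fall back to the smallest variant if no candidate meets the budget.
    return best or min(variants, key=lambda v: v.params_mb)
```

A caller would re-run this search whenever the deployment context changes (e.g., a tighter latency budget on low battery) and then swap in the selected variant's weights.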

