DNNShifter: An Efficient DNN Pruning System for Edge Computing

09/13/2023
by Bailey J. Eccles, et al.

Deep neural networks (DNNs) underpin many machine learning applications. Production-quality DNN models achieve high inference accuracy by training millions of DNN parameters, which carries a significant resource footprint. This presents a challenge for resources operating at the extreme edge of the network, such as mobile and embedded devices with limited computational and memory resources. To address this, models are pruned to create lightweight variants better suited to these devices. Existing pruning methods cannot deliver models of similar quality to their unpruned counterparts without significant time costs and overheads, or are limited to offline use cases. Our work rapidly derives suitable model variants while maintaining the accuracy of the original model. The model variants can be swapped quickly as system and network conditions change to match workload demand. This paper presents DNNShifter, an end-to-end DNN training, spatial pruning, and model switching system that addresses these challenges. At the heart of DNNShifter is a novel methodology that prunes sparse models using structured pruning. The pruned model variants generated by DNNShifter are smaller in size and thus faster than their dense and sparse predecessors, making them suitable for inference at the edge while retaining accuracy close to that of the original dense model. DNNShifter generates a portfolio of model variants that can be swiftly interchanged depending on operational conditions. DNNShifter produces pruned model variants up to 93x faster than conventional training methods. Compared to sparse models, the pruned model variants are up to 5.14x smaller and achieve a 1.67x inference latency speedup, with no compromise to sparse model accuracy. In addition, DNNShifter has up to 11.9x lower overhead for switching models and up to 3.8x lower memory utilisation than existing approaches.
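The core idea of pruning a sparse model with structured pruning can be illustrated as removing whole convolution filters that unstructured sparsification has already zeroed out, leaving a smaller dense weight tensor that runs faster without any further accuracy loss. The following is a minimal NumPy sketch of that step, not DNNShifter's actual implementation; the function name, tensor shapes, and the zeroing pattern are illustrative assumptions.

```python
import numpy as np

def prune_zero_filters(weights):
    """Drop convolution filters that sparsification zeroed out entirely.

    weights: array of shape (out_channels, in_channels, kH, kW).
    Returns the smaller dense weight tensor and the kept filter indices,
    so the next layer's input channels can be pruned to match.
    """
    # A filter survives if any of its weights is nonzero.
    keep = np.abs(weights).sum(axis=(1, 2, 3)) > 0
    return weights[keep], np.flatnonzero(keep)

# Toy sparse layer: 8 filters of shape (3, 3, 3); simulate sparse
# training having zeroed the last 4 filters.
rng = np.random.default_rng(0)
w = rng.standard_normal((8, 3, 3, 3))
w[4:] = 0.0

pruned, kept = prune_zero_filters(w)
print(pruned.shape)  # (4, 3, 3, 3)
print(kept)          # [0 1 2 3]
```

Because the surviving tensor is dense and smaller, it benefits standard inference runtimes directly, unlike an unstructured-sparse tensor of the original shape, which typically needs specialised sparse kernels to see any speedup.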

Related research

03/08/2019: Improving Device-Edge Cooperative Inference of Deep Learning via 2-Step Pruning
Deep neural networks (DNNs) are state-of-the-art solutions for many mach...

09/23/2020: Procrustes: a Dataflow and Accelerator for Sparse Deep Neural Network Training
The success of DNN pruning has led to the development of energy-efficien...

09/06/2019: PCONV: The Missing but Desirable Sparsity in DNN Weight Pruning for Real-time Execution on Mobile Devices
Model compression techniques on Deep Neural Network (DNN) have been wide...

06/23/2021: AC/DC: Alternating Compressed/DeCompressed Training of Deep Neural Networks
The increasing computational requirements of deep neural networks (DNNs)...

06/12/2020: Dynamic Model Pruning with Feedback
Deep neural networks often have millions of parameters. This can hinder ...

06/29/2021: NEUKONFIG: Reducing Edge Service Downtime When Repartitioning DNNs
Deep Neural Networks (DNNs) may be partitioned across the edge and the c...

03/13/2020: Edge-Tailored Perception: Fast Inferencing in-the-Edge with Efficient Model Distribution
The rise of deep neural networks (DNNs) is inspiring new studies in myri...
