Edge-Tailored Perception: Fast Inferencing in-the-Edge with Efficient Model Distribution

03/13/2020
by Ramyad Hadidi, et al.

The rise of deep neural networks (DNNs) is inspiring new studies in a myriad of edge use cases with robots, autonomous agents, and Internet-of-Things (IoT) devices. However, in-the-edge inferencing of DNNs remains a severe challenge, mainly because the inherently intensive resource requirements of DNNs conflict with the tight resource availability in many edge domains. Furthermore, because communication is costly, naively offloading work to other available edge devices is not an effective solution in these domains. Therefore, to benefit from the available compute resources with low communication overhead, we propose new edge-tailored perception (ETP) models that consist of several almost-independent, narrow branches. ETP models offer close-to-minimum communication overhead and better distribution opportunities, while significantly reducing memory and computation footprints, all with a negligible accuracy loss for tasks that are not accuracy-critical. To demonstrate the benefits, we deploy ETP models on two real systems, Raspberry Pis and edge-level PYNQ FPGAs. Additionally, we share our insights about tailoring a systolic-based architecture for edge computing with FPGA implementations. ETP models created from LeNet, CifarNet, VGG-S/16, AlexNet, and ResNets, and trained on MNIST, CIFAR10/100, Flower102, and ImageNet, achieve maximum and average speedups of 56x and 7x, respectively, compared to the originals. ETP complements existing single-device optimizations for embedded devices by enabling the exploitation of multiple devices. As an example, we show that applying pruning and quantization to ETP models improves the average speedup to 33x.
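To make the branching idea concrete, the sketch below shows one way a model built from almost-independent, narrow branches could be structured in PyTorch. It is a minimal illustration, not the paper's actual ETP architecture: the four-branch split, layer widths, and 64-dimensional branch outputs are assumptions chosen for brevity. The key property it demonstrates is that each branch is self-contained end to end, so it could be placed on a separate edge device and only its small output vector would need to be communicated for the final merge.

```python
# Minimal sketch (illustrative, not the authors' implementation) of a DNN
# split into almost-independent narrow branches. Each branch could run on a
# different edge device; only the small per-branch output vectors are merged
# at the final classifier, keeping communication close to minimal.
import torch
import torch.nn as nn


class NarrowBranch(nn.Module):
    """One almost-independent branch: a thin convolutional stack."""

    def __init__(self, in_channels=3, width=16, out_features=64):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(in_channels, width, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(2),
            nn.Conv2d(width, width, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.AdaptiveAvgPool2d(1),
        )
        self.proj = nn.Linear(width, out_features)

    def forward(self, x):
        h = self.features(x).flatten(1)
        return self.proj(h)  # small vector: cheap to send between devices


class BranchedModel(nn.Module):
    """Branches run independently (possibly on different devices); their
    outputs are merged only once, at the final classifier."""

    def __init__(self, num_branches=4, num_classes=10, branch_features=64):
        super().__init__()
        self.branches = nn.ModuleList(
            [NarrowBranch(out_features=branch_features) for _ in range(num_branches)]
        )
        self.classifier = nn.Linear(num_branches * branch_features, num_classes)

    def forward(self, x):
        # In a real distributed deployment each branch's forward pass would
        # execute on its own device; here they simply run sequentially.
        outs = [branch(x) for branch in self.branches]
        return self.classifier(torch.cat(outs, dim=1))


if __name__ == "__main__":
    model = BranchedModel()
    logits = model(torch.randn(1, 3, 32, 32))  # e.g., a CIFAR-10-sized input
    print(logits.shape)  # torch.Size([1, 10])
```

Because no activations flow between branches until the final merge, single-device optimizations such as pruning and quantization can additionally be applied to each branch on its own, which is consistent with the combined speedups reported above.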


