Learning to infer: RL-based search for DNN primitive selection on Heterogeneous Embedded Systems

11/18/2018
by Miguel de Prado, et al.

Deep Learning is increasingly being adopted by industry for computer vision applications running on embedded devices. While the accuracy of Convolutional Neural Networks (CNNs) has reached a mature and remarkable state, inference latency and throughput remain major concerns, especially when targeting low-cost and low-power embedded platforms. CNN inference latency may become a bottleneck for industrial adoption of Deep Learning, as it is a crucial specification for many real-time processes. Furthermore, deploying CNNs across heterogeneous platforms presents major compatibility issues due to vendor-specific technology and acceleration libraries. In this work, we present QS-DNN, a fully automatic search based on Reinforcement Learning which, combined with an inference engine optimizer, efficiently explores the design space and empirically finds the optimal combinations of libraries and primitives to speed up the inference of CNNs on heterogeneous embedded devices. We show that an optimized combination can achieve a 45x speedup in inference latency on CPU compared to a dependency-free baseline, and a 2x speedup on average on GPGPU compared to the best vendor library. Further, we demonstrate that the quality of results and the time to solution are much better than with Random Search, with up to 15x better results for a short-time search.
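
To make the idea of searching over per-layer primitive choices concrete, the following is a minimal sketch and not the authors' QS-DNN implementation. It assumes tabular Q-learning as the Reinforcement Learning method (the abstract does not specify the algorithm), and every name and number in it is hypothetical: the primitive names in `PRIMITIVES`, the latency table `LATENCY`, and the `SWITCH_PENALTY` for changing library between consecutive layers. In a real setting the latencies would be measured on the target device rather than read from a table.

```python
import itertools
import random

# Hypothetical primitives and per-layer latencies in ms (illustrative numbers only).
PRIMITIVES = ["lib_a", "lib_b", "lib_c"]
LATENCY = [  # LATENCY[layer][primitive]
    [5.0, 3.0, 4.0],
    [6.0, 7.0, 2.0],
    [4.0, 1.5, 3.5],
    [3.0, 2.5, 2.0],
]
SWITCH_PENALTY = 1.0  # assumed cost of converting data between libraries


def total_latency(choice):
    """Latency of a full assignment: layer costs plus a penalty per library switch."""
    cost = sum(LATENCY[i][p] for i, p in enumerate(choice))
    cost += SWITCH_PENALTY * sum(a != b for a, b in zip(choice, choice[1:]))
    return cost


def brute_force():
    """Exhaustive reference, feasible only because this toy space is tiny."""
    space = itertools.product(range(len(PRIMITIVES)), repeat=len(LATENCY))
    return min(total_latency(c) for c in space)


def q_search(episodes=3000, alpha=0.1, gamma=1.0, seed=0):
    """Tabular Q-learning; returns the best assignment seen and its latency."""
    rng = random.Random(seed)
    n_layers, n_prims = len(LATENCY), len(PRIMITIVES)
    q = {}  # state = (layer index, previous primitive); -1 means "no previous layer"
    best_choice, best_cost = None, float("inf")
    for ep in range(episodes):
        eps = max(0.1, 1.0 - ep / (0.5 * episodes))  # linearly decaying exploration
        prev, choice = -1, []
        for layer in range(n_layers):
            values = q.setdefault((layer, prev), [0.0] * n_prims)
            if rng.random() < eps:
                action = rng.randrange(n_prims)  # explore
            else:
                action = max(range(n_prims), key=lambda a: values[a])  # exploit
            # Reward is negative latency, so maximizing return minimizes latency.
            reward = -LATENCY[layer][action]
            if prev not in (-1, action):
                reward -= SWITCH_PENALTY
            if layer + 1 < n_layers:
                nxt = q.setdefault((layer + 1, action), [0.0] * n_prims)
                target = reward + gamma * max(nxt)
            else:
                target = reward
            values[action] += alpha * (target - values[action])
            prev = action
            choice.append(action)
        cost = total_latency(choice)
        if cost < best_cost:
            best_choice, best_cost = tuple(choice), cost
    return best_choice, best_cost


if __name__ == "__main__":
    choice, cost = q_search()
    print([PRIMITIVES[p] for p in choice], cost, "exhaustive optimum:", brute_force())
```

The state here includes the previous layer's primitive so that the assumed switching cost can be learned; without such a cost, picking the fastest primitive per layer independently would already be optimal and no search would be needed.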

