Scaling Up Deep Neural Network Optimization for Edge Inference

09/01/2020
by Bingqian Lu, et al.

Deep neural networks (DNNs) are increasingly deployed on and integrated with edge devices such as mobile phones, drones, robots, and wearables. To run DNN inference directly on edge devices (a.k.a. edge inference) with satisfactory performance, optimizing the DNN design (e.g., network architecture and quantization policy) is crucial. While state-of-the-art DNN designs have leveraged performance predictors to speed up the optimization process, these predictors are device-specific (i.e., each predictor serves only one target device) and hence do not scale well to the extremely diverse set of edge devices. Moreover, even with performance predictors, the optimizer (e.g., search-based optimization) can still be time-consuming when optimizing DNNs for many different devices. In this work, we propose a new DNN optimization framework that: (1) leverages scalable performance predictors, which estimate the resulting performance (e.g., inference accuracy/latency/energy) given a DNN-device pair; and (2) uses a neural network-based automated optimizer that takes the device features and optimization parameters as input and directly outputs the optimal DNN design, without going through a lengthy search process for each individual device.
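To make the two components concrete, below is a minimal sketch, not the authors' implementation: the feature dimensions, the three-way performance output (accuracy/latency/energy), and the continuous encoding of the DNN design are all assumptions for illustration. The predictor scores a (DNN design, device) pair, while the learned optimizer maps device features plus optimization parameters directly to a design, so no per-device search loop is needed.

```python
# Hypothetical sketch of the two components described in the abstract.
import torch
import torch.nn as nn

class PerformancePredictor(nn.Module):
    """Estimates [accuracy, latency, energy] for a (DNN-design, device) pair."""
    def __init__(self, arch_dim=32, device_dim=16, hidden=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(arch_dim + device_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 3),   # assumed outputs: accuracy, latency, energy
        )

    def forward(self, design, device_features):
        return self.net(torch.cat([design, device_features], dim=-1))

class AutomatedOptimizer(nn.Module):
    """Maps device features + optimization parameters straight to a DNN design,
    replacing a per-device search."""
    def __init__(self, device_dim=16, param_dim=4, arch_dim=32, hidden=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(device_dim + param_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, arch_dim),  # continuous relaxation of the design choices
        )

    def forward(self, device_features, opt_params):
        return self.net(torch.cat([device_features, opt_params], dim=-1))

# Usage sketch: the optimizer proposes a design, the predictor scores it.
predictor = PerformancePredictor()
optimizer_net = AutomatedOptimizer()
device_feats = torch.randn(1, 16)                    # hypothetical device descriptor
opt_params = torch.tensor([[0.7, 0.2, 0.1, 50.0]])   # e.g., objective weights plus a latency budget (ms)
design = optimizer_net(device_feats, opt_params)
print(predictor(design, device_feats))               # predicted accuracy/latency/energy
```

In a setup like this, the predictor could be trained once on measurements collected from many devices and then reused for any new device descriptor, and the optimizer network could be trained against the differentiable predictor, which is what would let the approach scale across diverse edge hardware.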


Related research

Edge Intelligence: On-Demand Deep Learning Model Co-Inference with Device-Edge Synergy (06/20/2018)
As the backbone technology of machine learning, deep neural networks (DN...

PerfSAGE: Generalized Inference Performance Predictor for Arbitrary Deep Learning Models on Edge Devices (01/26/2023)
The ability to accurately predict deep neural network (DNN) inference pe...

A Note on Latency Variability of Deep Neural Networks for Mobile Inference (02/29/2020)
Running deep neural network (DNN) inference on mobile devices, i.e., mob...

Understanding and Optimizing Deep Learning Cold-Start Latency on Edge Devices (06/15/2022)
DNNs are ubiquitous on edge devices nowadays. With its increasing import...

Moses: Efficient Exploitation of Cross-device Transferable Features for Tensor Program Optimization (01/15/2022)
Achieving efficient execution of machine learning models has attracted s...

PLiNIO: A User-Friendly Library of Gradient-based Methods for Complexity-aware DNN Optimization (07/18/2023)
Accurate yet efficient Deep Neural Networks (DNNs) are in high demand, e...

5G Air-to-Ground Network Design and Optimization: A Deep Learning Approach (11/17/2020)
Direct air-to-ground (A2G) communications leveraging the fifth-generatio...
