Communication-Computation Efficient Device-Edge Co-Inference via AutoML

08/30/2021
by   Xinjie Zhang, et al.

Device-edge co-inference, which partitions a deep neural network (DNN) between a resource-constrained mobile device and an edge server, has recently emerged as a promising paradigm for supporting intelligent mobile applications. To accelerate the inference process, on-device model sparsification and intermediate feature compression are regarded as two prominent techniques. However, the on-device model sparsity level and the intermediate feature compression ratio directly determine the computation workload and the communication overhead, respectively, and both affect the inference accuracy, so finding the optimal values of these hyper-parameters poses a major challenge due to the large search space. In this paper, we endeavor to develop an efficient algorithm to determine these hyper-parameters. By selecting a suitable model split point and a pair of encoder/decoder for the intermediate feature vector, this problem is cast as a sequential decision problem, for which we propose a novel automated machine learning (AutoML) framework based on deep reinforcement learning (DRL). Experimental results on an image classification task demonstrate that the proposed framework achieves a better communication-computation trade-off and a significant inference speedup over various baseline schemes.
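To illustrate the sequential decision formulation described above, the following is a minimal tabular Q-learning sketch: an agent picks a split point, then an on-device sparsity level, then a feature compression ratio, and receives a terminal reward that trades accuracy off against computation and communication cost. All search-space values and the `reward` function are hypothetical stand-ins for profiled measurements, not the paper's actual setup (which uses a deep RL policy rather than a Q-table).

```python
import random

# Illustrative discrete search space (hypothetical values, not from the paper).
SPLIT_POINTS = [2, 4, 6, 8]           # candidate layer indices at which to split the DNN
SPARSITY_LEVELS = [0.3, 0.5, 0.7]     # fraction of on-device weights pruned
COMPRESSION_RATIOS = [4, 8, 16]       # intermediate-feature compression factor

STAGES = [SPLIT_POINTS, SPARSITY_LEVELS, COMPRESSION_RATIOS]


def reward(split, sparsity, ratio):
    """Hypothetical stand-in for measured accuracy minus computation and
    communication penalties (in practice these would come from profiling)."""
    accuracy = 0.9 - 0.05 * sparsity - 0.002 * ratio
    comp_cost = 0.01 * split * (1.0 - sparsity)  # on-device work grows with split depth
    comm_cost = 0.5 / ratio                      # uplink cost shrinks with compression
    return accuracy - comp_cost - comm_cost


def train(episodes=2000, eps=0.2, alpha=0.1, seed=0):
    """Epsilon-greedy tabular Q-learning over the three-stage decision process."""
    rng = random.Random(seed)
    q = {}  # (state, action index) -> value; state = tuple of earlier action indices
    for _ in range(episodes):
        state, trajectory, choices = (), [], []
        for options in STAGES:
            if rng.random() < eps:                      # explore
                a = rng.randrange(len(options))
            else:                                       # exploit
                a = max(range(len(options)),
                        key=lambda i: q.get((state, i), 0.0))
            trajectory.append((state, a))
            choices.append(options[a])
            state = state + (a,)
        r = reward(*choices)                            # terminal reward only
        for s, a in trajectory:                         # Monte-Carlo style update
            q[(s, a)] = q.get((s, a), 0.0) + alpha * (r - q.get((s, a), 0.0))
    # Greedy rollout of the learned policy.
    state, best = (), []
    for options in STAGES:
        a = max(range(len(options)), key=lambda i: q.get((state, i), 0.0))
        best.append(options[a])
        state = state + (a,)
    return best


print(train())  # prints the learned (split point, sparsity, ratio) triple
```

The three-stage structure mirrors the abstract's formulation: the split point is chosen first, and the sparsity and compression decisions are conditioned on it, so a single learned policy captures their joint effect on accuracy, computation, and communication.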
