Optimal Model Placement and Online Model Splitting for Device-Edge Co-Inference

05/28/2021
by Jia Yan, et al.

Device-edge co-inference opens up new possibilities for resource-constrained wireless devices (WDs) to execute deep neural network (DNN)-based applications with heavy computation workloads. In particular, the WD executes the first few layers of the DNN and sends the intermediate features to the edge server, which processes the remaining layers. Adapting the model splitting decision trades off local computation cost against communication overhead. In practice, the DNN model is re-trained and updated periodically at the edge server. Once the DNN parameters are regenerated, part of the updated model must be placed at the WD to facilitate on-device inference. In this paper, we study the joint optimization of the model placement and online model splitting decisions to minimize the energy-and-time cost of device-edge co-inference in the presence of wireless channel fading. The problem is challenging because the model placement and model splitting decisions are strongly coupled yet operate on two different time scales. We first tackle online model splitting by formulating an optimal stopping problem, whose finite horizon is determined by the model placement decision. In addition to deriving the optimal model splitting rule via backward induction, we investigate a simple one-stage look-ahead rule, for which analytical expressions of the model splitting decision can be obtained. This analysis in turn allows us to efficiently optimize the model placement decision on the larger time scale. In particular, we obtain a closed-form model placement solution for the fully-connected multilayer perceptron with equal numbers of neurons per layer. Simulation results validate the superior performance of the joint optimal model placement and splitting across various DNN structures.
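The optimal stopping structure described in the abstract lends itself to a compact implementation. Below is a minimal sketch of the backward-induction splitting rule under illustrative assumptions: the per-layer compute costs, intermediate feature sizes, communication cost model, and the i.i.d. discretized channel-gain distribution are all hypothetical placeholders, not quantities from the paper. The sketch precomputes, for every on-device split point and channel state, whether to offload now or run one more local layer.

```python
import numpy as np

# Hypothetical setup: N on-device layers (fixed by the model placement
# decision), per-layer energy-and-time compute costs, and intermediate
# feature sizes that shrink with depth. All numbers are placeholders.
rng = np.random.default_rng(0)
N = 6
compute_cost = rng.uniform(0.5, 2.0, N)                  # cost of running each local layer
feature_size = np.sort(rng.uniform(1.0, 4.0, N))[::-1]   # feature size after each layer

gains = np.linspace(0.2, 3.0, 30)                        # discretized channel-gain states
P = np.full((len(gains), len(gains)), 1.0 / len(gains))  # i.i.d. fading: uniform transitions

def tx_cost(k, h):
    """Illustrative communication cost of offloading after layer k at gain h."""
    return feature_size[k] / h                            # e.g. time ~ bits / rate, rate ~ h

# Backward induction over split points k = N-1, ..., 0.
# V[k, i] is the optimal cost-to-go when layer k has just finished
# and the channel-gain index is i.
V = np.zeros((N, len(gains)))
stop = np.zeros((N, len(gains)), dtype=bool)

V[N - 1] = tx_cost(N - 1, gains)                          # after the last local layer,
stop[N - 1] = True                                        # transmission is forced
for k in range(N - 2, -1, -1):
    stop_now = tx_cost(k, gains)                          # offload at this split point
    cont = compute_cost[k + 1] + P @ V[k + 1]             # run one more layer, then act optimally
    stop[k] = stop_now <= cont
    V[k] = np.where(stop[k], stop_now, cont)

# Online use: after each local layer, observe the channel and consult the rule.
for k in range(N):
    i = rng.integers(len(gains))                          # observed gain state (simulated)
    if stop[k, i]:
        print(f"split after layer {k} (gain {gains[i]:.2f})")
        break
```

The one-stage look-ahead rule studied in the paper corresponds to replacing the full cost-to-go `V[k + 1]` with the cost of stopping at the very next stage, which is what makes closed-form threshold expressions tractable; the horizon N here stands in for the number of layers placed on the WD by the larger-time-scale placement decision.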

Related research:

- BottleNet++: An End-to-End Approach for Feature Compression in Device-Edge Co-Inference Systems (10/31/2019). The emergence of various intelligent mobile applications demands the dep...
- Modeling of Deep Neural Network (DNN) Placement and Inference in Edge Computing (01/19/2020). With the edge computing becoming an increasingly adopted concept in syst...
- Calibration-Aided Edge Inference Offloading via Adaptive Model Partitioning of Deep Neural Networks (10/30/2020). Mobile devices can offload deep neural network (DNN)-based inference to ...
- Resilient Edge Service Placement and Workload Allocation under Uncertainty (07/10/2021). In this paper, we study an optimal service placement and workload alloca...
- Unsupervised Information Obfuscation for Split Inference of Neural Networks (04/23/2021). Splitting network computations between the edge device and a server enab...
- A Stochastic LQR Model for Child Order Placement in Algorithmic Trading (04/28/2020). Modern Algorithmic Trading ("Algo") allows institutional investors and t...
