Resource Allocation for Multiuser Edge Inference with Batching and Early Exiting (Extended Version)

04/11/2022
by Zhiyan Liu, et al.

The deployment of inference services at the network edge, called edge inference, offloads computation-intensive inference tasks from mobile devices to edge servers, thereby enhancing the devices' capabilities and battery life. In a multiuser system, the joint allocation of communication-and-computation (C^2) resources (i.e., scheduling and bandwidth allocation) is made challenging by the adoption of two efficient inference techniques, batching and early exiting, and is further complicated by the heterogeneity of users' accuracy and latency requirements. Batching groups multiple tasks into one batch for parallel processing, which reduces time-consuming memory access and thereby boosts the throughput (i.e., the number of completed tasks per second). Early exiting, on the other hand, allows a task to exit from a deep neural network without traversing the whole network, supporting a tradeoff between accuracy and latency. In this work, we study optimal C^2 resource allocation with batching and early exiting, which is an NP-complete integer program. To tackle this challenge, a set of efficient algorithms is designed under the criterion of maximum throughput. Experimental results demonstrate that both optimal and sub-optimal C^2 resource allocation algorithms can leverage integrated batching and early exiting to achieve a 200% throughput gain over conventional schemes.
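
To make the interplay of batching and early exiting concrete, the following is a minimal Python sketch under a toy cost model that is not taken from the paper: each network block is assumed to cost a fixed memory-access overhead plus a per-task compute time, and a task that exits early stops contributing to later blocks. The function names, the parameters `overhead_s` and `per_task_s`, and the numbers are all hypothetical and chosen only for illustration.

```python
from typing import List


def block_latency(active_tasks: int, overhead_s: float, per_task_s: float) -> float:
    """Assumed latency of one block: fixed overhead plus per-task compute time."""
    return overhead_s + per_task_s * active_tasks


def batch_time(exit_blocks: List[int], overhead_s: float, per_task_s: float) -> float:
    """Time to serve one batch, where task i exits after exit_blocks[i] blocks."""
    total = 0.0
    for block in range(1, max(exit_blocks) + 1):
        # Only tasks that have not yet exited are processed in this block.
        active = sum(1 for e in exit_blocks if e >= block)
        total += block_latency(active, overhead_s, per_task_s)
    return total


def throughput(exit_blocks: List[int], overhead_s: float, per_task_s: float) -> float:
    """Completed tasks per second for a single batch."""
    return len(exit_blocks) / batch_time(exit_blocks, overhead_s, per_task_s)


if __name__ == "__main__":
    overhead_s, per_task_s = 0.010, 0.002  # hypothetical costs in seconds
    print("single task, full network:", throughput([4], overhead_s, per_task_s))
    print("batch of 4, full network: ", throughput([4, 4, 4, 4], overhead_s, per_task_s))
    print("batch of 4, two early exits:", throughput([2, 2, 4, 4], overhead_s, per_task_s))
```

In this toy model, batching raises throughput because the fixed overhead is shared by all tasks in the batch, and early exiting raises it further because later blocks process fewer tasks. The accuracy cost of exiting early, as well as the scheduling and bandwidth constraints studied in the paper, are deliberately left out.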
