AutoScale: Optimizing Energy Efficiency of End-to-End Edge Inference under Stochastic Variance

05/06/2020
by Young Geun Kim, et al.

Deep learning inference is increasingly run at the edge. As programming and system stack support matures, acceleration opportunities open up within a mobile system, whose performance envelope is scaled up by a plethora of programmable co-processors. Intelligent services designed for mobile users can thus choose to run inference on the CPU or any of the co-processors of the mobile system, or to exploit connected systems, such as the cloud or a nearby, locally connected device. By doing so, the services can scale out performance and increase the energy efficiency of edge mobile systems. This gives rise to a new challenge: deciding when inference should run where. Such an execution scaling decision becomes more complicated given the stochastic nature of mobile-cloud execution, where signal strength variations of the wireless networks and resource interference can significantly affect real-time inference performance and system energy efficiency. To enable accurate, energy-efficient deep learning inference at the edge, this paper proposes AutoScale, an adaptive, lightweight execution scaling engine built upon a custom-designed reinforcement learning algorithm. It continuously learns and selects the most energy-efficient inference execution target, taking into account the characteristics of the neural networks and of the available systems in the collaborative cloud-edge execution environment, while adapting to stochastic runtime variance. Real-system implementation and evaluation under realistic execution scenarios demonstrate an average energy efficiency improvement of 9.8x and 1.6x for DNN edge inference over the baseline mobile CPU and cloud offloading, respectively, while meeting real-time performance and accuracy requirements.
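The abstract describes a learning loop that continuously picks an execution target (local processor vs. cloud) to minimize energy while adapting to runtime variance. The sketch below is only an illustration of that idea, not AutoScale's actual custom RL algorithm: an epsilon-greedy bandit over a hypothetical target set, using a constant-step-size moving average so its energy estimates keep tracking nonstationary conditions such as fluctuating signal strength. All names (`ExecutionScaler`, `TARGETS`, the simulated energy costs) are assumptions for the example.

```python
import random

# Assumed set of inference execution targets; the real system would enumerate
# the CPU, on-device co-processors, and connected systems such as the cloud.
TARGETS = ["mobile_cpu", "mobile_gpu", "cloud"]

class ExecutionScaler:
    """Epsilon-greedy selection of the lowest-energy target (illustrative only)."""

    def __init__(self, epsilon=0.1, step=0.2):
        self.epsilon = epsilon                          # exploration rate
        self.step = step                                # constant step size -> adapts to drift
        self.energy_est = {t: 0.0 for t in TARGETS}     # running energy estimates (J)
        self.seen = {t: False for t in TARGETS}

    def select(self):
        # Try every target once, explore occasionally, otherwise exploit
        # the target with the lowest estimated energy per inference.
        untried = [t for t in TARGETS if not self.seen[t]]
        if untried:
            return untried[0]
        if random.random() < self.epsilon:
            return random.choice(TARGETS)
        return min(TARGETS, key=lambda t: self.energy_est[t])

    def update(self, target, measured_energy):
        # An exponential moving average (constant step size) keeps the
        # estimate responsive to stochastic variance, e.g. a cloud target
        # whose cost spikes when wireless signal strength degrades.
        if not self.seen[target]:
            self.energy_est[target] = measured_energy
            self.seen[target] = True
        else:
            self.energy_est[target] += self.step * (measured_energy - self.energy_est[target])
```

As a usage sketch, each inference would call `select()`, run on the returned target, measure the energy spent, and feed it back via `update()`; over time the scaler concentrates on the cheapest target while still probing the others for condition changes.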


