DynaMIX: Resource Optimization for DNN-Based Real-Time Applications on a Multi-Tasking System

02/03/2023
by Minkyoung Cho, et al.

As deep neural networks (DNNs) prove their importance and feasibility, more and more DNN-based apps, such as object detection and classification, have been developed and deployed on autonomous vehicles (AVs). To meet growing expectations and requirements, AVs should optimize the use of their limited onboard computing resources across multiple concurrent in-vehicle apps while satisfying the apps' timing requirements (especially for safety). That is, real-time AV apps should share the limited onboard resources with other concurrent apps without missing their deadlines, which are dictated by the frame rate of the camera that generates and provides input images to the apps. However, most, if not all, existing DNN solutions focus on enhancing the concurrency of their specific hardware and do not dynamically optimize or modify the DNN apps' resource requirements according to the number of running apps, because such adaptation is computationally expensive. To mitigate this limitation, we propose DynaMIX (Dynamic MIXed-precision model construction), which optimizes the resource requirements of concurrent apps and aims to maximize execution accuracy. To realize real-time resource optimization, we formulate an optimization problem that uses app performance profiles to account for both the accuracy and the worst-case latency of each app. We also propose dynamic model reconfiguration, which lazily loads only the selected layers at runtime to avoid the overhead of loading the entire model. DynaMIX is evaluated in terms of constraint satisfaction and inference accuracy on a multi-tasking system and compared against state-of-the-art solutions, demonstrating its effectiveness and feasibility under various environmental and operating conditions.
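To make the profile-based formulation more concrete, here is a minimal sketch of one way such a selection problem could look. It is not the authors' formulation: it assumes that each app has a small set of profiled configurations, that the concurrent apps run back to back so their worst-case latencies add up, and that the shared deadline is one camera frame period. The names `Config`, `select_configs`, and `PROFILES`, and all accuracy and latency values, are hypothetical.

```python
from itertools import product
from typing import NamedTuple, Optional, Sequence, Tuple


class Config(NamedTuple):
    """One profiled variant of an app's model (hypothetical profile entry)."""
    name: str
    accuracy: float   # profiled accuracy, in [0, 1]
    wcet_ms: float    # profiled worst-case latency in milliseconds


def select_configs(
    profiles: Sequence[Sequence[Config]], budget_ms: float
) -> Optional[Tuple[Config, ...]]:
    """Pick one configuration per app to maximize summed accuracy.

    Assumption: apps execute sequentially, so the summed worst-case
    latency must fit within budget_ms. Exhaustive search is used here
    only because the illustrative profile tables are tiny.
    """
    best: Optional[Tuple[Config, ...]] = None
    best_accuracy = float("-inf")
    for combo in product(*profiles):
        if sum(c.wcet_ms for c in combo) > budget_ms:
            continue  # this combination would miss the deadline
        total_accuracy = sum(c.accuracy for c in combo)
        if total_accuracy > best_accuracy:
            best, best_accuracy = combo, total_accuracy
    return best  # None if no combination meets the deadline


# Hypothetical profiles for two concurrent apps (values are made up).
PROFILES = [
    [Config("fp16", 0.90, 20.0), Config("mixed", 0.88, 14.0), Config("int8", 0.84, 9.0)],
    [Config("fp16", 0.95, 12.0), Config("mixed", 0.94, 8.0), Config("int8", 0.91, 5.0)],
]

if __name__ == "__main__":
    frame_period_ms = 1000.0 / 30.0  # deadline implied by a 30 fps camera
    choice = select_configs(PROFILES, frame_period_ms)
    print([c.name for c in choice] if choice else "no feasible selection")
```

Under these assumptions, tightening the budget (a higher frame rate or more running apps) pushes the selection toward lower-precision configurations, which is the trade-off between accuracy and worst-case latency that the abstract describes. A real system would need a faster solver than exhaustive search and a latency model that reflects how the apps actually share the hardware.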


