Dynamic Resource Partitioning for Multi-Tenant Systolic Array Based DNN Accelerator

02/21/2023
by Midia Reshadi et al.

Deep neural networks (DNNs) have become a significant workload on both cloud servers and edge devices, and the growing number of DNNs deployed on these platforms raises the need to execute multiple networks on the same device. This paper proposes a dynamic partitioning algorithm for concurrent processing of multiple DNNs on a systolic-array-based accelerator. Sharing an accelerator's storage and processing resources across multiple DNNs increases resource utilization and reduces computation time and energy consumption. To this end, we propose a partitioned weight-stationary dataflow that requires only a minor modification to the processing-element logic. We evaluate energy consumption and computation time under both heavy and light workloads. Simulation results show improvements of 35% and 44% compared with single tenancy.
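The abstract does not detail the partitioning algorithm or the partitioned weight-stationary dataflow, so the sketch below is only a rough illustration of the general idea: a scheduler splits a systolic array's PE columns across co-resident DNN tenants in proportion to an estimated workload. The function name partition_columns, the tenant names, the MAC-count workload metric, and the minimum-column guarantee are illustrative assumptions, not the authors' method.

from math import floor

def partition_columns(total_cols, workloads, min_cols=1):
    """Split total_cols PE columns across tenants in proportion to their
    workload estimate (e.g., MAC count), giving every tenant at least
    min_cols columns. Returns a dict {tenant: column_count}."""
    total_work = sum(workloads.values())
    # Initial proportional share, floored so the sum does not exceed total_cols.
    alloc = {t: max(min_cols, floor(total_cols * w / total_work))
             for t, w in workloads.items()}
    # Hand any leftover columns to the tenants with the most work per column.
    leftover = total_cols - sum(alloc.values())
    neediest = sorted(workloads, key=lambda t: workloads[t] / alloc[t], reverse=True)
    for t in neediest[:max(leftover, 0)]:
        alloc[t] += 1
    return alloc

# Example: a heavy and a light DNN sharing a 128-column array.
print(partition_columns(128, {"resnet50": 4.1e9, "mobilenet_v2": 3.0e8}))

In a dynamic setting, such a partition would be recomputed whenever a tenant DNN arrives or completes, so the array is never left underutilized by a departed workload.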


Related research

- Enabling Deep Learning on Edge Devices (10/06/2022)
  Deep neural networks (DNNs) have succeeded in many different perception ...
- A Framework for Energy-aware Evaluation of Distributed Data Processing Platforms in Edge-Cloud Environment (01/06/2022)
  Distributed data processing platforms (e.g., Hadoop, Spark, and Flink) a...
- Enabling Incremental Knowledge Transfer for Object Detection at the Edge (04/13/2020)
  Object detection using deep neural networks (DNNs) involves a huge amoun...
- LOCAL: Low-Complex Mapping Algorithm for Spatial DNN Accelerators (11/07/2022)
  Deep neural networks are a promising solution for applications that solv...
- Mandheling: Mixed-Precision On-Device DNN Training with DSP Offloading (06/15/2022)
  This paper proposes Mandheling, the first system that enables highly res...
- Joint Protection Scheme for Deep Neural Network Hardware Accelerators and Models (10/06/2022)
  Deep neural networks (DNNs) are utilized in numerous image processing, o...
- Edge-Host Partitioning of Deep Neural Networks with Feature Space Encoding for Resource-Constrained Internet-of-Things Platforms (02/11/2018)
  This paper introduces partitioning an inference task of a deep neural ne...
