Precision-aware Latency and Energy Balancing on Multi-Accelerator Platforms for DNN Inference

06/08/2023
by   Matteo Risso, et al.

The need to execute Deep Neural Networks (DNNs) at low latency and low power at the edge has spurred the development of new heterogeneous Systems-on-Chip (SoCs) encapsulating a diverse set of hardware accelerators. How to optimally map a DNN onto such multi-accelerator systems is an open problem. We propose ODiMO, a hardware-aware tool that performs a fine-grain mapping across the different on-chip accelerators, splitting individual layers and executing them in parallel, to reduce inference energy consumption or latency while taking into account each accelerator's quantization precision to maintain accuracy. We pursue Pareto-optimal networks in the accuracy vs. energy or latency space for three popular dataset/DNN pairs and deploy them on DIANA, a heterogeneous ultra-low-power edge AI SoC. We show that ODiMO reduces energy/latency by up to 33% with respect to manual heuristic mappings.
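To illustrate the kind of fine-grain, precision-aware layer splitting described above, the following is a minimal sketch, not ODiMO's actual implementation: it assumes two hypothetical accelerators with different quantization precisions and idealized throughputs, and a simple latency model in which a layer split across accelerators running in parallel finishes when the slower partition finishes. All names and numbers below are illustrative placeholders.

```python
# Minimal illustrative sketch (NOT the ODiMO implementation): split a Conv2d
# layer's output channels between two hypothetical accelerators that run in
# parallel, each with its own quantization precision and throughput.

from dataclasses import dataclass

@dataclass
class Accelerator:
    name: str
    precision_bits: int     # quantization precision supported by this engine
    macs_per_cycle: float   # idealized peak throughput

def layer_macs(out_ch, in_ch, k, out_h, out_w):
    """MAC count of a standard convolutional layer."""
    return out_ch * in_ch * k * k * out_h * out_w

def split_latency(total_out_ch, in_ch, k, out_h, out_w, acc_a, acc_b, ch_to_a):
    """Latency (in cycles) when ch_to_a output channels go to acc_a and the
    rest to acc_b; the two run in parallel, so latency is their maximum."""
    macs_a = layer_macs(ch_to_a, in_ch, k, out_h, out_w)
    macs_b = layer_macs(total_out_ch - ch_to_a, in_ch, k, out_h, out_w)
    return max(macs_a / acc_a.macs_per_cycle, macs_b / acc_b.macs_per_cycle)

# Hypothetical engines loosely inspired by a digital 8-bit accelerator and a
# low-precision analog in-memory accelerator; throughputs are placeholders.
digital = Accelerator("digital", precision_bits=8, macs_per_cycle=256)
analog = Accelerator("analog", precision_bits=2, macs_per_cycle=4096)

# Exhaustively pick the channel split that minimizes this layer's latency.
best = min(range(0, 64 + 1),
           key=lambda c: split_latency(64, 32, 3, 16, 16, digital, analog, c))
print("channels to digital:", best,
      "latency (cycles):", split_latency(64, 32, 3, 16, 16, digital, analog, best))
```

Note that this toy search only models latency; in the scenario the abstract describes, the low-precision partition also affects accuracy, so the per-layer split must be chosen jointly with the accuracy objective rather than by latency alone.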


