QAPPA: Quantization-Aware Power, Performance, and Area Modeling of DNN Accelerators

05/17/2022
by Ahmet Inci, et al.

As the machine learning and systems communities strive for higher energy efficiency through custom DNN accelerators and model compression techniques, there is a need for a design space exploration framework that incorporates quantization-aware processing elements into the accelerator design space while providing accurate and fast power, performance, and area models. In this work, we present QAPPA, a highly parameterized quantization-aware power, performance, and area modeling framework for DNN accelerators. Our framework can facilitate future research on design space exploration of DNN accelerators across design choices such as bit precision, processing element type, scratchpad sizes of processing elements, global buffer size, device bandwidth, total number of processing elements in the design, and DNN workloads. Our results show that different bit precisions and processing element types lead to significant differences in performance per area and energy. Specifically, our proposed lightweight processing elements achieve up to 4.9x improvement in performance per area and energy compared to an INT16-based implementation.
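To illustrate the kind of design space the abstract describes, the sketch below enumerates accelerator configurations over a few of the listed parameters. The parameter names and value ranges here are hypothetical placeholders for illustration only, not QAPPA's actual interface:

```python
from itertools import product

# Hypothetical design-space axes mirroring the parameters listed in the
# abstract; the names and values are illustrative, not QAPPA's real API.
design_space = {
    "bit_precision": ["INT4", "INT8", "INT16"],
    "pe_type": ["lightweight", "standard"],
    "scratchpad_kb": [2, 4, 8],
    "global_buffer_kb": [128, 256],
    "num_pes": [64, 256],
}

def enumerate_configs(space):
    """Yield every point in the design space as a parameter -> value dict."""
    keys = list(space)
    for values in product(*(space[k] for k in keys)):
        yield dict(zip(keys, values))

configs = list(enumerate_configs(design_space))
print(len(configs))  # 3 * 2 * 3 * 2 * 2 = 72 candidate designs
```

A framework like QAPPA would then evaluate each such configuration with its power, performance, and area models and report metrics such as performance per area for every design point.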


Related research:
- QUIDAM: A Framework for Quantization-Aware DNN Accelerator and Model Co-Exploration (06/30/2022)
- Survey and Benchmarking of Precision-Scalable MAC Arrays for Embedded DNN Processing (08/10/2021)
- An Algorithm-Hardware Co-design Framework to Overcome Imperfections of Mixed-signal DNN Accelerators (08/29/2022)
- Power-Based Attacks on Spatial DNN Accelerators (08/28/2021)
- ALEGO: Towards Cost-Aware Architecture and Integration Co-Design for Chiplet-based Spatial Accelerators (02/22/2023)
- ANT: Exploiting Adaptive Numerical Data Type for Low-bit Deep Neural Network Quantization (08/30/2022)
- BARVINN: Arbitrary Precision DNN Accelerator Controlled by a RISC-V CPU (12/31/2022)
