DeFiNES: Enabling Fast Exploration of the Depth-first Scheduling Space for DNN Accelerators through Analytical Modeling

12/10/2022
by   Linyan Mei, et al.
0

DNN workloads can be scheduled onto DNN accelerators in many different ways: from layer-by-layer scheduling to cross-layer depth-first scheduling (a.k.a. layer fusion, or cascaded execution). This results in a very broad scheduling space, with each schedule leading to varying hardware (HW) costs in terms of energy and latency. To rapidly explore this vast space for a wide variety of hardware architectures, analytical cost models are crucial to estimate scheduling effects on the HW level. However, state-of-the-art cost models are lacking support for exploring the complete depth-first scheduling space, for instance focusing only on activations while ignoring weights, or modeling only DRAM accesses while overlooking on-chip data movements. These limitations prevent researchers from systematically and accurately understanding the depth-first scheduling space. After formalizing this design space, this work proposes a unified modeling framework, DeFiNES, for layer-by-layer and depth-first scheduling to fill in the gaps. DeFiNES enables analytically estimating the hardware cost for possible schedules in terms of both energy and latency, while considering data access at every memory level. This is done for each schedule and HW architecture under study by optimally choosing the active part of the memory hierarchy per unique combination of operand, layer, and feature map tile. The hardware costs are estimated, taking into account both data computation and data copy phases. The analytical cost model is validated against measured data from a taped-out depth-first DNN accelerator, DepFiN, showing good modeling accuracy at the end-to-end neural network level. A comparison with generalized state-of-the-art demonstrates up to 10X better solutions found with DeFiNES.

READ FULL TEXT

page 1

page 2

page 4

page 5

page 7

page 9

page 11

research
05/30/2023

NicePIM: Design Space Exploration for Processing-In-Memory DNN Accelerators with 3D-Stacked-DRAM

With the widespread use of deep neural networks(DNNs) in intelligent sys...
research
02/26/2020

DNN-Chip Predictor: An Analytical Performance Predictor for DNN Accelerators with Various Dataflows and Hardware Architectures

The recent breakthroughs in deep neural networks (DNNs) have spurred a t...
research
05/02/2022

Pre-RTL DNN Hardware Evaluator With Fused Layer Support

With the popularity of the deep neural network (DNN), hardware accelerat...
research
12/20/2022

Towards Heterogeneous Multi-core Accelerators Exploiting Fine-grained Scheduling of Layer-Fused Deep Neural Networks

To keep up with the ever-growing performance demand of neural networks, ...
research
07/22/2020

ZigZag: A Memory-Centric Rapid DNN Accelerator Design Space Exploration Framework

Building efficient embedded deep learning systems requires a tight co-de...
research
01/26/2022

DNNFuser: Generative Pre-Trained Transformer as a Generalized Mapper for Layer Fusion in DNN Accelerators

Dataflow/mapping decides the compute and energy efficiency of DNN accele...
research
05/07/2021

ANNETTE: Accurate Neural Network Execution Time Estimation with Stacked Models

With new accelerator hardware for DNN, the computing power for AI applic...

Please sign up or login with your details

Forgot password? Click here to reset