Enabling Flexibility for Sparse Tensor Acceleration via Heterogeneity

01/21/2022
by Eric Qin, et al.

Recently, numerous sparse hardware accelerators for Deep Neural Networks (DNNs), Graph Neural Networks (GNNs), and scientific computing applications have been proposed. A common characteristic among all of these accelerators is that they target tensor algebra (typically matrix multiplications); yet dozens of new accelerators are proposed for every new application. The motivation is that the size and sparsity of the workloads heavily influence which architecture is best for memory and computation efficiency. To satisfy the growing demand for efficient computation across a spectrum of workloads in large data centers, we propose deploying a flexible 'heterogeneous' accelerator, which contains many 'sub-accelerators' (smaller specialized accelerators) working together. To this end, we propose: (1) HARD TACO, a quick and productive C++-to-RTL design flow that generates many types of sub-accelerators for sparse and dense computation, enabling fair design-space exploration, (2) AESPA, a heterogeneous sparse accelerator design template constructed from the sub-accelerators generated by HARD TACO, and (3) a suite of scheduling strategies that map tensor kernels onto heterogeneous sparse accelerators with high efficiency and utilization. AESPA with optimized scheduling achieves 1.96X higher performance and 7.9X better energy-delay product (EDP) than a homogeneous EIE-like accelerator on our diverse workload suite.
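To make the sparsity-aware mapping idea concrete, the C++ sketch below shows a hypothetical greedy scheduler that routes each tensor kernel to the least-loaded sub-accelerator whose preferred sparsity range matches the kernel. The Kernel and SubAccelerator structs, the sparsity thresholds, and the throughput numbers are illustrative assumptions for this sketch, not the scheduling strategies or the AESPA template described in the paper.

#include <iostream>
#include <string>
#include <vector>

// Hypothetical description of a tensor kernel (illustrative fields only).
struct Kernel {
    std::string name;
    double sparsity;   // fraction of zero elements in the operands
    double flops;      // amount of work in the kernel (arbitrary units)
};

// Hypothetical description of a sub-accelerator and the sparsity range it handles well.
struct SubAccelerator {
    std::string name;
    double min_sparsity;
    double max_sparsity;
    double throughput;      // effective flops per cycle in that range
    double busy_until = 0;  // simple availability tracker (cycles)
};

// Greedy sparsity-aware assignment: each kernel goes to the matching,
// least-loaded sub-accelerator. A stand-in for a real scheduler.
void schedule(const std::vector<Kernel>& kernels, std::vector<SubAccelerator>& accs) {
    for (const Kernel& k : kernels) {
        SubAccelerator* best = nullptr;
        for (SubAccelerator& a : accs) {
            bool fits = k.sparsity >= a.min_sparsity && k.sparsity <= a.max_sparsity;
            if (fits && (!best || a.busy_until < best->busy_until))
                best = &a;
        }
        if (!best) best = &accs.front();  // fall back to a default engine
        best->busy_until += k.flops / best->throughput;
        std::cout << k.name << " -> " << best->name
                  << " (finishes at cycle " << best->busy_until << ")\n";
    }
}

int main() {
    std::vector<SubAccelerator> accs = {
        {"dense_systolic", 0.0, 0.5, 8.0},
        {"sparse_outer_product", 0.5, 1.0, 2.0},
    };
    std::vector<Kernel> kernels = {
        {"gemm_dense_layer", 0.10, 1e6},
        {"spmm_pruned_layer", 0.92, 1e6},
    };
    schedule(kernels, accs);
}

A real heterogeneous scheduler would also have to account for tile sizes, on-chip buffer capacity, and data-movement cost between sub-accelerators; the point of the sketch is only that workload sparsity can drive the assignment decision.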
