TensorLib: A Spatial Accelerator Generation Framework for Tensor Algebra

04/26/2021
by Liancheng Jia, et al.

Tensor algebra finds applications in various domains, and these applications, especially when accelerated on spatial hardware accelerators, can deliver high performance and low power. Spatial hardware accelerators exhibit a complex design space. Prior approaches based on manual implementation suffer from low programming productivity, rendering thorough design space exploration impossible. In this paper, we propose TensorLib, a framework for generating spatial hardware accelerators for tensor algebra applications. TensorLib is motivated by the observation that different dataflows share common hardware modules, which can be reused across different designs. To build such a framework, TensorLib first explores different dataflows using Space-Time Transformation, which compactly represents a hardware dataflow as a simple transformation matrix. Next, we identify the common structures of different dataflows and build parameterized hardware module templates with Chisel. Our generation framework can select the needed hardware modules for each dataflow, connect the modules using a specified interconnection pattern, and automatically generate the complete hardware accelerator design. TensorLib remarkably improves productivity in the development and optimization of spatial hardware architectures, providing a rich design space with trade-offs in performance, area, and power. Experiments show that TensorLib can automatically generate hardware designs with different dataflows and achieve a 21% performance improvement on FPGA compared to the state of the art.
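To make the idea of describing a dataflow with a transformation matrix concrete, the following Python sketch applies a 3x3 space-time matrix to the loop nest of a matrix multiplication C[i][j] += A[i][k] * B[k][j]. It is an illustration under stated assumptions, not TensorLib's implementation: the names `mat_vec`, `classify`, and `GEMM_REUSE`, the two example matrices, and the classification rule (systolic, multicast, stationary) follow a common convention for space-time mappings and are not taken from the abstract.

```python
# Illustrative sketch only; names and classification rule are assumptions,
# not TensorLib's API.

def mat_vec(T, v):
    """Multiply a matrix (tuple of rows) by a vector."""
    return tuple(sum(t * x for t, x in zip(row, v)) for row in T)

# Loop order is (i, j, k). Each tensor is reused along the loop index
# that does not appear in its subscripts.
GEMM_REUSE = {
    "A": (0, 1, 0),  # A[i][k] does not depend on j
    "B": (1, 0, 0),  # B[k][j] does not depend on i
    "C": (0, 0, 1),  # C[i][j] does not depend on k
}

def classify(T, reuse_dir):
    """Classify how a tensor behaves under space-time matrix T.

    The first rows of T give the PE coordinates and the last row gives
    the time step, so T applied to the reuse direction yields
    (dx, dy, dt): the displacement in space and time between two
    iterations that touch the same tensor element.
    """
    *space, time = mat_vec(T, reuse_dir)
    moves = any(s != 0 for s in space)
    if moves and time != 0:
        return "systolic"    # element hops to a neighbouring PE each step
    if moves:
        return "multicast"   # element is needed by several PEs at once
    if time != 0:
        return "stationary"  # element stays in one PE across time steps
    return "undefined"       # T is singular along this direction

# Hypothetical example matrices: rows are (x, y, t) as functions of (i, j, k).
SKEWED = ((1, 0, 0), (0, 1, 0), (1, 1, 1))    # x = i, y = j, t = i + j + k
IDENTITY = ((1, 0, 0), (0, 1, 0), (0, 0, 1))  # x = i, y = j, t = k

for name, T in (("skewed", SKEWED), ("identity", IDENTITY)):
    print(name, {tensor: classify(T, d) for tensor, d in GEMM_REUSE.items()})
```

Under these assumptions, changing a single matrix switches the design from a systolic array with stationary outputs to one that multicasts both inputs, which is the kind of compact dataflow description the abstract refers to.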

