Capstan: A Vector RDA for Sparsity

04/26/2021
by Alexander Rucker, et al.

This paper proposes Capstan: a scalable, parallel-patterns-based, reconfigurable dataflow accelerator (RDA) for sparse and dense tensor applications. Instead of designing for one application, we start with common sparse data formats, each of which supports multiple applications. Using a declarative programming model, Capstan supports application-independent sparse iteration and memory primitives that can be mapped to vectorized, high-performance hardware. We optimize random-access sparse memories with configurable out-of-order execution to increase SRAM random-access throughput from 32% to 80%. For a variety of sparse applications, Capstan with DDR4 memory is 18x faster than a multi-core CPU baseline, while Capstan with HBM2 memory is 16x faster than an Nvidia V100 GPU. For sparse applications that can be mapped to Plasticine, a recent dense RDA, Capstan is 7.6x to 365x faster and only 16% larger.


