FlexSA: Flexible Systolic Array Architecture for Efficient Pruned DNN Model Training

04/27/2020
by Sangkug Lym, et al.

Modern deep learning models have high memory and computation costs. Structured model pruning is commonly used to make them faster and more memory-efficient. We find that pruning a model on a common training accelerator with large systolic arrays is extremely performance-inefficient. To make a systolic array efficient for pruning and training, we propose FlexSA, a flexible systolic array architecture. FlexSA dynamically reconfigures the systolic array structure and offers multiple sub-systolic operating modes, which are designed for energy- and memory bandwidth-efficient processing of tensors with different sizes and shapes. We also present a compilation heuristic for tiling matrix-multiplication-and-accumulation operations in a training workload to best utilize the resources of FlexSA. Based on our evaluation, FlexSA with the proposed compilation heuristic improves compute resource utilization of pruning and training modern CNN models by 37% compared to an accelerator with a large systolic array. FlexSA also improves on-chip data reuse by 1.7X, saving 28% ...
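
The utilization problem described in the abstract can be illustrated with a toy calculation. The sketch below is not the paper's evaluation model or compilation heuristic. It assumes a weight-stationary array onto which a K x N weight matrix is tiled, and it counts only how many processing elements hold useful weights. The function name `pe_utilization`, the layer sizes, and the array sizes are all hypothetical, and the model ignores data reuse and memory bandwidth, which the abstract names as design goals.

```python
import math


def pe_utilization(k: int, n: int, rows: int, cols: int) -> float:
    """Fraction of PEs holding useful weights (toy model, an assumption).

    A K x N weight matrix is cut into tiles of at most rows x cols and
    each tile occupies the whole array, so edge tiles leave PEs idle.
    """
    tiles = math.ceil(k / rows) * math.ceil(n / cols)
    return (k * n) / (tiles * rows * cols)


# Hypothetical layer: structured pruning shrinks 128 output channels to 40.
dense_large = pe_utilization(128, 128, 128, 128)   # unpruned, one large array
pruned_large = pe_utilization(128, 40, 128, 128)   # pruned, one large array
pruned_small = pe_utilization(128, 40, 64, 64)     # pruned, 64 x 64 sub-array

print(dense_large, pruned_large, pruned_small)
```

Under these assumptions the pruned layer fills fewer than a third of the PEs in the large array, while a smaller sub-array roughly doubles that fraction. This is only the motivation side of the trade-off; it says nothing about the reuse and energy costs of operating smaller arrays.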

research
10/13/2020

High Area/Energy Efficiency RRAM CNN Accelerator with Kernel-Reordering Weight Mapping Scheme Based on Pattern Pruning

Resistive Random Access Memory (RRAM) is an emerging device for processi...
research
12/16/2019

A flexible FPGA accelerator for convolutional neural networks

Though CNNs are highly parallel workloads, in the absence of efficient o...
research
11/28/2021

Search for Optimal Systolic Arrays: A Comprehensive Automated Exploration Framework and Lessons Learned

Systolic arrays have been widely used for accelerating HPC and deep lear...
research
12/02/2021

Memory-efficient array redistribution through portable collective communication

Modern large-scale deep learning workloads highlight the need for parall...
research
02/11/2018

Analyzing and Mitigating the Impact of Permanent Faults on a Systolic Array Based Neural Network Accelerator

Due to their growing popularity and computational cost, deep neural netw...
research
06/24/2020

On the Difficulty of Designing Processor Arrays for Deep Neural Networks

Systolic arrays are a promising computing concept which is in particular...
research
02/26/2021

Tensors Fitting Perfectly

Multidimensional arrays (NDArrays) are a central abstraction in modern s...
