ArrayFlex: A Systolic Array Architecture with Configurable Transparent Pipelining

11/22/2022
by C. Peltekis, et al.

Convolutional Neural Networks (CNNs) are the state-of-the-art solution for many deep learning applications. For maximum scalability, their computation should combine high performance and energy efficiency. In practice, the convolutions of each CNN layer are mapped to a matrix multiplication that includes all input features and kernels of each layer and is computed using a systolic array. In this work, we focus on the design of a systolic array with a configurable pipeline, with the goal of selecting an optimal pipeline configuration for each CNN layer. The proposed systolic array, called ArrayFlex, can operate in normal or in shallow pipeline mode, thus balancing the execution time in cycles and the operating clock frequency. By selecting the appropriate pipeline configuration per CNN layer, ArrayFlex reduces the inference latency of state-of-the-art CNNs by 11%, on average, compared to a traditional fixed-pipeline systolic array. Most importantly, this result is achieved while using 13%-23% less power for the same applications, thus offering a combined energy-delay-product efficiency between 1.4x and 1.8x.
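The trade-off described in the abstract, fewer cycles in shallow mode against a faster clock in normal mode, can be illustrated with a toy latency model. The sketch below is only an illustration under stated assumptions: the cycle formula, the clock periods, the array size, and all names (`PipelineMode`, `matmul_cycles`, `pick_mode`) are hypothetical and are not taken from the paper.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class PipelineMode:
    name: str
    clock_period_ns: float  # assumed clock period in this mode (hypothetical value)
    cycles_per_pe: float    # assumed pipeline cycles per processing element traversed


# Hypothetical numbers: shallow mode halves the pipeline depth across the
# array but needs a longer clock period.
MODES = (
    PipelineMode("normal", clock_period_ns=1.0, cycles_per_pe=1.0),
    PipelineMode("shallow", clock_period_ns=1.2, cycles_per_pe=0.5),
)


def matmul_cycles(rows_streamed: int, array_rows: int, array_cols: int,
                  mode: PipelineMode) -> int:
    """Toy cycle count: streaming cycles plus a fill/drain term that
    grows with the array dimensions and the pipeline depth."""
    fill_drain = math.ceil((array_rows + array_cols) * mode.cycles_per_pe)
    return rows_streamed + fill_drain


def latency_ns(rows_streamed: int, array_rows: int, array_cols: int,
               mode: PipelineMode) -> float:
    """Latency is the cycle count times the clock period of the mode."""
    cycles = matmul_cycles(rows_streamed, array_rows, array_cols, mode)
    return cycles * mode.clock_period_ns


def pick_mode(rows_streamed: int, array_rows: int, array_cols: int,
              modes=MODES) -> PipelineMode:
    """Select the mode with the lowest modeled latency for one layer."""
    return min(modes, key=lambda m: latency_ns(rows_streamed, array_rows,
                                               array_cols, m))


if __name__ == "__main__":
    # Two hypothetical layers mapped to a 128x128 array.
    for rows in (16, 10_000):
        best = pick_mode(rows, 128, 128)
        print(rows, best.name, latency_ns(rows, 128, 128, best))
```

Under these assumed numbers, a layer that streams few rows is dominated by fill/drain latency and favors the shallow mode, while a layer that streams many rows favors the faster clock of the normal mode. A real selection would rely on measured cycle counts and clock frequencies for each configuration.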

research
12/21/2021

VW-SDK: Efficient Convolutional Weight Mapping Using Variable Windows for Processing-In-Memory Architectures

With their high energy efficiency, processing-in-memory (PIM) arrays are...
research
07/28/2021

SPOTS: An Accelerator for Sparse Convolutional Networks Leveraging Systolic General Matrix-Matrix Multiplication

This paper proposes a new hardware accelerator for sparse convolutional ...
research
05/23/2017

SCNN: An Accelerator for Compressed-sparse Convolutional Neural Networks

Convolutional Neural Networks (CNNs) have emerged as a fundamental techn...
research
09/06/2023

The Case for Asymmetric Systolic Array Floorplanning

The widespread proliferation of deep learning applications has triggered...
research
06/08/2019

5 Parallel Prism: A topology for pipelined implementations of convolutional neural networks using computational memory

In-memory computing is an emerging computing paradigm that could enable ...
research
07/29/2023

Recent neutrino oscillation result with the IceCube experiment

The IceCube South Pole Neutrino Observatory is a Cherenkov detector inst...
research
02/04/2022

EcoFlow: Efficient Convolutional Dataflows for Low-Power Neural Network Accelerators

Dilated and transposed convolutions are widely used in modern convolutio...
