Flip: Data-Centric Edge CGRA Accelerator

09/19/2023
by Dan Wu, et al.

Coarse-Grained Reconfigurable Arrays (CGRAs) are promising edge accelerators because of their outstanding balance of flexibility, performance, and energy efficiency. Classic CGRAs statically map compute operations onto the processing elements (PEs) and route the data dependencies among the operations through the Network-on-Chip. However, CGRAs are designed for fine-grained, static instruction-level parallelism and struggle to accelerate applications with dynamic and irregular data-level parallelism, such as graph processing. To address this limitation, we present Flip, a novel accelerator that enhances traditional CGRA architectures to boost the performance of graph applications. Flip retains the classic CGRA execution model while introducing a special data-centric mode for efficient graph processing. Specifically, it exploits the natural data parallelism of graph algorithms by mapping graph vertices, rather than operations, onto the PEs and by supporting dynamic routing of temporary data according to the runtime evolution of the graph frontier. Experimental results demonstrate that Flip achieves up to 36× speedup with merely 19 [...] state-of-the-art large-scale graph processors, Flip has similar energy efficiency and 2.2× better area efficiency at a much-reduced power/area budget.
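
The data-centric idea described in the abstract (vertices placed on PEs, with traffic driven by the evolving frontier) can be illustrated with a small software model. This is only a sketch under stated assumptions, not Flip's actual microarchitecture or mapping algorithm: the round-robin placement in `pe_of`, the 2×2 grid, the Manhattan-distance hop count, and the function name `data_centric_bfs` are all hypothetical choices made for illustration.

```python
from collections import defaultdict


def pe_of(vertex, rows, cols):
    """Hypothetical placement: spread vertices round-robin over a rows x cols PE grid."""
    return divmod(vertex % (rows * cols), cols)  # (row, col) of the owning PE


def data_centric_bfs(adj, source, rows=2, cols=2):
    """Frontier-driven BFS in which each vertex is owned by one PE.

    Returns (dist, hops): BFS levels per vertex, and the total number of
    grid hops travelled by messages, assuming (as a simplification) that a
    message travels the Manhattan distance between source and destination PE.
    """
    dist = {source: 0}
    frontier = [source]
    hops = 0
    while frontier:
        # Each active vertex sends an update to the PE that owns each neighbour.
        # Which messages exist depends on the runtime frontier, not on a static schedule.
        inbox = defaultdict(list)
        for u in frontier:
            src = pe_of(u, rows, cols)
            for v in adj.get(u, ()):
                dst = pe_of(v, rows, cols)
                hops += abs(src[0] - dst[0]) + abs(src[1] - dst[1])
                inbox[dst].append((v, dist[u] + 1))
        # Every PE processes its inbox; newly reached vertices form the next frontier.
        next_frontier = []
        for pe in sorted(inbox):
            for v, d in inbox[pe]:
                if v not in dist:
                    dist[v] = d
                    next_frontier.append(v)
        frontier = next_frontier
    return dist, hops


if __name__ == "__main__":
    graph = {0: [1, 2], 1: [3], 2: [3], 3: [4], 4: []}
    levels, total_hops = data_centric_bfs(graph, source=0)
    print(levels)      # BFS level of every reachable vertex
    print(total_hops)  # message hops under the assumed placement
```

In this model the amount and direction of traffic change from one iteration to the next as the frontier moves through the graph, which is the kind of dynamic, irregular behaviour that a statically scheduled operation mapping cannot capture.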


