A Data-Center FPGA Acceleration Platform for Convolutional Neural Networks

09/17/2019
by Xiaoyu Yu, et al.

Deep learning workloads are bringing intensive computation into data centers. To balance compute efficiency, performance, and total cost of ownership (TCO), a field-programmable gate array (FPGA) with reconfigurable logic provides an acceptable acceleration capacity and is compatible with diverse computation-sensitive tasks in the cloud. In this paper, we develop an FPGA acceleration platform that leverages a unified framework architecture for general-purpose convolutional neural network (CNN) inference acceleration in a data center. To overcome the computation bound, 4,096 DSPs are assembled and shaped as supertile units (SUs) for different types of convolution, which provide up to 4.2 TOP/s of 16-bit fixed-point performance at 500 MHz. An interleaved-task-dispatching method is proposed to map the computation across the SUs, and the memory bound is addressed by a dispatching-assembling buffering model and broadcast caches. For the various non-convolution operators, a filter processing unit is designed to handle general-purpose filter-like and pointwise operators. In the experiments, the performance of CNN models running on server-class CPUs, a GPU, and an FPGA is compared. The results show that our design achieves the best FPGA peak performance and a throughput at the same level as that of the state-of-the-art GPU in data centers, with more than 50 times lower latency.
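As a rough sanity check on the peak figure quoted above, the following sketch estimates theoretical throughput from the DSP count and clock rate. It assumes each DSP completes one multiply-accumulate per cycle and that one multiply-accumulate counts as two operations; neither assumption is stated in the abstract, and the function name `peak_ops_per_second` is hypothetical.

```python
def peak_ops_per_second(num_dsps: int, clock_hz: float,
                        ops_per_dsp_per_cycle: int = 2) -> float:
    """Theoretical peak throughput in operations per second.

    Assumption (not from the abstract): each DSP performs one
    multiply-accumulate per cycle, counted as 2 operations.
    """
    return num_dsps * ops_per_dsp_per_cycle * clock_hz


# Figures taken from the abstract: 4,096 DSPs at 500 MHz.
peak = peak_ops_per_second(num_dsps=4096, clock_hz=500e6)
print(f"{peak / 1e12:.3f} TOP/s")
```

Under these assumptions the estimate lands slightly below the quoted 4.2 TOP/s. The abstract does not explain the difference, so the exact accounting behind the reported number should be taken from the full paper.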

research
02/02/2021

Why is FPGA-GPU Heterogeneity the Best Option for Embedded Deep Neural Networks?

Graphics Processing Units (GPUs) are currently the dominating programmab...
research
07/16/2020

FTRANS: Energy-Efficient Acceleration of Transformers using FPGA

In natural language processing (NLP), the "Transformer" architecture was...
research
11/07/2020

Strawberry Detection Using a Heterogeneous Multi-Processor Platform

Over the last few years, the number of precision farming projects has in...
research
12/23/2020

Overview of FPGA deep learning acceleration based on convolutional neural network

In recent years, deep learning has become more and more mature, and as a...
research
11/17/2020

FPGA deep learning acceleration based on convolutional neural network

In view of the large amount of calculation and long calculation time of ...
research
05/30/2018

Harmonic-summing Module of SKA on FPGA--Optimising the Irregular Memory Accesses

The Square Kilometre Array (SKA), which will be the world's largest radi...
research
02/09/2020

FastWave: Accelerating Autoregressive Convolutional Neural Networks on FPGA

Autoregressive convolutional neural networks (CNNs) have been widely exp...
