Towards a Uniform Architecture for the Efficient Implementation of 2D and 3D Deconvolutional Neural Networks on FPGAs

03/06/2019
by Deguang Wang, et al.

Three-dimensional deconvolution is widely used in many computer vision applications. However, most previous works have focused only on accelerating 2D deconvolutional neural networks (DCNNs) on FPGAs, while the acceleration of 3D DCNNs has not been studied in depth, because they have higher computational complexity and sparsity than 2D DCNNs. In this paper, we address the acceleration of both 2D and 3D DCNNs on FPGAs by proposing efficient schemes for mapping them onto a uniform architecture. Implementing our design on the Xilinx VC709 platform for four real-life 2D and 3D DCNNs, we achieve up to 3.0 TOPS with high hardware efficiency. Compared with CPU and GPU solutions, our design improves throughput by up to 63.3X relative to a CPU solution and energy efficiency by up to 8.3X relative to a GPU solution.
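The abstract does not say where the extra sparsity of 3D DCNNs comes from. One common way to view deconvolution, assumed here and not necessarily the mapping scheme the authors use, is as zero insertion between input activations followed by an ordinary convolution. Under that assumption, the minimal sketch below counts how many entries of the zero-inserted input are zeros in 2D and in 3D. The function name `zero_inserted_sparsity` and its parameters are hypothetical, chosen only for illustration.

```python
from itertools import product

def zero_inserted_sparsity(n, stride, dims):
    """Fraction of zeros after inserting (stride - 1) zeros between
    neighbouring activations of a dense n^dims input (no edge padding).

    Assumption: deconvolution is modelled as zero insertion plus convolution;
    this is an illustrative model, not the paper's stated method.
    """
    size = (n - 1) * stride + 1  # extent per dimension after zero insertion
    total = 0
    nonzero = 0
    for idx in product(range(size), repeat=dims):
        total += 1
        # An original activation sits only where every coordinate
        # is a multiple of the stride; all other positions are zeros.
        if all(i % stride == 0 for i in idx):
            nonzero += 1
    return 1 - nonzero / total

sparsity_2d = zero_inserted_sparsity(n=4, stride=2, dims=2)
sparsity_3d = zero_inserted_sparsity(n=4, stride=2, dims=3)
print(f"2D: {sparsity_2d:.3f}, 3D: {sparsity_3d:.3f}")  # 2D: 0.673, 3D: 0.813
```

For a 4-wide input and stride 2, about 67% of the zero-inserted 2D input is zeros, against about 81% in 3D. This is consistent with the abstract's remark that 3D DCNNs are sparser, although the paper's own figures may be derived differently.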

