An Energy-Efficient FPGA-based Deconvolutional Neural Networks Accelerator for Single Image Super-Resolution

01/18/2018
by Jung-Woo Chang, et al.

Convolutional neural networks (CNNs) outperform conventional machine learning algorithms in many computer vision applications. In recent years, FPGA-based CNN accelerators have been proposed to optimize performance and power efficiency. Most accelerators are designed for object detection and recognition algorithms that operate on low-resolution (LR) images. However, image super-resolution (SR) cannot be implemented in real time on a typical accelerator because of the long execution cycles required to generate high-resolution (HR) images, such as those used in ultra-high-definition (UHD) systems. In this paper, we propose a novel CNN accelerator with efficient parallelization methods for SR applications. First, we propose a new methodology, based on trained filters, for optimizing the deconvolutional neural networks (DCNNs) used to enlarge feature maps. Second, we propose a novel method to optimize the CNN dataflow using on-chip memory so that the SR algorithm can be driven at low power in display applications. Third, we propose a two-stage quantization algorithm to determine the optimized hardware size for a limited number of DSPs. Finally, we present an energy-efficient architecture for SR and validate it on a mobile panel with quad-high-definition (QHD) resolution. Our experimental results show that, with the same hardware resources, the proposed DCNN accelerator achieves a throughput up to 108 times greater than that of the conventional DCNN accelerator. In addition, our SR system achieves energy efficiencies of 92.7 GOPS/W, 173.5 GOPS/W, and 286.8 GOPS/W when the scale factors for SR are 2, 3, and 4, respectively. Furthermore, we demonstrate that our system can restore HR images with a higher peak signal-to-noise ratio (PSNR) than conventional SR systems.
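The abstract does not spell out the deconvolution operation itself. As a rough illustration of how a deconvolutional (transposed-convolution) layer enlarges a feature map, the following is a minimal sketch in plain Python. It assumes a single channel, no padding, and a hand-picked kernel; the function name `deconv2d` and the example values are hypothetical, and this is the textbook operation rather than the optimized method proposed in the paper.

```python
def deconv2d(x, k, stride):
    """Textbook transposed convolution of a single-channel map x with kernel k.

    Each input pixel scatters a scaled copy of the kernel into the output,
    with copies placed `stride` pixels apart; overlapping copies are summed.
    """
    h, w = len(x), len(x[0])
    kh, kw = len(k), len(k[0])
    out_h = (h - 1) * stride + kh
    out_w = (w - 1) * stride + kw
    out = [[0.0] * out_w for _ in range(out_h)]
    for i in range(h):
        for j in range(w):
            for a in range(kh):
                for b in range(kw):
                    out[i * stride + a][j * stride + b] += x[i][j] * k[a][b]
    return out


# Illustrative values only: a 2x2 LR map upscaled by a factor of 2.
lr = [[1.0, 2.0],
      [3.0, 4.0]]
kernel = [[1.0, 1.0],
          [1.0, 1.0]]
hr = deconv2d(lr, kernel, stride=2)
```

With a 2x2 all-ones kernel and stride 2, the copies do not overlap, so the result is a 4x4 map in which each LR pixel is replicated into a 2x2 block. A trained kernel would produce a learned interpolation instead.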
