Towards Design Space Exploration and Optimization of Fast Algorithms for Convolutional Neural Networks (CNNs) on FPGAs

03/05/2019
by Afzal Ahmad, et al.

Convolutional Neural Networks (CNNs) have gained widespread popularity in computer vision and image processing. Because of the huge computational requirements of CNNs, dedicated hardware implementations are being explored to improve their performance. Hardware platforms such as Field Programmable Gate Arrays (FPGAs) are widely used to design parallel architectures for this purpose. In this paper, we analyze Winograd minimal filtering algorithms, also known as fast convolution algorithms, to reduce the arithmetic complexity of the convolutional layers of CNNs. We explore a complex design space to find the sets of parameters that result in improved throughput and power-efficiency. We also design a pipelined and parallel Winograd convolution engine that improves throughput and power-efficiency while reducing the computational complexity of the overall system. Our proposed designs show up to 4.75× and 1.44× improvements in throughput and power-efficiency, respectively, in comparison to the state-of-the-art design, while using approximately 2.67× more multipliers. Furthermore, we obtain savings of up to 53.6% in logic resources compared with the state-of-the-art implementation.
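To illustrate how Winograd minimal filtering reduces arithmetic complexity, the following minimal sketch computes the textbook one-dimensional case F(2,3): two outputs of a 3-tap filter using 4 multiplications instead of 6. The abstract does not say which tile sizes the paper's engine uses, so the choice of F(2,3) and the function names `direct_fir` and `winograd_f2_3` are assumptions made only for illustration, not the authors' design.

```python
def direct_fir(d, g):
    """Direct form: 2 outputs of a 3-tap filter over 4 inputs (6 multiplications)."""
    return [
        d[0] * g[0] + d[1] * g[1] + d[2] * g[2],
        d[1] * g[0] + d[2] * g[1] + d[3] * g[2],
    ]


def winograd_f2_3(d, g):
    """Winograd F(2,3): the same 2 outputs with 4 data-dependent multiplications."""
    # Filter transform. It depends only on the weights, so in a CNN layer
    # it can be computed once and reused for every input tile.
    u0 = g[0]
    u1 = (g[0] + g[1] + g[2]) / 2
    u2 = (g[0] - g[1] + g[2]) / 2
    u3 = g[2]

    # Input transform (additions only) followed by 4 element-wise products.
    m1 = (d[0] - d[2]) * u0
    m2 = (d[1] + d[2]) * u1
    m3 = (d[2] - d[1]) * u2
    m4 = (d[1] - d[3]) * u3

    # Output transform (additions only).
    return [m1 + m2 + m3, m2 - m3 - m4]


if __name__ == "__main__":
    tile = [1.0, 2.0, 3.0, 4.0]      # hypothetical input tile
    weights = [2.0, 4.0, 6.0]        # hypothetical filter taps
    print(direct_fir(tile, weights))
    print(winograd_f2_3(tile, weights))
```

Both functions return the same values for the same inputs; the saving comes from trading multiplications for additions, which matters on FPGAs where multipliers are a scarcer resource than adders.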

research
12/29/2018

Quantized Guided Pruning for Efficient Hardware Implementations of Convolutional Neural Networks

Convolutional Neural Networks (CNNs) are state-of-the-art in numerous co...
research
12/11/2017

Multi-Mode Inference Engine for Convolutional Neural Networks

During the past few years, interest in convolutional neural networks (CN...
research
06/25/2019

A Winograd-based Integrated Photonics Accelerator for Convolutional Neural Networks

Neural Networks (NNs) have become the mainstream technology in the artif...
research
09/30/2015

Fast Algorithms for Convolutional Neural Networks

Deep convolutional neural networks take GPU days of compute time to trai...
research
03/04/2019

Efficient Winograd or Cook-Toom Convolution Kernel Implementation on Widely Used Mobile CPUs

The Winograd or Cook-Toom class of algorithms help to reduce the overall...
research
04/12/2020

Minimal Filtering Algorithms for Convolutional Neural Networks

In this paper, we present several resource-efficient algorithmic solutio...
research
10/13/2019

ERNet Family: Hardware-Oriented CNN Models for Computational Imaging Using Block-Based Inference

Convolutional neural networks (CNNs) demand huge DRAM bandwidth for comp...
