CNN2Gate: Toward Designing a General Framework for Implementation of Convolutional Neural Networks on FPGA

04/06/2020
by Alireza Ghaffari, et al.

Convolutional Neural Networks (CNNs) have a major impact on our society because of the numerous services they provide, but they also require considerable computing power. Graphics processing units (GPUs) can satisfy these requirements; however, their high power consumption and limited external I/O constrain their usability and suitability in industrial and mission-critical scenarios. Recently, research that uses FPGAs to implement CNNs has increased rapidly, owing to the lower power consumption and easy reconfigurability these platforms offer. As a result of the research effort put into topics such as architecture, synthesis, and optimization, new challenges are arising in integrating such hardware solutions with high-level machine learning software libraries. This paper introduces an integrated framework (CNN2Gate) that supports compilation of a CNN model for an FPGA target. CNN2Gate exploits the OpenCL synthesis workflow for FPGAs offered by commercial vendors. It can parse CNN models from several popular high-level machine learning libraries such as Keras, PyTorch, and Caffe2. CNN2Gate extracts the computation flow of the layers, along with the weights and biases, and applies a given fixed-point quantization. It then writes this information in the proper format for OpenCL synthesis tools, which are used to build and run the project on the FPGA. CNN2Gate performs design-space exploration with a reinforcement learning agent and automatically fits the design on different FPGAs with limited logic resources. This paper reports results of automatic synthesis and design-space exploration of AlexNet and VGG-16 on various Intel FPGA platforms. CNN2Gate achieves a latency of 205 ms for VGG-16 and 18 ms for AlexNet on the FPGA.
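The abstract states that CNN2Gate applies a "given" fixed-point quantization to the extracted weights and biases before writing them out for the OpenCL synthesis tools. As a minimal sketch of what such a step can look like (not the exact scheme used by CNN2Gate), the Python/NumPy snippet below converts a floating-point weight tensor into signed fixed-point codes; the function name, bit widths, and rounding/saturation policy are illustrative assumptions.

```python
import numpy as np

def quantize_fixed_point(values, total_bits=8, frac_bits=4):
    """Quantize a float array to a signed fixed-point format.

    Illustrative sketch only: the bit widths and the rounding/saturation
    policy are assumptions, not the quantization scheme of CNN2Gate itself.
    """
    scale = 1 << frac_bits                      # 2**frac_bits
    qmin = -(1 << (total_bits - 1))             # most negative code
    qmax = (1 << (total_bits - 1)) - 1          # most positive code
    # Round to the nearest fixed-point code and saturate to the representable range.
    codes = np.clip(np.round(values * scale), qmin, qmax).astype(np.int32)
    # Return the integer codes plus the values they actually represent.
    return codes, codes.astype(np.float32) / scale

# Example: quantize a small weight tensor extracted from a parsed model.
weights = np.array([[0.731, -0.052], [1.914, -2.300]], dtype=np.float32)
codes, approx = quantize_fixed_point(weights, total_bits=8, frac_bits=4)
print(codes)    # integer codes that would be written out for the OpenCL kernels
print(approx)   # dequantized view of those codes
```

Keeping both the integer codes and their dequantized view makes it straightforward to estimate the accuracy loss introduced by a particular fixed-point format before committing it to hardware.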


