Accelerating Deterministic and Stochastic Binarized Neural Networks on FPGAs Using OpenCL

05/15/2019
by   Corey Lammie, et al.
Recent technological advances have greatly increased the computing power, memory, and speed of modern Central Processing Units (CPUs), Graphics Processing Units (GPUs), and Field Programmable Gate Arrays (FPGAs). Consequently, the performance and complexity of Artificial Neural Networks (ANNs) are burgeoning. While GPU-accelerated Deep Neural Networks (DNNs) currently offer state-of-the-art performance, they consume large amounts of power. Training such networks on CPUs is inefficient, as data throughput and parallel computation are limited. FPGAs are considered a suitable candidate for performance-critical, low-power systems, e.g. Internet of Things (IoT) edge devices. Using the Xilinx SDAccel or Intel FPGA SDK for OpenCL development environments, networks described using the high-level OpenCL framework can be accelerated on heterogeneous platforms. Moreover, the resource utilization and power consumption of DNNs can be further reduced by utilizing regularization techniques that binarize network weights. In this paper, we introduce, to the best of our knowledge, the first FPGA-accelerated stochastically binarized DNN implementations, and compare them to implementations accelerated using both GPUs and FPGAs. Our developed networks are trained and benchmarked using the popular MNIST and CIFAR-10 datasets, and achieve near state-of-the-art performance, while offering a >16-fold improvement in power consumption compared to conventional GPU-accelerated networks. Both our FPGA-accelerated deterministic and stochastic BNNs reduce inference times on MNIST and CIFAR-10 by >9.89x and >9.91x, respectively.
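As background, the two binarization schemes the abstract contrasts can be sketched as follows. This is a minimal NumPy illustration of the standard BNN formulation (not the authors' OpenCL kernels): deterministic binarization takes the sign of each weight, while stochastic binarization sets a weight to +1 with probability given by a hard sigmoid of its real value. The `hard_sigmoid` probability function is the one commonly used in the BNN literature and is an assumption here.

```python
import numpy as np

def binarize_deterministic(w):
    # Deterministic binarization: sign of each weight (zero mapped to +1).
    return np.where(w >= 0, 1.0, -1.0)

def hard_sigmoid(w):
    # Hard sigmoid commonly used in BNNs: clip((w + 1) / 2, 0, 1).
    return np.clip((w + 1.0) / 2.0, 0.0, 1.0)

def binarize_stochastic(w, rng):
    # Stochastic binarization: +1 with probability hard_sigmoid(w), else -1.
    p = hard_sigmoid(w)
    return np.where(rng.random(w.shape) < p, 1.0, -1.0)

rng = np.random.default_rng(0)
w = np.array([-1.5, -0.2, 0.0, 0.3, 2.0])
print(binarize_deterministic(w))   # [-1. -1.  1.  1.  1.]
print(binarize_stochastic(w, rng))
```

During training, the real-valued weights are retained and updated, and the binarized copies are used in the forward and backward passes; replacing multiplications with sign flips is what makes these networks attractive for low-power FPGA acceleration.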
