DLAU: A Scalable Deep Learning Accelerator Unit on FPGA

05/23/2016
by Chao Wang, et al.

As an emerging field of machine learning, deep learning shows excellent ability in solving complex learning problems. However, network sizes keep growing to meet the demands of practical applications, which poses a significant challenge to constructing high-performance implementations of deep learning neural networks. To improve performance while keeping power cost low, in this paper we design DLAU, a scalable accelerator architecture for large-scale deep learning networks that uses an FPGA as the hardware prototype. The DLAU accelerator employs three pipelined processing units to improve throughput and uses tiling techniques to exploit locality in deep learning applications. Experimental results on a state-of-the-art Xilinx FPGA board demonstrate that the DLAU accelerator achieves up to 36.1x speedup compared to Intel Core2 processors, with power consumption of 234 mW.
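
To make the tiling idea concrete, here is a minimal software sketch of one fully connected layer computed tile by tile. It is not the DLAU hardware design: the function names, the `tile_size` parameter, and the choice of a sigmoid activation are assumptions made for illustration. The sketch only shows that splitting the input vector into fixed-size tiles, accumulating partial sums per output neuron, and applying the activation once at the end gives the same result as the untiled computation, while each step touches only a small block of weights and inputs.

```python
import math


def sigmoid(x):
    """Logistic activation (assumed here; the abstract does not name one)."""
    return 1.0 / (1.0 + math.exp(-x))


def layer_untiled(weights, inputs):
    """Reference: each output is sigmoid(dot(row, inputs))."""
    return [sigmoid(sum(w * x for w, x in zip(row, inputs))) for row in weights]


def layer_tiled(weights, inputs, tile_size):
    """Same layer, processed one input tile at a time.

    weights   -- list of rows, one row per output neuron
    inputs    -- list of input activations
    tile_size -- hypothetical tile width; a hardware design would fix this
                 to match its on-chip buffer size
    """
    n_out = len(weights)
    n_in = len(inputs)
    partial = [0.0] * n_out  # running partial sums, one per output neuron

    for start in range(0, n_in, tile_size):
        end = min(start + tile_size, n_in)
        x_tile = inputs[start:end]  # only this slice of the input is needed
        for j in range(n_out):
            w_tile = weights[j][start:end]  # matching slice of the weights
            partial[j] += sum(w * x for w, x in zip(w_tile, x_tile))

    # The activation is applied once, after all tiles have been accumulated.
    return [sigmoid(p) for p in partial]


if __name__ == "__main__":
    W = [[0.1, -0.2, 0.3, 0.4, -0.5],
         [0.5, 0.4, -0.3, 0.2, 0.1]]
    x = [1.0, 2.0, 3.0, 4.0, 5.0]
    print(layer_untiled(W, x))
    print(layer_tiled(W, x, tile_size=2))
```

The three stages in the sketch (tile multiply, partial-sum accumulation, activation) are independent enough to be overlapped in a pipeline, which is one plausible reading of how pipelined units and tiling fit together; the abstract itself does not spell out the division of work.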

