An Efficient FPGA-based Accelerator for Deep Forest

11/04/2022
by   Mingyu Zhu, et al.
0

Deep Forest is a prominent machine learning algorithm known for its high accuracy in forecasting. Compared with deep neural networks, Deep Forest has almost no multiplication operations and has better performance on small datasets. However, due to the deep structure and large forest quantity, it suffers from large amounts of calculation and memory consumption. In this paper, an efficient hardware accelerator is proposed for deep forest models, which is also the first work to implement Deep Forest on FPGA. Firstly, a delicate node computing unit (NCU) is designed to improve inference speed. Secondly, based on NCU, an efficient architecture and an adaptive dataflow are proposed, in order to alleviate the problem of node computing imbalance in the classification process. Moreover, an optimized storage scheme in this design also improves hardware utilization and power efficiency. The proposed design is implemented on an FPGA board, Intel Stratix V, and it is evaluated by two typical datasets, ADULT and Face Mask Detection. The experimental results show that the proposed design can achieve around 40x speedup compared to that on a 40 cores high performance x86 CPU.

READ FULL TEXT
research
05/23/2016

DLAU: A Scalable Deep Learning Accelerator Unit on FPGA

As the emerging field of machine learning, deep learning shows excellent...
research
12/28/2021

FPGA Based Accelerator for Neural Networks Computation with Flexible Pipelining

FPGA is appropriate for fix-point neural networks computing due to high ...
research
08/28/2022

FFCNN: Fast FPGA based Acceleration for Convolution neural network inference

We present a new efficient OpenCL-based Accelerator for large scale Conv...
research
12/15/2021

N3H-Core: Neuron-designed Neural Network Accelerator via FPGA-based Heterogeneous Computing Cores

Accelerating the neural network inference by FPGA has emerged as a popul...
research
08/29/2019

High Performance Scalable FPGA Accelerator for Deep Neural Networks

Low-precision is the first order knob for achieving higher Artificial In...
research
07/11/2022

FSHMEM: Supporting Partitioned Global Address Space on FPGAs for Large-Scale Hardware Acceleration Infrastructure

By providing highly efficient one-sided communication with globally shar...
research
08/08/2023

Novel Area-Efficient and Flexible Architectures for Optimal Ate Pairing on FPGA

While FPGA is a suitable platform for implementing cryptographic algorit...

Please sign up or login with your details

Forgot password? Click here to reset