FPGA-Based CNN Inference Accelerator Synthesized from Multi-Threaded C Software

07/27/2018
by Jin Hee Kim, et al.

A deep-learning inference accelerator is synthesized from a C-language software program parallelized with Pthreads. The software implementation uses the well-known producer/consumer model with parallel threads interconnected by FIFO queues. The LegUp high-level synthesis (HLS) tool synthesizes threads into parallel FPGA hardware, translating software parallelism into spatial parallelism. A complete system is generated where convolution, pooling and padding are realized in the synthesized accelerator, with remaining tasks executing on an embedded ARM processor. The accelerator incorporates reduced precision, and a novel approach for zero-weight-skipping in convolution. On a mid-sized Intel Arria 10 SoC FPGA, peak performance on VGG-16 is 138 effective GOPS.
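
The producer/consumer structure described in the abstract can be illustrated with a small software-only sketch. This is not the authors' code: the names `fifo_t`, `fifo_write`, `fifo_read`, `run_pipeline`, the FIFO depth, and the weight values are all hypothetical, and the zero-weight check is a naive stand-in rather than the paper's zero-skipping scheme. It only shows two Pthreads connected by a bounded FIFO queue, which is the kind of software parallelism the abstract says is translated into parallel hardware.

```c
#include <assert.h>
#include <pthread.h>

#define FIFO_DEPTH 4   /* hypothetical queue depth */
#define N_ITEMS 16     /* hypothetical number of streamed inputs */

/* Bounded FIFO guarded by a mutex and two condition variables. */
typedef struct {
    int buf[FIFO_DEPTH];
    int head, tail, count;
    pthread_mutex_t lock;
    pthread_cond_t not_empty, not_full;
} fifo_t;

typedef struct { fifo_t *in; long acc; int skipped; } consumer_arg_t;

/* Illustrative weights; the zeros are the multiplications that get skipped. */
static const int weights[N_ITEMS] = {1, 0, 2, 0, 0, 3, 0, 1,
                                     0, 0, 1, 2, 0, 0, 0, 1};

static void fifo_init(fifo_t *q) {
    q->head = q->tail = q->count = 0;
    pthread_mutex_init(&q->lock, NULL);
    pthread_cond_init(&q->not_empty, NULL);
    pthread_cond_init(&q->not_full, NULL);
}

/* Blocks while the queue is full, then appends one value. */
static void fifo_write(fifo_t *q, int v) {
    pthread_mutex_lock(&q->lock);
    while (q->count == FIFO_DEPTH)
        pthread_cond_wait(&q->not_full, &q->lock);
    q->buf[q->tail] = v;
    q->tail = (q->tail + 1) % FIFO_DEPTH;
    q->count++;
    pthread_cond_signal(&q->not_empty);
    pthread_mutex_unlock(&q->lock);
}

/* Blocks while the queue is empty, then removes the oldest value. */
static int fifo_read(fifo_t *q) {
    pthread_mutex_lock(&q->lock);
    while (q->count == 0)
        pthread_cond_wait(&q->not_empty, &q->lock);
    int v = q->buf[q->head];
    q->head = (q->head + 1) % FIFO_DEPTH;
    q->count--;
    pthread_cond_signal(&q->not_full);
    pthread_mutex_unlock(&q->lock);
    return v;
}

/* Producer thread: streams the inputs 1..N_ITEMS into the FIFO. */
static void *producer(void *arg) {
    fifo_t *q = arg;
    for (int i = 0; i < N_ITEMS; i++)
        fifo_write(q, i + 1);
    return NULL;
}

/* Consumer thread: multiply-accumulates, skipping zero weights. */
static void *consumer(void *arg) {
    consumer_arg_t *c = arg;
    for (int i = 0; i < N_ITEMS; i++) {
        int x = fifo_read(c->in);
        if (weights[i] == 0) { c->skipped++; continue; }
        c->acc += (long)weights[i] * x;
    }
    return NULL;
}

/* Runs both threads to completion; returns the accumulated sum. */
long run_pipeline(int *skipped_out) {
    fifo_t q;
    consumer_arg_t c = { &q, 0, 0 };
    pthread_t prod, cons;
    fifo_init(&q);
    int rc = pthread_create(&prod, NULL, producer, &q);
    assert(rc == 0);
    rc = pthread_create(&cons, NULL, consumer, &c);
    assert(rc == 0);
    pthread_join(prod, NULL);
    pthread_join(cons, NULL);
    *skipped_out = c.skipped;
    return c.acc;
}
```

To try it, call `run_pipeline` from a small `main` and build with a command such as `cc -pthread`. Because the FIFO is bounded, the producer stalls when the consumer falls behind, which loosely mirrors back-pressure between hardware stages connected by queues.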

research
06/03/2016

GRVI Phalanx: A Massively Parallel RISC-V FPGA Accelerator Accelerator

GRVI is an FPGA-efficient RISC-V RV32I soft processor. Phalanx is a para...
research
11/15/2019

TinyCNN: A Tiny Modular CNN Accelerator for Embedded FPGA

In recent years, Convolutional Neural Network (CNN) based methods have a...
research
10/20/2021

Synthesizing Optimal Parallelism Placement and Reduction Strategies on Hierarchical Systems for Deep Learning

We present a novel characterization of the mapping of multiple paralleli...
research
07/15/2021

Arrow: A RISC-V Vector Accelerator for Machine Learning Inference

In this paper we present Arrow, a configurable hardware accelerator arch...
research
11/21/2018

Synetgy: Algorithm-hardware Co-design for ConvNet Accelerators on Embedded FPGAs

Using FPGAs to accelerate ConvNets has attracted significant attention i...
research
05/06/2018

SqueezeJet: High-level Synthesis Accelerator Design for Deep Convolutional Neural Networks

Deep convolutional neural networks have dominated the pattern recognitio...
research
10/26/2018

A Scalable Pipelined Dataflow Accelerator for Object Region Proposals on FPGA Platform

Region proposal is critical for object detection while it usually poses ...
