A scalable and efficient convolutional neural network accelerator using HLS for a System on Chip design

04/27/2020
by   Kim Bjerge, et al.

This paper presents a configurable Convolutional Neural Network Accelerator (CNNA) for a System on Chip (SoC) design. The goal was to accelerate inference of different deep learning networks on an embedded SoC platform. The presented CNNA has a scalable architecture that uses High Level Synthesis (HLS) and SystemC for the hardware accelerator. It is able to accelerate any Convolutional Neural Network (CNN) exported from Python and supports a combination of convolutional, max-pooling, and fully connected layers. A training method with fixed-point quantized weights is proposed and presented in the paper. The CNNA is template-based, enabling it to scale for different targets of the Xilinx Zynq platform. This approach enables design space exploration: several configurations of the CNNA can be explored during C- and RTL-simulation to fit it to the desired platform and model. The CNN VGG16 was used to test the solution on a Xilinx Ultra96 board using PYNQ. Training with an auto-scaled fixed-point Q2.14 format gave a high level of accuracy compared to a similar floating-point model. The CNNA performed inference in 2.0 seconds with an average power consumption of 2.63 W, which corresponds to a power efficiency of 6.0 GOPS/W.
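To make the Q2.14 format mentioned in the abstract concrete, the following is a minimal sketch of fixed-point quantization in Python. It assumes Q2.14 means a 16-bit signed word with 14 fractional bits, giving a representable range of roughly [-2, 2). The helper names `quantize_q2_14` and `dequantize_q2_14` are hypothetical, and the sketch does not reproduce the paper's auto-scaling or training method, which the abstract does not describe in detail.

```python
import numpy as np

# Assumption: Q2.14 is a 16-bit signed word with 14 fractional bits.
FRAC_BITS = 14
SCALE = 1 << FRAC_BITS                      # 2**14 = 16384
Q_MIN, Q_MAX = -(1 << 15), (1 << 15) - 1    # int16 limits


def quantize_q2_14(x):
    """Round floats to the nearest Q2.14 code, saturating at the int16 limits."""
    q = np.round(np.asarray(x, dtype=np.float64) * SCALE)
    return np.clip(q, Q_MIN, Q_MAX).astype(np.int16)


def dequantize_q2_14(q):
    """Convert Q2.14 integer codes back to floating point."""
    return np.asarray(q, dtype=np.float64) / SCALE


# Hypothetical weight values chosen only for illustration.
weights = np.array([0.5, -1.25, 0.1, 3.0])
q = quantize_q2_14(weights)
restored = dequantize_q2_14(q)
```

Values inside the representable range are recovered to within half of one least significant bit (2^-15), while values outside it, such as 3.0 here, saturate at the largest code. That saturation is one reason a scaling step before quantization can matter.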

research
04/11/2016

Hardware-oriented Approximation of Convolutional Neural Networks

High computational complexity hinders the widespread usage of Convolutio...
research
07/21/2022

Hardware-Efficient Template-Based Deep CNNs Accelerator Design

Acceleration of Convolutional Neural Network (CNN) on edge devices has r...
research
12/01/2022

TCN-CUTIE: A 1036 TOp/s/W, 2.72 uJ/Inference, 12.2 mW All-Digital Ternary Accelerator in 22 nm FDX Technology

Tiny Machine Learning (TinyML) applications impose uJ/Inference constrai...
research
12/16/2019

A flexible FPGA accelerator for convolutional neural networks

Though CNNs are highly parallel workloads, in the absence of efficient o...
research
04/06/2020

CNN2Gate: Toward Designing a General Framework for Implementation of Convolutional Neural Networks on FPGA

Convolutional Neural Networks (CNNs) have a major impact on our society ...
research
06/23/2023

FPGA Implementation of Convolutional Neural Network for Real-Time Handwriting Recognition

Machine Learning (ML) has recently been a skyrocketing field in Computer...
research
02/19/2021

BPLight-CNN: A Photonics-based Backpropagation Accelerator for Deep Learning

Training deep learning networks involves continuous weight updates acros...
