REQ-YOLO: A Resource-Aware, Efficient Quantization Framework for Object Detection on FPGAs

09/29/2019
by   Caiwen Ding, et al.

Deep neural networks (DNNs), as the basis of object detection, will play a key role in future fully autonomous systems. Such systems demand real-time, energy-efficient DNN inference on power-constrained platforms. Two research thrusts address the performance and energy efficiency of the DNN inference phase: model compression techniques and efficient hardware implementation. Recent work on extremely-low-bit CNNs, such as the binary neural network (BNN) and XNOR-Net, replaces traditional floating-point operations with binary bit operations, significantly reducing memory bandwidth and storage requirements. However, these approaches suffer from non-negligible accuracy loss and underutilize the digital signal processing (DSP) blocks of FPGAs. To overcome these limitations, this paper proposes REQ-YOLO, a resource-aware, systematic weight quantization framework for object detection that considers both the algorithm and the hardware resource aspects. We adopt the block-circulant matrix method and propose a heterogeneous weight quantization using the Alternating Direction Method of Multipliers (ADMM), an effective optimization technique for general, non-convex problems. To achieve real-time, highly efficient implementations on FPGA, we present a detailed hardware implementation of block-circulant matrices for CONV layers and develop an efficient processing element (PE) structure supporting the heterogeneous weight quantization, together with CONV dataflow and pipelining techniques, design optimizations, and a template-based automatic synthesis framework that optimally exploits hardware resources. Experimental results show that the proposed REQ-YOLO framework significantly compresses the YOLO model while introducing only small accuracy degradation.
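The block-circulant matrix method mentioned above compresses a weight matrix by partitioning it into k×k circulant blocks, each defined by a single length-k vector, so a matrix-vector product reduces to FFT-based circular convolutions. The following is a minimal NumPy sketch of that idea only, not the paper's actual FPGA implementation; the function names, the (p, q, k) block layout, and the first-column convention for the defining vectors are illustrative assumptions.

```python
import numpy as np

def block_circulant_matvec(w_blocks, x, k):
    """Multiply a block-circulant weight matrix by x using FFTs.

    w_blocks: array of shape (p, q, k) -- the defining vector (first
              column) of each k-by-k circulant block W_ij.
    x:        input vector of length q*k.
    Returns y of length p*k, where y_i = sum_j IFFT(FFT(w_ij) * FFT(x_j)).
    Storage drops from (p*k)*(q*k) weights to p*q*k, and each block
    multiply costs O(k log k) instead of O(k^2).
    """
    p, q, _ = w_blocks.shape
    x_parts = x.reshape(q, k)
    fw = np.fft.fft(w_blocks, axis=2)        # FFT of each defining vector
    fx = np.fft.fft(x_parts, axis=1)         # FFT of each input chunk
    # Sum over the q column-blocks in the frequency domain, then invert.
    y = np.fft.ifft((fw * fx[None, :, :]).sum(axis=1), axis=1).real
    return y.reshape(p * k)
```

For small sizes the result can be checked against the equivalent dense matrix, where each k×k block is the circulant matrix whose columns are cyclic shifts of the defining vector.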

