At-Scale Evaluation of Weight Clustering to Enable Energy-Efficient Object Detection

02/28/2023
by   Martí Caro, et al.

Accelerators implementing Deep Neural Networks for image-based object detection operate on large volumes of data, since they must fetch both images and neural network parameters, especially when processing video streams. Fetching all those data leads to high power dissipation and bandwidth requirements. While some solutions exist to mitigate power and bandwidth demands for data fetching, they are often assessed in limited evaluations at a scale much smaller than that of the target application, which makes it hard to find the best tradeoff in practice. This paper sets up the infrastructure to assess at scale a key power and bandwidth optimization, weight clustering, for You Only Look Once v3 (YOLOv3), a neural network-based object detection system, using videos of real driving conditions. Our assessment shows that accelerators such as systolic arrays with an Output Stationary architecture are highly effective when combined with weight clustering. In particular, applying weight clustering independently per neural network layer, and using between 32 (5-bit) and 256 (8-bit) weights, achieves an accuracy close to that of the original YOLOv3 weights (32-bit weights). Such bit-count reduction of the weights allows shaving bandwidth requirements down to 30 [...] 45 [...] operations is much smaller than DRAM data fetching, and (ii) designing accelerators appropriately can ensure that most of the data fetched corresponds to neural network weights, where clustering can be applied. Overall, our at-scale assessment provides key results to architect camera-based object detection accelerators by putting together a real-life application (YOLOv3) and real driving videos in a unified setup, so that the trends observed are reliable.
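To make the per-layer clustering idea concrete, here is a minimal sketch. It assumes plain 1-D k-means (Lloyd's algorithm) as the clustering method, which the abstract does not specify, and the function names `cluster_layer_weights`, `index_bits`, and `weight_traffic_ratio` are hypothetical. Each layer gets its own small codebook of float32 centroids, and every weight is replaced by a short index into that codebook. The traffic ratio covers weight indices only and ignores the codebook itself and all non-weight data, so it is not the overall bandwidth figure reported by the paper.

```python
import numpy as np


def cluster_layer_weights(weights, n_clusters, n_iters=20):
    """Cluster one layer's weights with 1-D k-means (assumed method).

    Returns (codebook, indices): the codebook holds n_clusters float32
    centroids, and indices has the same shape as weights, giving each
    weight's position in the codebook.
    """
    flat = weights.astype(np.float32).ravel()
    # Assumption: centroids start evenly spaced over the weight range.
    codebook = np.linspace(flat.min(), flat.max(), n_clusters, dtype=np.float32)
    for _ in range(n_iters):
        # Assign every weight to its nearest centroid.
        indices = np.abs(flat[:, None] - codebook[None, :]).argmin(axis=1)
        # Move each non-empty centroid to the mean of its members.
        for k in range(n_clusters):
            members = flat[indices == k]
            if members.size:
                codebook[k] = members.mean()
    # Final assignment against the updated centroids.
    indices = np.abs(flat[:, None] - codebook[None, :]).argmin(axis=1)
    return codebook, indices.reshape(weights.shape)


def index_bits(n_clusters):
    """Bits needed to address one of n_clusters codebook entries."""
    return (n_clusters - 1).bit_length()


def weight_traffic_ratio(n_clusters, original_bits=32):
    """Index bits relative to the original weight width (weights only)."""
    return index_bits(n_clusters) / original_bits


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Hypothetical convolution layer; shape and values are illustrative.
    layer = rng.normal(0.0, 0.05, size=(64, 32, 3, 3))
    codebook, idx = cluster_layer_weights(layer, n_clusters=32)
    reconstructed = codebook[idx]  # what the accelerator would compute with
    print("bits per weight index:", index_bits(32), "to", index_bits(256))
    print("max reconstruction error:", float(np.abs(reconstructed - layer).max()))
```

Running the clustering once per layer, rather than once for the whole network, lets each codebook adapt to that layer's weight distribution, which is the per-layer setup the abstract describes.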


research
05/22/2023

TinyissimoYOLO: A Quantized, Low-Memory Footprint, TinyML Object Detection Network for Low Power Microcontrollers

This paper introduces a highly flexible, quantized, memory-efficient, an...
research
06/24/2020

Bit Error Robustness for Energy-Efficient DNN Accelerators

Deep neural network (DNN) accelerators received considerable attention i...
research
04/17/2018

Training a Binary Weight Object Detector by Knowledge Transfer for Autonomous Driving

Autonomous driving has harsh requirements of small model size and energy...
research
09/29/2019

REQ-YOLO: A Resource-Aware, Efficient Quantization Framework for Object Detection on FPGAs

Deep neural networks (DNNs), as the basis of object detection, will play...
research
11/30/2020

Dataflow-Architecture Co-Design for 2.5D DNN Accelerators using Wireless Network-on-Package

Deep neural network (DNN) models continue to grow in size and complexity...
research
12/10/2020

A MAC-less Neural Inference Processor Supporting Compressed, Variable Precision Weights

This paper introduces two architectures for the inference of convolution...
research
09/01/2020

Survey of Machine Learning Accelerators

New machine learning accelerators are being announced and released each ...
