
CascadeCNN: Pushing the Performance Limits of Quantisation in Convolutional Neural Networks

by Alexandros Kouris, et al.
Imperial College London

This work presents CascadeCNN, an automated toolflow that pushes the quantisation limits of any given CNN model, aiming to perform high-throughput inference. The toolflow generates a two-stage architecture tailored to the given CNN-FPGA pair, consisting of a low-precision unit and a high-precision unit arranged in a cascade. A confidence evaluation unit identifies cases misclassified by the excessively low-precision unit and forwards them to the high-precision unit for re-processing. Experiments demonstrate that the proposed toolflow can achieve a performance boost of up to 55% over the baseline design for the same resource budget and accuracy, without retraining the model or accessing the training data.
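The cascade described above can be illustrated with a small software sketch. It is not the toolflow's implementation: it assumes a toy linear classifier in place of a CNN, uniform symmetric weight quantisation, and the gap between the two highest class probabilities as the confidence measure. The abstract specifies none of these details, and the names `quantise`, `confidence`, `cascade_predict` and the `threshold` value are hypothetical.

```python
import math


def quantise(weights, bits):
    """Uniform symmetric quantisation of a weight matrix (assumed scheme)."""
    levels = 2 ** (bits - 1) - 1
    max_abs = max(abs(w) for row in weights for w in row) or 1.0
    scale = max_abs / levels
    return [[round(w / scale) * scale for w in row] for row in weights]


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(weights, x):
    """Toy stand-in for a CNN: one linear layer followed by softmax."""
    logits = [sum(w * xi for w, xi in zip(row, x)) for row in weights]
    return softmax(logits)


def confidence(probs):
    """Assumed confidence measure: top-1 minus top-2 probability."""
    top = sorted(probs, reverse=True)
    return top[0] - top[1]


def cascade_predict(w_low, w_high, x, threshold=0.2):
    """Run the low-precision unit first; forward low-confidence inputs
    to the high-precision unit. Returns (class index, unit used)."""
    probs = classify(w_low, x)
    if confidence(probs) >= threshold:
        return probs.index(max(probs)), "low"
    probs = classify(w_high, x)
    return probs.index(max(probs)), "high"


# Hypothetical 3-class, 2-feature model quantised at two precisions.
WEIGHTS = [[1.0, -0.6], [-0.4, 0.9], [0.1, 0.1]]
W_LOW = quantise(WEIGHTS, bits=4)
W_HIGH = quantise(WEIGHTS, bits=8)

for sample in ([3.0, 0.0], [1.0, 1.2]):
    label, unit = cascade_predict(W_LOW, W_HIGH, sample)
    print(sample, "-> class", label, "via", unit, "precision unit")
```

In this sketch, the throughput benefit would come from most inputs stopping at the cheaper low-precision stage. The threshold trades the fraction of inputs forwarded against the accuracy recovered by the high-precision stage.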




