
CascadeCNN: Pushing the Performance Limits of Quantisation in Convolutional Neural Networks

07/13/2018
by Alexandros Kouris, et al.
Imperial College London

This work presents CascadeCNN, an automated toolflow that pushes the quantisation limits of any given CNN model to perform high-throughput inference. The toolflow generates a two-stage architecture tailored to the given CNN-FPGA pair, consisting of a low-precision unit and a high-precision unit arranged in a cascade. A confidence evaluation unit identifies cases misclassified by the excessively low-precision unit and forwards them to the high-precision unit for re-processing. Experiments demonstrate that the proposed toolflow can achieve a performance boost of up to 55% for VGG-16 and 48% for AlexNet over the baseline design for the same resource budget and accuracy, without retraining the model or accessing the training data.
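The cascade described in the abstract can be illustrated in software with a minimal sketch. This is not the authors' toolflow, which generates FPGA hardware. Everything below is an assumption made for illustration: a toy linear classifier stands in for the CNN, `quantise` is a simple signed fixed-point rounding, the confidence check is a plain top-1 versus top-2 probability margin, and the names `make_linear_classifier`, `is_confident`, `cascade_predict` and the threshold value are hypothetical.

```python
import math
from typing import Callable, List, Sequence, Tuple

Classifier = Callable[[Sequence[float]], List[float]]


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def quantise(value: float, bits: int, scale: float = 1.0) -> float:
    """Round to a signed fixed-point grid of `bits` bits covering [-scale, scale).

    Assumption: a stand-in for whatever quantisation scheme the real units use.
    """
    levels = 2 ** (bits - 1)
    q = round(value / scale * levels)
    q = max(-levels, min(levels - 1, q))  # saturate instead of wrapping
    return q * scale / levels


def make_linear_classifier(weights: Sequence[Sequence[float]],
                           bits: int = None) -> Classifier:
    """Toy classifier (one weight row per class); weights are quantised if `bits` is set."""
    if bits is not None:
        weights = [[quantise(w, bits) for w in row] for row in weights]

    def classify(x: Sequence[float]) -> List[float]:
        logits = [sum(w * xi for w, xi in zip(row, x)) for row in weights]
        return softmax(logits)

    return classify


def is_confident(probs: Sequence[float], threshold: float) -> bool:
    """Hypothetical confidence check: margin between the two largest probabilities.

    Requires at least two classes.
    """
    top1, top2 = sorted(probs, reverse=True)[:2]
    return (top1 - top2) >= threshold


def cascade_predict(x: Sequence[float], low: Classifier, high: Classifier,
                    threshold: float) -> Tuple[int, str]:
    """Try the low-precision unit first; re-process with the high-precision unit if unsure."""
    probs = low(x)
    if is_confident(probs, threshold):
        return probs.index(max(probs)), "low"
    probs = high(x)
    return probs.index(max(probs)), "high"


# Illustrative weights only; not taken from any trained model.
weights = [[0.9, -0.3], [-0.2, 0.8]]
low_unit = make_linear_classifier(weights, bits=4)   # aggressively quantised
high_unit = make_linear_classifier(weights)          # full floating point

for sample in ([3.0, 0.0], [1.0, 1.2]):
    label, unit = cascade_predict(sample, low_unit, high_unit, threshold=0.5)
    print(sample, "-> class", label, "decided by", unit, "precision unit")
```

The throughput gain in such a scheme comes from most inputs stopping at the cheap unit. The threshold trades speed against accuracy: a higher value sends more inputs to the high-precision unit.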

