FINN-R: An End-to-End Deep-Learning Framework for Fast Exploration of Quantized Neural Networks

09/12/2018
by   Michaela Blott, et al.

Convolutional Neural Networks have rapidly become the most successful class of machine learning algorithms, enabling ubiquitous machine vision and intelligent decisions even on embedded computing systems. While the underlying arithmetic is structurally simple, compute and memory requirements are challenging. One promising opportunity is leveraging reduced-precision representations for inputs, activations and model parameters. The resulting scalability in performance, power efficiency and storage footprint provides interesting design compromises in exchange for a small reduction in accuracy. FPGAs are ideal for implementing low-precision inference engines that leverage custom precisions to achieve the required numerical accuracy for a given application. In this article, we describe the second generation of the FINN framework, an end-to-end tool which enables design space exploration and automates the creation of fully customized inference engines on FPGAs. Given a neural network description, the tool optimizes it for a given platform, design target and precision. We introduce formalizations of resource cost functions and performance predictions, and elaborate on the optimization algorithms. Finally, we evaluate a selection of reduced-precision neural networks ranging from CIFAR-10 classifiers to YOLO-based object detection on a range of platforms including PYNQ and AWS F1, demonstrating unprecedented measured throughput of 50 TOp/s on AWS F1 and 5 TOp/s on embedded devices.
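The reduced-precision representations the abstract refers to can be illustrated with a minimal symmetric uniform-quantization sketch. This is not FINN-R's actual implementation; the function name, per-tensor scaling scheme, and bit-widths below are illustrative assumptions only:

```python
import numpy as np

def quantize_uniform(x, num_bits):
    """Symmetric uniform quantization of a tensor to num_bits.

    Illustrative sketch: values are mapped onto signed integer levels
    in [-qmax, qmax] using a per-tensor scale derived from the
    maximum magnitude, then dequantized back to floats.
    """
    qmax = 2 ** (num_bits - 1) - 1        # e.g. 127 for 8 bits, 1 for 2 bits
    scale = np.max(np.abs(x)) / qmax      # per-tensor scale factor
    q = np.clip(np.round(x / scale), -qmax, qmax)
    return q * scale                      # dequantized ("fake-quantized") values

weights = np.array([0.9, -0.45, 0.1, -0.02])
print(quantize_uniform(weights, 2))   # coarse: representable levels are only {-0.9, 0, 0.9}
print(quantize_uniform(weights, 8))   # fine-grained: close to the original values
```

Sweeping `num_bits` in a sketch like this is the essence of the accuracy-versus-cost trade-off the abstract describes: lower bit-widths shrink arithmetic and storage cost on the FPGA at the price of larger quantization error.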


