1. Introduction
The deep neural network (DNN) is a popular learning paradigm that can generalize to tasks from disparate domains while achieving state-of-the-art performance. However, these networks are computationally heavyweight with regard to both compute and memory resources. For example, an outrageously large neural network with 32-bit floating point parameters, such as an LSTM with a mixture of experts (Shazeer et al., 2017), requires approximately 137 billion parameters. To manage the training and batch inference of these networks, hardware accelerators are employed, such as Google's Tensor Processing Unit to decrease latency and increase throughput, embedded and/or reconfigurable devices to mitigate power bottlenecks, or targeted ASICs to optimize the overall performance. A predominant factor contributing to the computational cost is the large footprint of primitives, known as multiply-and-accumulate (MAC) operations, which perform weighted summations of the neuronal inputs. Techniques such as sparsity and low-precision representation (Han et al., 2016; Wu et al., 2018; Chung et al., 2018; Colangelo et al., 2018) have been extensively studied to reduce the cost associated with MACs. For example, substituting 8-bit fixed-point for 32-bit fixed-point when performing inference on CIFAR-10 with AlexNet reduces the energy consumption by 6× (Hashemi et al., 2017). These techniques become a necessity when deploying DNNs on end devices, such as AI on the edge or IoT devices.

Of the methods used to mitigate these constraints, low-precision techniques have shown the most promise. For example, linear and nonlinear quantization have been able to match 32-bit floating point performance with 8-bit fixed-point and 8-bit floating point accelerators (Jouppi et al., 2017; Reagen et al., 2016; Chung et al., 2018). However, quantizing to an ultra-low bit-precision, i.e. 8 bits or fewer, can necessitate an increase in computational complexity. For example, a DNN has to be retrained or the number of hyperparameters significantly increased (Mishra and Marr, 2018) to maintain performance. A more lightweight solution is to perform DNN training and inference in a low-precision numerical format (fixed-point, floating point, or posit (Gysel, 2016)) rather than quantizing a trained network (e.g. one trained with 32-bit floating point). Previous studies have compared DNN inference with low-precision (e.g. 8-bit) to high-precision floating point (e.g. 32-bit) (Hashemi et al., 2017). However, these works compare numerical formats with disparate bit-widths and thereby do not provide a fair, comprehensive study of network efficiency.

The recently proposed posit numerical format offers a wider dynamic range, better accuracy, and improved closure over IEEE-754 floating point (Gustafson and Yonemoto, 2017). Fig. 1 gives the intuition that a natural posit distribution (e.g., an 8-bit posit) may be an optimal fit for representing DNN parameters (e.g., those of ConvNet). In this work, we investigate the effectiveness of ultra-low precision posits for DNN inference. The designs of several multiply-and-accumulate units for the posit, fixed-point, and floating point formats at low precision are analyzed for resource utilization, latency, power consumption, and energy-delay product. We carry out various classification tasks and compare the trade-offs between accuracy degradation and hardware efficacy. Our results indicate that posits outperform at ultra-low precision and can be realized at a similar cost to floating point in DNN accelerators.
2. Related Work
Since the late 1980s, low-precision fixed-point and floating point computation have been studied (Iwata et al., 1989; Hammerstrom, 1990). In recent years, research attention has turned towards deep learning applications. Multiple groups have demonstrated that 16-bit fixed-point DNNs can perform inference with trivial degradation in performance
(Courbariaux et al., 2014; Bengio, 2013). However, most of these works study DNN inference at varying bit-precisions. There is a need for a fairer comparison between different number formats of the same bit-width paired with FPGA soft cores. For instance, Hashemi et al. analyze 32-bit fixed-point and 32-bit floating point DNN inference on three DNN architectures (LeNet, ConvNet, and AlexNet) and show that fixed-point reduces the energy consumption by 12% while suffering a mere 0–1% accuracy drop (Hashemi et al., 2017). Recently, Chung et al. proposed a DNN accelerator (Brainwave) that increases inference throughput within a Stratix 10 FPGA by 3× by substituting 8-bit msfp8, a novel spatial floating point format, in place of 8-bit fixed-point (Chung et al., 2018).

Several groups have previously studied the usage of the posit format in DNNs. Langroudi et al. study the efficacy of posit representations of DNN parameters and activations (Langroudi et al., 2018). The work demonstrates that DNN inference using 7-bit posits endures less than 1% accuracy degradation on ImageNet classification using AlexNet, and that posits have a 30% smaller memory footprint than fixed-point for multiple DNNs while maintaining less than a 1% drop in accuracy. Cococcioni et al. review the effectiveness of posits for autonomous driving functions (Cococcioni et al., 2018). A discussion of a posit processing unit as an alternative to a floating point processing unit develops into an argument for posits, as they exhibit a better trade-off between accuracy and implementation complexity. Most recently, Johnson proposed a log float format which couples posits with a logarithmic EMAC operation referred to as exact log-linear multiply-add (ELMA) (Johnson, 2018). Use of this novel format within ResNet-50 incurs about 1% accuracy deterioration for ImageNet classification, and the ELMA shows much lower power consumption than IEEE-754 floating point.

In this work, we demonstrate that posit arithmetic at ultra-low bit-width is an innate fit for DNN inference. The EMAC-equipped, parameterized Deep Positron architecture is mounted on an FPGA soft processor and is used to carefully compare the fixed-point, floating point, and posit formats at the same bit-width.
3. Background
3.1. Deep Neural Networks
The DNN is a connectionist, predictive model used commonly for classification and regression. These networks learn a nonlinear input-to-output mapping in a supervised, unsupervised, or semi-supervised manner. Before it can perform inference, a DNN is trained via backpropagation to minimize a cost function by updating its parameters, called weights and biases. Customarily, either 16-bit or 32-bit floating point arithmetic is used for DNN inference. However, the 32-bit IEEE-754 floating point representation maintains a massive dynamic range of over 80 decades, far beyond what DNNs require. Such a numerical distribution therefore yields low information-per-bit in the sense of Shannon maximum entropy (Shannon, 1948). 16-bit floating point, often present in NVIDIA accelerators, unveils the format's limitations: nontrivial exception cases, underflow and overflow to infinity or zero, and redundant NaN and zero representations. Posit arithmetic offers an elegant solution to these limitations at a generic bit-width.

3.2. Posit Numerical Format
The posit numerical format, a Type III unum, was proposed to improve upon the deficiencies of the IEEE-754 floating point format and to address complaints about Type I and II unums (Gustafson and Yonemoto, 2017; Tichy, 2016). The posit format offers better dynamic range, accuracy, and program reproducibility than IEEE floating point. A posit number comprises n bits, of which es are exponent bits that control the dynamic range. The primary divergence posit takes from floating point is the introduction of a signed, run-length encoded regime bit-field. The longer this field, the larger the magnitude but the lower the precision of the represented number, and vice versa for shorter run-lengths. Two posit bit-strings are reserved: all zeros for zero and a one followed by all zeros for "Not a Real," which can denote infinity, division by zero, etc. A posit bit-string is interpreted as a sign bit, followed by the regime bits, then up to es exponent bits, with any remaining bits forming the fraction.
The numerical value a posit represents is then given by (1):

$$x = (-1)^{s} \times \left(2^{2^{es}}\right)^{k} \times 2^{e} \times (1 + f) \qquad (1)$$

where $k$ is the regime, $e$ is the unsigned exponent ($0 \leq e \leq 2^{es} - 1$), and $f$ is the value of the fraction bits. If a posit number is negative, the 2's complement is taken before decoding. We recommend reviewing (Gustafson and Yonemoto, 2017) for a more thorough introduction and intuition to the posit format.
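As a concrete illustration of (1), the following is a minimal Python sketch (not the Deep Positron hardware) that decodes an n-bit posit bit-string with es exponent bits into its value; the function name and structure are illustrative only.

```python
def decode_posit(bits: int, n: int, es: int) -> float:
    """Decode an n-bit posit with es exponent bits into a Python float (sketch)."""
    mask = (1 << n) - 1
    bits &= mask
    if bits == 0:
        return 0.0                            # all zeros encodes zero
    if bits == 1 << (n - 1):
        return float("nan")                   # 100...0 encodes "Not a Real"
    sign = bits >> (n - 1)
    if sign:                                  # negative posits decode from the 2's complement
        bits = (-bits) & mask
    body = bits & ((1 << (n - 1)) - 1)        # drop the sign bit
    # Decode the run-length encoded regime.
    regime_bit = (body >> (n - 2)) & 1
    run = 0
    for i in range(n - 2, -1, -1):
        if (body >> i) & 1 == regime_bit:
            run += 1
        else:
            break
    k = run - 1 if regime_bit else -run
    # Bits remaining after the regime and its terminating bit: exponent, then fraction.
    rem = max(n - 2 - run, 0)
    tail = body & ((1 << rem) - 1) if rem else 0
    e_bits = min(es, rem)
    e = (tail >> (rem - e_bits)) << (es - e_bits) if e_bits else 0
    f_bits = rem - e_bits
    f = (tail & ((1 << f_bits) - 1)) / (1 << f_bits) if f_bits else 0.0
    useed = 2 ** (2 ** es)                    # scale factor base from (1)
    value = (useed ** k) * (2 ** e) * (1.0 + f)
    return -value if sign else value

# Example: the 8-bit, es = 0 posit 0b01000000 decodes to 1.0.
print(decode_posit(0b01000000, n=8, es=0))
```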
4. Methodology
We build off of (Carmichael et al., 2019), using the proposed Deep Positron architecture. The framework is parameterized by bit-width, numerical type, and DNN hyperparameters, so networks of arbitrary width and depth can be constructed for the fixed-point, floating point, and posit formats. The following sections further describe the EMAC operation and detail the EMAC algorithms for each numerical format.
4.1. Exact Multiply-and-Accumulate (EMAC)
The multiply-and-accumulate (MAC) operation is ubiquitous within DNNs – each neuron computes a weighted sum of its inputs. In most implementations, this operation is inexact, meaning rounding or truncation results in an accumulation of error. The EMAC mitigates this issue by implementing a variant of the Kulisch accumulator (Kulisch, 2013) and deferring rounding until every product of a layer has been accumulated. This minimization of local error becomes substantial at low precision. In each EMAC module, a wide register accumulates fixed-point values and rounds in a deferred stage. For $k$ multiplications, the accumulator width is computed using (2):

$$w_{acc} = \left\lceil \log_2(k) \right\rceil + 2\left\lceil \log_2\!\left(\frac{\mathrm{max}}{\mathrm{min}}\right) \right\rceil + 2 \qquad (2)$$

where max and min are the maximum and minimum value magnitudes for a given numerical system, respectively. Each EMAC is pipelined into three stages: multiplication, accumulation, and rounding. A fourth stage, implementing the activation function (e.g., ReLU), is present for hidden layer neurons. For further introduction to EMACs and the exact dot product, we recommend reviewing (Kulisch, 2013; Carmichael et al., 2019).

4.2. Fixed-Point EMAC
We parameterize the fixed-point EMAC by n, the bit-width, and f, the number of fractional bits, where f < n. Fig. 2 shows the block diagram design of the EMAC with signal bit-widths indicated. The functionality of the unit is described by Algorithm 1. The general characteristics of a fixed-point number follow directly from n and f: the smallest representable nonzero magnitude is $2^{-f}$ and the largest is $2^{n-1-f} - 2^{-f}$.
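To make the deferred-rounding idea concrete, the following is a minimal Python sketch of a fixed-point EMAC under this parameterization; the function name, the saturating final stage, and the round-half-up shortcut are illustrative assumptions rather than the exact hardware behavior (which rounds to nearest even).

```python
def fixed_point_emac(weights, activations, n=8, f=4):
    """Sketch of an exact fixed-point MAC: operands are n-bit signed fixed-point
    values with f fractional bits, stored as integers scaled by 2**f. Products
    are accumulated exactly in a wide register; rounding happens only once."""
    acc = 0
    for w, a in zip(weights, activations):
        # Each product is exact in 2n bits (2f fractional bits); the wide
        # accumulator also carries ceil(log2(k)) guard bits for k additions.
        acc += w * a
    # Deferred rounding: drop the extra f fractional bits.
    result = (acc + (1 << (f - 1))) >> f
    # Saturate to the representable n-bit signed range.
    lo, hi = -(1 << (n - 1)), (1 << (n - 1)) - 1
    return max(lo, min(hi, result))

# Example with Q3.4 operands (scale 2**4): 0.5*2.0 + 1.25*1.0 = 2.25 -> 36/16.
print(fixed_point_emac([8, 20], [32, 16], n=8, f=4))   # prints 36
```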
4.3. Floating Point EMAC
The floating point EMAC is parameterized by we, the number of exponent bits, and wf, the number of fractional bits. As all inputs and intermediate values in Deep Positron are real-valued, we do not consider "Not a Number" (NaN) or "±Infinity" in this implementation. Fig. 3 shows the floating point EMAC block diagram with labeled bit-widths of signals. A leading-zeros detector (LZD) is used in converting from fixed-point back to floating point. The EMAC functionality is expressed in Algorithm 2, and the relevant characteristics of the floating point format (e.g., the exponent bias and the maximum and minimum value magnitudes) are computed from we and wf.
Algorithm 2: Floating point EMAC operation. The listing proceeds through the stages: subnormal detection, multiplication, conversion to fixed-point, accumulation, and conversion back to floating point.¹

¹ Overflow handling during the conversion back to floating point is omitted for simplicity.
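To illustrate the middle stages of Algorithm 2, here is a minimal Python sketch, under assumed (we, wf) field widths, of how each product can be converted to fixed-point and accumulated exactly; the encoding back to floating point (and its overflow handling) is omitted, and all names are illustrative.

```python
def float_emac_accumulate(operand_pairs, we=4, wf=3):
    """Sketch of the multiply / convert-to-fixed-point / accumulate stages of a
    floating point EMAC. Each operand is a (sign, exponent_field, fraction_field)
    triple with a we-bit exponent and wf-bit fraction; subnormals are handled,
    NaN/Infinity are not (they never arise in Deep Positron)."""
    bias = (1 << (we - 1)) - 1
    acc = 0                                   # wide, Kulisch-style fixed-point register
    for (sa, ea, fa), (sb, eb, fb) in operand_pairs:
        # Subnormal detection: a zero exponent field means no hidden bit and
        # an effective exponent of 1 - bias.
        ma = fa if ea == 0 else (1 << wf) | fa
        mb = fb if eb == 0 else (1 << wf) | fb
        exp = (ea if ea else 1) + (eb if eb else 1) - 2 * bias
        prod = ma * mb                        # exact significand product
        # Conversion to fixed-point: shift by the product exponent; the 2*bias
        # offset keeps the shift non-negative even for the smallest products.
        acc += (-1) ** (sa ^ sb) * (prod << (exp + 2 * bias))
    # acc now equals the exact sum of products times 2**(2*wf + 2*bias);
    # converting back to floating point would use a leading-zeros detector and rounding.
    return acc
```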
4.4. Posit EMAC
The posit EMAC, shown in Fig. 4, is parameterized by n, the bit-width, and es, the number of exponent bits. In this implementation, we do not consider "Not a Real," as all DNN parameters and data are real-valued and posits do not overflow to infinity. Algorithm 3 describes the data extraction process for each EMAC input, which is more involved due to the dynamic-length regime. The EMAC employs this process as outlined in Algorithm 4. The relevant attributes of a given posit format are calculated from n and es: $useed = 2^{2^{es}}$, $\mathrm{max} = useed^{\,n-2}$, and $\mathrm{min} = useed^{\,-(n-2)}$, where useed can be thought of as the scale factor base, as shown in (1).
Algorithm 4: Posit EMAC operation for n-bit inputs, each with es exponent bits. The listing proceeds through the stages: multiplication, accumulation, fraction & scale factor (SF) extraction, and convergent rounding & encoding.
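The following minimal Python sketch mirrors the multiplication and accumulation stages of Algorithm 4 for operands already decoded (per Algorithm 3) into sign, scale factor, and significand; the fraction & SF extraction and convergent rounding stages that re-encode the result as a posit are omitted, and all names are illustrative.

```python
def posit_emac_accumulate(decoded_pairs, n=8, es=0):
    """Sketch of the posit EMAC multiply and accumulate stages. Each operand is a
    (sign, scale_factor, significand) triple where scale_factor = k * 2**es + e
    from (1) and significand is the value 1.f stored as an integer scaled by 2**frac_bits."""
    frac_bits = n - es - 3                    # maximum number of posit fraction bits
    sf_bound = (n - 1) * (1 << es)            # |scale factor| never exceeds this
    acc = 0                                   # wide fixed-point (quire-like) register
    for (sa, sfa, ma), (sb, sfb, mb) in decoded_pairs:
        prod = ma * mb                        # exact significand product
        sf = sfa + sfb                        # scale factors add under multiplication
        # Align the product into the accumulator; the 2*sf_bound offset keeps
        # the shift amount non-negative.
        acc += (-1) ** (sa ^ sb) * (prod << (sf + 2 * sf_bound))
    # acc equals the exact dot product times 2**(2*frac_bits + 2*sf_bound).
    return acc
```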
5. Experimental Results
In all experiments, we synthesize the EMACs onto a Virtex-7 FPGA (xc7vx485t-2ffg1761c) using Vivado 2017.2, and expand upon the results from (Carmichael et al., 2019). With regard to energy and latency, the posit EMAC is competitive with the floating point EMAC. While it uses more resources at the same bit-precision, the posit format offers a wider dynamic range at fewer bits while maintaining a faster maximum operational frequency. Moreover, the energy-delay products of the floating point and posit EMACs are comparable. The fixed-point EMAC, as expected, is uncontested in resource utilization and latency; its lack of an exponential parameter results in a far narrower accumulation register. However, fixed-point offers poor dynamic range compared with the other formats at the same bit-precision.
Table 1. Deep Positron inference accuracy on five datasets; the parenthesized value is the best-performing format parameter found in the sweep.

| Dataset | Inference Size | Posit Acc. | Floating Point Acc. | Fixed-Point Acc. | 32-bit Float Acc. |
|---|---|---|---|---|---|
| WI Breast Cancer (Street et al., 1993) | 190 | 85.9% (2) | 77.4% (4) | 57.8% (5) | 90.1% |
| Iris (Fisher, 1936) | 50 | 98.0% (1) | 96.0% (3) | 92.0% (4) | 98.0% |
| Mushroom (Schlimmer, 1987) | 2,708 | 96.4% (1) | 96.4% (4) | 95.9% (5) | 96.8% |
| MNIST (LeCun, 1998) | 10,000 | 98.5% (1) | 98.4% (4) | 98.3% (5) | 98.5% |
| Fashion MNIST (Xiao et al., 2017) | 10,000 | 89.6% (1) | 89.6% (4) | 89.2% (4) | 89.5% |
The term "dense" is synonymous with a fully-connected feedforward layer in a DNN.
The quantization error of a tensor is computed as the mean-squared error, as shown in (3):

$$\mathrm{MSE} = \frac{1}{N}\sum_{i=1}^{N} \left(x_i - \hat{x}_i\right)^2 \qquad (3)$$

where $x_i$ are the original 32-bit floating point values, $\hat{x}_i$ their quantized counterparts, and $N$ the number of elements in the tensor.
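As a small illustration (not the authors' evaluation code), the per-layer error in (3) can be computed as below; the 8-bit fixed-point quantizer in the example is an assumption chosen only for demonstration.

```python
import numpy as np

def quantization_mse(x: np.ndarray, quantize) -> float:
    """Mean-squared quantization error of a tensor, per (3); `quantize` maps a
    scalar to its nearest representable value in the target format."""
    x_hat = np.vectorize(quantize)(x.astype(np.float64))
    return float(np.mean((x - x_hat) ** 2))

# Example: an 8-bit fixed-point quantizer with 6 fractional bits (Q1.6).
q8_6 = lambda v: np.clip(np.round(v * 64), -128, 127) / 64
weights = 0.1 * np.random.randn(256, 128).astype(np.float32)
print(quantization_mse(weights, q8_6))
```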
Fig. 5 shows a layer-wise heatmap of the quantization error for each format on the MNIST and Fashion MNIST classification tasks. It is clear that posits suffer the least from quantization, which is especially noticeable at 5-bit precision.
We evaluate the inference accuracy of several feedforward three- or four-layer neural networks, instantiated on the Deep Positron accelerator, on five datasets. The baseline results are taken from networks trained and evaluated using standard IEEE-754 floating point at 32-bit precision. The inputs and weights of the trained networks are quantized from the 32-bit floating point format to the desired numerical format (posit, floating point, or fixed-point at the target bit-width) via round-to-nearest with ties to even. The best performance is selected for each bit-width by sweeping the es parameter for posit, the exponent-bit width for floating point, and the fraction-bit width for fixed-point, as sketched below. Across all tasks, posit either outperforms or matches the performance of fixed-point and floating point, as shown in Table 1. In some cases, an 8-bit posit matches the performance of the 32-bit floating point baseline. An interesting result is that both posit and floating point at 8-bit precision improve upon the baseline performance for the Fashion MNIST task.
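A hedged sketch of this selection procedure: quantize the trained 32-bit weights into each candidate format and keep the parameterization with the highest accuracy. The dictionary-of-quantizers interface and function names are illustrative assumptions, not the authors' tooling.

```python
def select_best_format(weights, candidate_quantizers, evaluate):
    """Sweep candidate low-precision formats (e.g. posit es values, floating point
    exponent/fraction splits, fixed-point Q-points at one bit-width), quantize a
    trained 32-bit float network with each, and keep the most accurate format.
    `candidate_quantizers` maps a format name to an element-wise quantizer;
    `evaluate` runs inference with the quantized weights and returns accuracy."""
    best_name, best_acc = None, float("-inf")
    for name, quantize in candidate_quantizers.items():
        quantized = {layer: quantize(w) for layer, w in weights.items()}
        acc = evaluate(quantized)
        if acc > best_acc:
            best_name, best_acc = name, acc
    return best_name, best_acc
```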
We compare energy, delay, and the energy-delay product against the average Deep Positron performance across all formats and bit-precisions. Figs. 6 and 7 depict the average accuracy degradation across the five classification tasks against these metrics for each bit-width. Posit consistently outperforms, at a slight cost in power. Fixed-point maintains the lowest delay across all bit-widths, as expected, but offers the worst performance. While the floating point EMAC generally uses less power than the posit EMAC, the posit EMAC enjoys lower latencies across all bit-widths whilst maintaining lower accuracy degradation.
Table 2. Comparison with other posit hardware implementations.

| Design | (Jaiswal and So, 2018b) | (Chaurasiya et al., 2018) | (Podobas and Matsuoka, 2018) | (Chen et al., 2018) | (Lehóczky et al., 2018) | (Johnson, 2018) | This Work |
|---|---|---|---|---|---|---|---|
| Device | Virtex-6 FPGA / ASIC | Zynq-7000 SoC / ASIC | Stratix V GX 5SGXA7 FPGA | Virtex-7 VX690 & UltraScale+ VU3P FPGAs | Artix-7 FPGA | ASIC | Virtex-7 (xc7vx485t-2ffg1761c) FPGA |
| Task | — | FIR Filter | — | — | — | Image Classification | Image Classification |
| Dataset | — | — | — | — | — | ImageNet | WI Breast Cancer, Iris, Mushroom, MNIST, Fashion MNIST |
| Bit-precision | All | All | All | 32 | All | All, emphasis on 8 | All, emphasis on ≤8 |
| Operations | Mul, Add/Sub | Mul, Add/Sub | Mul, Add/Sub | Quire | Quire | Quire | Quire |
| Programming Language | Verilog | Verilog | C++ / OpenCL | Verilog | C# | OpenCL | VHDL |
| Technology Node | 40 nm / 90 nm | 28 nm / 90 nm | 28 nm | 28 nm / 20 nm | 28 nm | 28 nm | 28 nm |
5.1. Exploiting the Posit es Parameter
Experimental results in this paper are evaluated for posit formats with several es settings across five data sets. As shown in Fig. 6, the energy-delay product of the posit EMAC depends on the es parameter: on average, the most efficient setting has an energy-delay product roughly 3× and 1.4× lower than the other two settings considered. On the other hand, the setting with the best average DNN inference accuracy across the five datasets and bit-precisions outperforms the other two by roughly 2% and 4%. Thus, Deep Positron equipped with a posit EMAC of intermediate es offers a better trade-off between energy-delay product and accuracy at the lower bit-widths. For 8-bit, the results suggest choosing es by application: the most efficient setting for energy-constrained applications and the most accurate setting for accuracy-critical applications.
5.2. Comparison with Other Posit Hardware Implementations
A summary of previous studies that design posit arithmetic hardware is shown in Table 2. Several groups implement basic posit arithmetic, such as addition, subtraction, multiplication, and the exact dot product (quire), on FPGAs for various applications (Jaiswal and So, 2018b, a; Chaurasiya et al., 2018; Chen et al., 2018; Podobas and Matsuoka, 2018; Lehóczky et al., 2018; Johnson, 2018). Jaiswal and So provide a hardware generator for posit addition, subtraction, and multiplication and show reduced latency and area consumption of 32-bit posit addition over IEEE-754 floating point addition (Jaiswal and So, 2018b, a). However, the comparison is between two different FPGA platforms, which diminishes its merit. They also ignore several characteristic requirements of posit arithmetic, such as round-to-nearest with ties to even (unbiased rounding). To better realize the advantages of posit arithmetic over IEEE-754 floating point with complete posit arithmetic features, Chaurasiya et al. proposed a parameterized posit arithmetic hardware generator (Chaurasiya et al., 2018). They emphasize that the resource utilization and energy of the posit arithmetic unit are comparable with IEEE-754 float when the same number of bits is considered for both formats, while the area consumption of the posit hardware is less than IEEE-754 float at similar precision and dynamic range. To simplify and expedite hardware design, as well as improve the usability of posits on heterogeneous platforms, researchers in (Lehóczky et al., 2018) and (Podobas and Matsuoka, 2018) use high-level languages, such as C# and OpenCL, to generate posit arithmetic hardware for FPGAs.
Most of the previous works do not support the exact-dot-product operation and do not design specialized posit arithmetic for deep learning applications as we present in this paper. In (Carmichael et al., 2019), a parameterized FPGA-mounted DNN accelerator is constructed which employs exact-dot-product algorithms for the posit, fixed-point, and floating point formats. The paper shows strong preliminary results that posits are a natural fit for low-precision inference. Subsequent to this work, Johnson proposed an exact log-linear multiply-add arithmetic algorithm for deep learning applications using a posit multiplier in the log domain and a Kulisch adder (Johnson, 2018). The results indicate better performance of 8-bit posit multiply-add over 8-bit fixed-point multiply-add, with similar accuracy, for the ResNet-50 neural network on the ImageNet dataset. However, that paper targets an ASIC platform and a convolutional neural network at 8-bit precision, whereas we study an FPGA implementation and fully-connected neural networks at multiple ultra-low bit-precisions.

6. Conclusions
We demonstrate that the recent posit numerical system has a high affinity for deep neural network inference at 8-bit precision. The proposed posit hardware is shown to be competitive with its floating point counterpart in terms of resource utilization and energy-delay product. Moreover, the posit EMAC offers a superior maximum operating frequency over that of floating point. With regard to performance degradation, direct quantization to ultra-low precision heavily favors posits, which vastly surpass fixed-point. Moreover, the performance of floating point is consistently matched or surpassed by posits across multiple datasets. The success of prospective new classes of learning algorithms will be contingent in part on the underlying hardware.
References
 Bengio (2013) Bengio, Y. 2013. Deep Learning of Representations: Looking Forward. In Statistical Language and Speech Processing, Adrian-Horia Dediu, Carlos Martín-Vide, Ruslan Mitkov, and Bianca Truthe (Eds.). Springer Berlin Heidelberg, Berlin, Heidelberg, 1–37.
 Carmichael et al. (2019) Carmichael, Z. et al. 2019. Deep Positron: A Deep Neural Network Using the Posit Number System. In Design, Automation & Test in Europe (DATE) Conference & Exhibition. IEEE.
 Chaurasiya et al. (2018) Chaurasiya, R. et al. 2018. Parameterized Posit Arithmetic Hardware Generator. In 2018 IEEE 36th International Conference on Computer Design (ICCD). IEEE, 334–341.
 Chen et al. (2018) Chen, J. et al. 2018. A matrix-multiply unit for posits in reconfigurable logic leveraging (open)CAPI. In Proceedings of the Conference for Next Generation Arithmetic. ACM, 1.
 Chung et al. (2018) Chung, E. et al. 2018. Serving DNNs in Real Time at Datacenter Scale with Project Brainwave. IEEE Micro 38, 2 (2018), 8–20.
 Cococcioni et al. (2018) Cococcioni, M. et al. 2018. Exploiting Posit Arithmetic for Deep Neural Networks in Autonomous Driving Applications. In 2018 International Conference of Electrical and Electronic Technologies for Automotive. 1–6. https://doi.org/10.23919/EETA.2018.8493233
 Colangelo et al. (2018) Colangelo, P. et al. 2018. Exploration of Low Numeric Precision Deep Learning Inference Using Intel® FPGAs. 2018 IEEE 26th Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM) (2018), 73–80.
 Courbariaux et al. (2014) Courbariaux, M. et al. 2014. Low precision arithmetic for deep learning. CoRR abs/1412.7024 (2014).
 Fisher (1936) Fisher, R.A. 1936. The use of multiple measurements in taxonomic problems. Annals of eugenics 7, 2 (1936), 179–188.
 Gustafson and Yonemoto (2017) Gustafson, J.L. and Yonemoto, I.T. 2017. Beating Floating Point at its Own Game: Posit Arithmetic. Supercomputing Frontiers and Innovations 4, 2 (2017), 71–86.
 Gysel (2016) Gysel, P. 2016. Ristretto: Hardware-Oriented Approximation of Convolutional Neural Networks. CoRR abs/1605.06402 (2016).
 Hammerstrom (1990) Hammerstrom, D. 1990. A VLSI architecture for high-performance, low-cost, on-chip learning. In 1990 IJCNN International Joint Conference on Neural Networks. 537–544 vol. 2. https://doi.org/10.1109/IJCNN.1990.137621
 Han et al. (2016) Han, S. et al. 2016. Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding. ICLR (2016), 1–13.
 Hashemi et al. (2017) Hashemi, S. et al. 2017. Understanding the impact of precision quantization on the accuracy and energy of neural networks. In Design, Automation & Test in Europe Conference & Exhibition (DATE), 2017. 1474–1479. https://doi.org/10.23919/DATE.2017.7927224
 Iwata et al. (1989) Iwata, A. et al. 1989. An artificial neural network accelerator using general purpose 24 bits floating point digital signal processors. In IJCNN, Vol. 2. 171–182.
 Jaiswal and So (2018a) Jaiswal, M.K. and So, H.K.H. 2018a. Architecture generator for type-3 unum posit adder/subtractor. In 2018 IEEE International Symposium on Circuits and Systems (ISCAS). IEEE, 1–5.
 Jaiswal and So (2018b) Jaiswal, M.K. and So, H.K.H. 2018b. Universal number posit arithmetic generator on FPGA. In 2018 Design, Automation & Test in Europe Conference & Exhibition (DATE). IEEE, 1159–1162.
 Johnson (2018) Johnson, J. 2018. Rethinking floating point for deep learning. arXiv preprint arXiv:1811.01721 (2018).
 Jouppi et al. (2017) Jouppi, N.P. et al. 2017. In-Datacenter Performance Analysis of a Tensor Processing Unit. (2017), 1–17.
 Kulisch (2013) Kulisch, U. 2013. Computer arithmetic and validity: theory, implementation, and applications. Vol. 33. Walter de Gruyter.
 Langroudi et al. (2018) Langroudi, S.H.F. et al. 2018. Deep Learning Inference on Embedded Devices: Fixed-Point vs Posit. In 2018 1st Workshop on Energy Efficient Machine Learning and Cognitive Computing for Embedded Applications (EMC2). 19–23. https://doi.org/10.1109/EMC2.2018.00012
 LeCun (1998) LeCun, Y. 1998. The MNIST database of handwritten digits. http://yann.lecun.com/exdb/mnist/ (1998).
 Lehóczky et al. (2018) Lehóczky, Z. et al. 2018. High-level .NET software implementations of unum type I and posit with simultaneous FPGA implementation using Hastlayer. In Proceedings of the Conference for Next Generation Arithmetic. ACM, 4.
 Mishra and Marr (2018) Mishra, A. and Marr, D. 2018. WRPN & Apprentice: Methods for Training and Inference using Low-Precision Numerics. arXiv preprint arXiv:1803.00227 (2018).
 Podobas and Matsuoka (2018) Podobas, A. and Matsuoka, S. 2018. Hardware Implementation of POSITs and Their Application in FPGAs. In 2018 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW). IEEE, 138–145.
 Reagen et al. (2016) Reagen, B. et al. 2016. Minerva: Enabling low-power, highly-accurate deep neural network accelerators. In Proceedings of the 43rd International Symposium on Computer Architecture. IEEE Press, 267–278.
 Schlimmer (1987) Schlimmer, J.C. 1987. Concept acquisition through representational adjustment. (1987).
 Shannon (1948) Shannon, C.E. 1948. A mathematical theory of communication. Bell Syst. Tech. J. 27 (1948), 623–656.
 Shazeer et al. (2017) Shazeer, N. et al. 2017. Outrageously large neural networks: The sparsely-gated mixture-of-experts layer. arXiv preprint arXiv:1701.06538 (2017).
 Street et al. (1993) Street, W.N. et al. 1993. Nuclear feature extraction for breast tumor diagnosis. In Biomedical Image Processing and Biomedical Visualization, Vol. 1905. International Society for Optics and Photonics, 861–871.
 Tichy (2016) Tichy, W. 2016. Unums 2.0: An Interview with John L. Gustafson. Ubiquity 2016, September (2016), 1.
 Wu et al. (2018) Wu, S. et al. 2018. Training and inference with integers in deep neural networks. arXiv preprint arXiv:1802.04680 (2018).
 Xiao et al. (2017) Xiao, H. et al. 2017. Fashion-MNIST: a novel image dataset for benchmarking machine learning algorithms. arXiv preprint arXiv:1708.07747 (2017).