NullaNet Tiny: Ultra-low-latency DNN Inference Through Fixed-function Combinational Logic

04/07/2021
by Mahdi Nazemi, et al.

While there is a large body of research on efficient processing of deep neural networks (DNNs), ultra-low-latency realization of these models for applications with stringent, sub-microsecond latency requirements remains an unresolved, challenging problem. Field-programmable gate array (FPGA)-based DNN accelerators are gaining traction as serious contenders to replace platforms based on graphics processing units (GPUs) and central processing units (CPUs), owing to their performance, flexibility, and energy efficiency. This paper presents NullaNet Tiny, an across-the-stack design and optimization framework for constructing resource- and energy-efficient, ultra-low-latency FPGA-based neural network accelerators. The key idea is to replace the expensive operations required to compute the various filter/neuron functions in a DNN, such as multiply-and-accumulate and batch normalization, with Boolean logic expressions that are mapped to the native look-up tables (LUTs) of the FPGA device. At about the same level of classification accuracy, our design achieves 2.36× lower latency and 24.42× lower LUT utilization than Xilinx's LogicNets.
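
The key idea can be illustrated with a minimal Python sketch. It is not the authors' implementation: the weights, the bias, the fan-in of six, and the sign-threshold activation are all hypothetical choices made for illustration, and batch normalization is assumed to have been folded into the bias. The sketch enumerates every input pattern of a small binarized neuron once, so that inference becomes a single table lookup, which is the software analogue of a fixed Boolean function held in an FPGA LUT.

```python
from itertools import product

def neuron(bits, weights, bias):
    """Reference arithmetic: multiply-and-accumulate over inputs encoded
    as -1/+1, followed by a sign threshold that yields one output bit."""
    acc = sum(w * (1 if b else -1) for w, b in zip(weights, bits)) + bias
    return 1 if acc >= 0 else 0

def build_truth_table(weights, bias):
    """Evaluate the neuron on all 2**k input patterns, once and offline."""
    k = len(weights)
    return {bits: neuron(bits, weights, bias)
            for bits in product((0, 1), repeat=k)}

# Hypothetical parameters; any batch-normalization scale and shift are
# assumed to be already folded into the bias.
weights = [0.7, -1.2, 0.4, 0.9, -0.3, 0.5]
bias = -0.1

table = build_truth_table(weights, bias)

def lut_neuron(bits):
    """Inference by lookup only: no multiplications or additions."""
    return table[tuple(bits)]
```

With six inputs the table has 64 entries. Because the table grows as 2^k in the fan-in k, an approach of this kind is only practical when each neuron sees few, low-precision inputs.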

research
09/05/2023

PolyLUT: Learning Piecewise Polynomials for Ultra-Low Latency FPGA LUT-based Inference

Field-programmable gate arrays (FPGAs) are widely used to implement deep...
research
05/11/2021

3U-EdgeAI: Ultra-Low Memory Training, Ultra-Low Bitwidth Quantization, and Ultra-Low Latency Acceleration

The deep neural network (DNN) based AI applications on the edge require ...
research
04/13/2023

Algorithms and Hardware for Efficient Processing of Logic-based Neural Networks

Recent efforts to improve the performance of neural network (NN) acceler...
research
01/12/2017

Scaling Binarized Neural Networks on Reconfigurable Logic

Binarized neural networks (BNNs) are gaining interest in the deep learni...
research
12/04/2021

Logic Shrinkage: Learned FPGA Netlist Sparsity for Efficient Neural Network Inference

FPGA-specific DNN architectures using the native LUTs as independently t...
research
02/13/2021

Voltage Scaling for Partitioned Systolic Array in A Reconfigurable Platform

The exponential emergence of Field Programmable Gate Array (FPGA) has ac...
