Algorithms and Hardware for Efficient Processing of Logic-based Neural Networks

04/13/2023
by Jingkai Hong, et al.

Recent efforts to improve the performance of neural network (NN) accelerators that meet today's application requirements have given rise to a new trend of logic-based NN inference relying on fixed-function combinational logic (FFCL). This paper presents an innovative optimization methodology for compiling and mapping NNs utilizing FFCL into a logic processor. The presented method maps FFCL blocks to a set of Boolean functions in which the Boolean operations are mapped to high-performance, low-latency, parallelized processing elements. Graph partitioning and scheduling algorithms are presented to handle FFCL blocks that cannot straightforwardly fit into the logic processor. Our experimental evaluations across several datasets and NNs demonstrate the superior performance of our framework in terms of inference throughput compared to prior-art NN accelerators. We achieve 25x higher throughput than an XNOR-based accelerator for the VGG16 model, which can be amplified a further 5x by deploying the proposed graph partitioning and merging algorithms.
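To make the mapping idea concrete, below is a minimal, hypothetical sketch of one way an FFCL block could be modeled as a DAG of Boolean operations and level-scheduled onto a fixed number of parallel processing elements (PEs). The `Gate` class, `schedule_levels` function, and level-packing heuristic are illustrative assumptions, not the paper's actual compiler or scheduling algorithm.

    # Hypothetical sketch: level-scheduling an FFCL block, modeled as a DAG of
    # Boolean operations, onto num_pes parallel processing elements.
    # All names here are illustrative; this is not the paper's implementation.
    from collections import defaultdict

    class Gate:
        """One Boolean operation in the FFCL graph (e.g., AND, OR, XOR, NOT)."""
        def __init__(self, gid, op, fanin):
            self.gid = gid      # unique gate id
            self.op = op        # Boolean operator name
            self.fanin = fanin  # ids of gates (or primary inputs) feeding this gate

    def schedule_levels(gates, num_pes):
        """Topologically level-order the gate DAG, then pack each level into
        groups of at most num_pes gates that can execute in parallel."""
        level = {}

        def depth(g):
            if g.gid in level:
                return level[g.gid]
            # Primary inputs are not in `gates`, so they contribute depth 0.
            preds = [depth(gates[i]) for i in g.fanin if i in gates]
            level[g.gid] = 1 + max(preds, default=0)
            return level[g.gid]

        for g in gates.values():
            depth(g)
        by_level = defaultdict(list)
        for gid, lv in level.items():
            by_level[lv].append(gid)
        # Each issue slot executes at most num_pes independent Boolean ops;
        # a level too wide for the PE array is split across slots, a toy
        # stand-in for partitioning blocks that do not fit the processor.
        schedule = []
        for lv in sorted(by_level):
            ids = by_level[lv]
            for i in range(0, len(ids), num_pes):
                schedule.append(ids[i:i + num_pes])
        return schedule

    # Example: a tiny FFCL block with four gates mapped onto two PEs.
    gates = {
        1: Gate(1, "AND", ["x0", "x1"]),
        2: Gate(2, "XOR", ["x1", "x2"]),
        3: Gate(3, "OR",  [1, 2]),
        4: Gate(4, "NOT", [3]),
    }
    print(schedule_levels(gates, num_pes=2))
    # -> [[1, 2], [3], [4]]  (gates 1 and 2 execute in parallel)

The design choice sketched here, grouping gates by topological depth and issuing each group across the PE array, reflects the abstract's framing of Boolean operations as parallelized, low-latency work units; the paper's actual partitioning and scheduling algorithms may differ substantially.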


