Weightless Neural Networks for Efficient Edge Inference

by Zachary Susskind, et al.

Weightless Neural Networks (WNNs) are a class of machine learning model which use table lookups to perform inference. This is in contrast with Deep Neural Networks (DNNs), which use multiply-accumulate operations. State-of-the-art WNN architectures have a fraction of the implementation cost of DNNs, but still lag behind them on accuracy for common image recognition tasks. Additionally, many existing WNN architectures suffer from high memory requirements. In this paper, we propose a novel WNN architecture, BTHOWeN, with key algorithmic and architectural improvements over prior work, namely counting Bloom filters, hardware-friendly hashing, and Gaussian-based nonlinear thermometer encodings, to improve model accuracy and reduce area and energy consumption. BTHOWeN targets the large and growing edge computing sector by providing superior latency and energy efficiency to comparable quantized DNNs. Compared to state-of-the-art WNNs across nine classification datasets, BTHOWeN on average reduces error by more than 40%. We demonstrate the viability of the BTHOWeN architecture by presenting an FPGA-based accelerator, and compare its latency and resource usage against similarly accurate quantized DNN accelerators, including Multi-Layer Perceptron (MLP) and convolutional models. The proposed BTHOWeN models consume almost 80% less energy than the MLP models, with nearly 85% lower latency. In our quest for efficient ML on the edge, WNNs are clearly deserving of additional attention.
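To make the abstract's core idea concrete, the sketch below illustrates table-lookup inference in a WNN: inputs are thermometer-encoded into bits, bit tuples index Bloom filters, and each class's score is the number of filters reporting membership. This is an illustrative toy, not the paper's implementation: BTHOWeN uses counting Bloom filters with hardware-friendly H3 hash functions and Gaussian-placed thresholds, whereas this sketch uses plain Bloom filters, Python's built-in `hash`, and hand-picked thresholds. All class and function names here are invented for illustration.

```python
def thermometer_encode(x, thresholds):
    # Unary ("thermometer") encoding: one bit per threshold,
    # set if the input value exceeds that threshold.
    # (BTHOWeN places thresholds at Gaussian quantiles; here they are given.)
    return [1 if x > t else 0 for t in thresholds]

class BloomFilter:
    """Minimal Bloom filter: approximate set membership via k hashed bit positions."""
    def __init__(self, size, num_hashes):
        self.bits = [0] * size
        self.size = size
        self.num_hashes = num_hashes

    def _positions(self, key):
        # Toy stand-in for the paper's hardware-friendly hash family.
        return [hash((i, key)) % self.size for i in range(self.num_hashes)]

    def add(self, key):
        for p in self._positions(key):
            self.bits[p] = 1

    def query(self, key):
        # True if all k positions are set (may yield false positives, never false negatives).
        return all(self.bits[p] for p in self._positions(key))

class Discriminator:
    """One per class: a bank of Bloom filters, one per input bit tuple."""
    def __init__(self, num_tuples, filter_size, num_hashes):
        self.filters = [BloomFilter(filter_size, num_hashes)
                        for _ in range(num_tuples)]

    def train(self, bit_tuples):
        # Training is just set insertion -- no gradients, no multiplies.
        for f, t in zip(self.filters, bit_tuples):
            f.add(t)

    def score(self, bit_tuples):
        # Inference: count how many filters recognize their tuple.
        return sum(f.query(t) for f, t in zip(self.filters, bit_tuples))
```

A usage example: encode a feature vector, split the bits into tuples, train one discriminator per class, and classify by the highest score. Because inference is only hashing and table lookups, the same structure maps naturally onto LUTs and block RAM in an FPGA, which is what motivates the accelerator comparison in the abstract.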




