Towards Efficient In-memory Computing Hardware for Quantized Neural Networks: State-of-the-art, Open Challenges and Perspectives

07/08/2023
by Olga Krestinskaya, et al.

The amount of data processed in the cloud, the development of Internet-of-Things (IoT) applications, and growing data privacy concerns are forcing a transition from cloud-based to edge-based processing. Limited energy and computational resources at the edge drive a further shift from traditional von Neumann architectures to In-memory Computing (IMC), especially for machine learning and neural network applications. Network compression techniques make it possible to implement a neural network within limited hardware resources. Quantization is one of the most efficient such techniques, reducing memory footprint, latency, and energy consumption. This paper provides a comprehensive review of IMC-based Quantized Neural Networks (QNNs) and links software-based quantization approaches to IMC hardware implementation. Moreover, open challenges, QNN design requirements, recommendations, and perspectives are provided, along with a roadmap for IMC-based QNN hardware.
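To make the quantization idea concrete, below is a minimal sketch of uniform (affine) quantization, the general compression technique the abstract refers to. The function names and the 8-bit setting are illustrative assumptions, not taken from the paper.

```python
def quantize(weights, num_bits=8):
    """Map float weights to integer codes in [0, 2^num_bits - 1]
    using a uniform affine scheme (scale and zero-point)."""
    qmin, qmax = 0, 2 ** num_bits - 1
    w_min, w_max = min(weights), max(weights)
    # Guard against a constant weight vector (zero range).
    scale = (w_max - w_min) / (qmax - qmin) or 1.0
    zero_point = round(qmin - w_min / scale)
    codes = [int(round(w / scale) + zero_point) for w in weights]
    return codes, scale, zero_point

def dequantize(codes, scale, zero_point):
    """Recover approximate float weights from the integer codes."""
    return [(q - zero_point) * scale for q in codes]

weights = [-1.2, 0.0, 0.5, 2.3]
codes, scale, zp = quantize(weights)
approx = dequantize(codes, scale, zp)
```

Storing the 8-bit codes instead of 32-bit floats gives a 4x memory reduction, and on IMC hardware the integer codes map more directly onto the discrete conductance levels of the memory cells; the reconstruction error per weight is bounded by half the quantization step (`scale / 2`).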

