Memory Requirement Reduction of Deep Neural Networks Using Low-bit Quantization of Parameters

11/01/2019
by Niccoló Nicodemo, et al.

Effective deployment of deep neural networks (DNNs) on mobile devices and embedded systems is hampered by their memory and computational requirements. This paper presents a non-uniform quantization approach that allows dynamic quantization of DNN parameters both across different layers and within the same layer. A virtual bit shift (VBS) scheme is also proposed to improve the accuracy of the quantization. Our method reduces memory requirements while preserving the performance of the network. The method is validated in a speech enhancement application, where a fully connected DNN is used to predict the clean speech spectrum from the input noisy speech spectrum. A DNN is optimized, and its memory footprint and performance are evaluated using the short-time objective intelligibility (STOI) metric. Applying the low-bit quantization yields a 50% reduction in memory footprint, while the STOI performance drops only by 2.7%.
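The general idea behind per-layer low-bit quantization with a shared scale can be sketched as below. This is an illustrative sketch only: it uses a simple symmetric uniform quantizer with one scale per layer (a rough analogue of sharing a "virtual" shift across a layer), not the paper's actual non-uniform scheme or its exact VBS formulation; all names here are hypothetical.

```python
import numpy as np

def quantize_layer(weights, bits=4):
    """Map float weights to signed low-bit integer codes plus one
    per-layer scale. Storing 4-bit codes instead of 32-bit floats is
    what reduces the memory footprint."""
    max_abs = float(np.max(np.abs(weights)))
    levels = 2 ** (bits - 1) - 1                 # e.g. 7 for 4-bit signed
    scale = max_abs / levels if max_abs > 0 else 1.0
    codes = np.round(weights / scale).astype(np.int8)
    return codes, scale

def dequantize_layer(codes, scale):
    """Reconstruct approximate float weights from codes and scale."""
    return codes.astype(np.float32) * scale

# Quantize a toy layer and measure the worst-case reconstruction error.
rng = np.random.default_rng(0)
w = rng.standard_normal((4, 4)).astype(np.float32)
codes, scale = quantize_layer(w, bits=4)
w_hat = dequantize_layer(codes, scale)
err = float(np.max(np.abs(w - w_hat)))           # bounded by scale / 2
```

With rounding to the nearest code, the reconstruction error of each weight is at most half a quantization step, which is why adapting the scale (or shift) to each layer's dynamic range matters: a single global scale would waste codes on layers with small weights.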


