Low-bit Shift Network for End-to-End Spoken Language Understanding

07/15/2022
by Anderson R. Avila, et al.

Deep neural networks (DNNs) have achieved impressive success across multiple domains. Over the years, the accuracy of these models has increased alongside the proliferation of deeper and more complex architectures. As a result, state-of-the-art solutions are often computationally expensive, making them unsuitable for deployment on edge computing platforms. To mitigate the high computation, memory, and power requirements of inference with convolutional neural networks (CNNs), we propose power-of-two quantization, which quantizes continuous parameters into low-bit power-of-two values. This reduces computational complexity by replacing expensive multiplication operations with bit shifts and by using low-bit weights. ResNet is adopted as the building block of our solution, and the proposed model is evaluated on a spoken language understanding (SLU) task. Experimental results show improved performance for shift neural network architectures, with our low-bit quantization achieving 98.76% accuracy on the test set, comparable to its full-precision counterpart and to state-of-the-art solutions.
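To make the core idea concrete, below is a minimal NumPy sketch of power-of-two quantization. The function name pot_quantize, the 4-bit default, and the exponent-clipping scheme are illustrative assumptions, not the authors' exact quantizer; it only shows how each weight is mapped to a signed power of two so that multiplication can be realized as a bit shift.

```python
import numpy as np

def pot_quantize(w, bits=4):
    """Quantize a weight array to signed power-of-two values (illustrative sketch).

    Each nonzero weight becomes sign(w) * 2**e, where the integer exponent e is
    clipped to a small range so it can be stored in `bits` bits. Multiplying an
    activation by such a weight is then a bit shift (plus a sign flip) rather
    than a full multiplication.
    """
    w = np.asarray(w, dtype=np.float64)
    sign = np.sign(w)
    mag = np.abs(w)
    safe_mag = np.where(mag == 0, 1.0, mag)          # avoid log2(0); zeros restored below
    exp = np.round(np.log2(safe_mag))                # nearest exponent in the log domain
    max_exp = np.max(exp[mag > 0]) if np.any(mag > 0) else 0.0
    min_exp = max_exp - (2 ** (bits - 1) - 1)        # low-bit exponent range (assumed scheme)
    exp = np.clip(exp, min_exp, max_exp)
    q = sign * np.exp2(exp)
    return np.where(mag == 0, 0.0, q)

# Toy usage: each nonzero output is +/- 2**e for some integer e,
# so on integer hardware w*x reduces to shifting x by e positions.
w = np.array([0.73, -0.12, 0.031, -0.9, 0.0])
print(pot_quantize(w, bits=4))
```

Rounding the base-2 logarithm of the magnitude snaps each weight to a nearby power of two; published PoT quantizers differ in how they round, clip, and train through this step, but the storage and shift-based arithmetic benefits illustrated here are the same.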


Related research

S^3: Sign-Sparse-Shift Reparametrization for Effective Training of Low-bit Shift Networks (07/07/2021)
Shift neural networks reduce computation complexity by removing expensiv...

Memory Requirement Reduction of Deep Neural Networks Using Low-bit Quantization of Parameters (11/01/2019)
Effective employment of deep neural networks (DNNs) in mobile devices an...

DenseShift: Towards Accurate and Transferable Low-Bit Shift Network (08/20/2022)
Deploying deep neural networks on low-resource edge devices is challengi...

n-hot: Efficient bit-level sparsity for powers-of-two neural network quantization (03/22/2021)
Powers-of-two (PoT) quantization reduces the number of bit operations of...

Precision Gating: Improving Neural Network Efficiency with Dynamic Dual-Precision Activations (02/17/2020)
We propose precision gating (PG), an end-to-end trainable dynamic dual-p...

Standard Deviation-Based Quantization for Deep Neural Networks (02/24/2022)
Quantization of deep neural networks is a promising approach that reduce...

Lets keep it simple, Using simple architectures to outperform deeper and more complex architectures (08/22/2016)
Major winning Convolutional Neural Networks (CNNs), such as AlexNet, VGG...
