Precision Scaling of Neural Networks for Efficient Audio Processing

12/04/2017
by Jong Hwan Ko, et al.

While deep neural networks have shown powerful performance in many audio applications, their large computation and memory demands have been a challenge for real-time processing. In this paper, we study the impact of scaling the precision of neural networks on the performance of two common audio processing tasks, namely, voice-activity detection and single-channel speech enhancement. We determine the optimal pair of weight/neuron bit precision by exploring its impact on both the performance and processing time. Through experiments conducted with real user data, we demonstrate that deep neural networks that use lower bit precision significantly reduce the processing time (up to 30x). However, their performance impact is low (< 3.14%) only in the case of classification tasks such as those present in voice activity detection.
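To make the idea of scaling bit precision concrete, the following is a minimal sketch of symmetric uniform quantization applied to a handful of weights. It is an illustration under stated assumptions, not the scheme used in the paper: the abstract does not describe the quantizer, and the names `quantize` and `max_error` as well as the sample weights are hypothetical.

```python
def quantize(values, bits):
    """Round each value to the nearest level of a symmetric signed `bits`-bit grid.

    Assumption: a simple max-abs scaled uniform quantizer, chosen only for
    illustration; the paper's actual precision-scaling method may differ.
    """
    if bits < 2:
        raise ValueError("bits must be >= 2 for a signed grid")
    max_abs = max(abs(v) for v in values)
    if max_abs == 0.0:
        return list(values)
    levels = 2 ** (bits - 1) - 1   # e.g. 127 positive levels for 8 bits
    scale = max_abs / levels       # step size between adjacent levels
    return [round(v / scale) * scale for v in values]


def max_error(values, bits):
    """Largest absolute difference between the original and quantized values."""
    quantized = quantize(values, bits)
    return max(abs(a - b) for a, b in zip(values, quantized))


# Hypothetical weights, used only to show how the error grows as bits shrink.
weights = [0.82, -0.41, 0.05, -0.97, 0.33]
for bits in (8, 4, 2):
    print(bits, quantize(weights, bits), max_error(weights, bits))
```

In this sketch the rounding error is bounded by half a quantization step, so it grows as the bit width shrinks. That is the kind of precision-versus-accuracy trade-off the abstract refers to when it weighs performance against processing time.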


research
07/22/2022

Inference skipping for more efficient real-time speech enhancement with parallel RNNs

Deep neural network (DNN) based speech enhancement models have attracted...
research
01/30/2023

The Hidden Power of Pure 16-bit Floating-Point Neural Networks

Lowering the precision of neural networks from the prevalent 32-bit prec...
research
07/22/2020

Resource-Efficient Speech Mask Estimation for Multi-Channel Speech Enhancement

While machine learning techniques are traditionally resource intensive, ...
research
08/12/2021

Deep Neural Network Voice Activity Detector for Downsampled Audio Data: An Experiment Report

Sociometric badges are an emerging technology for studying how teams intera...
research
02/25/2017

Deep Voice: Real-time Neural Text-to-Speech

We present Deep Voice, a production-quality text-to-speech system constr...
research
09/05/2023

In-Ear-Voice: Towards Milli-Watt Audio Enhancement With Bone-Conduction Microphones for In-Ear Sensing Platforms

The recent ubiquitous adoption of remote conferencing has been accompani...
research
12/06/2022

BC-VAD: A Robust Bone Conduction Voice Activity Detection

Voice Activity Detection (VAD) is a fundamental module in many audio app...
