Inference skipping for more efficient real-time speech enhancement with parallel RNNs

07/22/2022
by   Xiaohuai Le, et al.

Deep neural network (DNN) based speech enhancement models have attracted extensive attention due to their promising performance. However, it is difficult to deploy a powerful DNN in real-time applications because of its high computational cost. Typical compression methods such as pruning and quantization do not make good use of the characteristics of the data. In this paper, we introduce the Skip-RNN strategy into speech enhancement models with parallel RNNs. The states of the RNNs are updated only intermittently, without interrupting the update of the output mask, which leads to a significant reduction in computational load without evident audio artifacts. To better exploit the difference between voice and noise, we further regularize the skipping strategy with voice activity detection (VAD) guidance, saving even more computation. Experiments on a high-performance speech enhancement model, the dual-path convolutional recurrent network (DPCRN), show the superiority of our strategy over alternatives such as network pruning or directly training a smaller model. We also validate the generalization of the proposed strategy on two other competitive speech enhancement models.
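The core mechanism can be illustrated with a minimal sketch of a Skip-RNN-style cell. This is a hypothetical numpy implementation, not the authors' exact formulation: a cumulative update gate decides per frame whether the recurrent state is recomputed or simply copied forward, while an output can still be produced at every frame. All weight names and the plain Elman-style update are illustrative assumptions.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class SkipRNNCell:
    """Sketch of a skip-updating recurrent cell (hypothetical formulation).

    A scalar accumulator u starts "open" (1.0). When u >= 0.5 the hidden
    state is fully recomputed and u is reset to a learned increment;
    otherwise the state is copied forward and u grows until the next update.
    """

    def __init__(self, input_size, hidden_size, seed=0):
        rng = np.random.default_rng(seed)
        scale = 1.0 / np.sqrt(hidden_size)
        self.W = rng.uniform(-scale, scale, (hidden_size, input_size))
        self.U = rng.uniform(-scale, scale, (hidden_size, hidden_size))
        # Learned vector producing the per-step update-probability increment.
        self.w_p = rng.uniform(-scale, scale, hidden_size)

    def forward(self, xs):
        """Run over a sequence of frames; return states and the skip pattern."""
        h = np.zeros(self.U.shape[0])
        u = 1.0  # cumulative update gate, starts open so frame 0 updates
        states, skipped = [], []
        for x in xs:
            if u >= 0.5:
                # Full recurrent state update.
                h = np.tanh(self.W @ x + self.U @ h)
                u = float(sigmoid(self.w_p @ h))   # reset accumulator
                skipped.append(False)
            else:
                # Skip: state (and its recurrent matmuls) are not recomputed.
                u = min(1.0, u + float(sigmoid(self.w_p @ h)))
                skipped.append(True)
            states.append(h.copy())
        return np.array(states), skipped
```

Whenever a frame is skipped, the expensive matrix products for the state update are avoided entirely; a lightweight output head (e.g. the enhancement mask) can still run on the carried-over state, which is how the mask keeps updating every frame.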


Related research

- TinyLSTMs: Efficient Neural Speech Enhancement for Hearing Aids (05/20/2020)
- Precision Scaling of Neural Networks for Efficient Audio Processing (12/04/2017)
- Voice Activity Detection Scheme by Combining DNN Model with GMM Model (05/17/2020)
- Deep Time Delay Neural Network for Speech Enhancement with Full Data Learning (11/11/2020)
- Towards efficient models for real-time deep noise suppression (01/22/2021)
- Weight, Block or Unit? Exploring Sparsity Tradeoffs for Speech Enhancement on Tiny Neural Accelerators (11/03/2021)
- DPCRN: Dual-Path Convolution Recurrent Network for Single Channel Speech Enhancement (07/12/2021)
