Real-time speech enhancement using equilibriated RNN

02/14/2020
by Daiki Takeuchi, et al.

We propose a speech enhancement method using a causal deep neural network (DNN) for real-time applications. DNNs have been widely used to estimate a time-frequency (T-F) mask that enhances a speech signal. One popular DNN structure for this task is the recurrent neural network (RNN), owing to its ability to model time-sequential data such as speech effectively. In particular, long short-term memory (LSTM) is often used to alleviate the vanishing/exploding gradient problem, which makes training an RNN difficult. However, LSTM mitigates this training difficulty at the price of a larger number of parameters, which requires more computational resources. For real-time speech enhancement, a smaller network that does not sacrifice performance is preferable. In this paper, we propose using the equilibriated recurrent neural network (ERNN) to avoid the vanishing/exploding gradient problem without increasing the number of parameters. The proposed structure is causal, meaning it requires only information from the past, so it can be applied in real time. Compared to uni- and bi-directional LSTM networks, the proposed method achieved similar performance with far fewer parameters.
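The parameter cost of LSTM mentioned above can be illustrated with a rough count. The sketch below is not taken from the paper: it compares a plain recurrent layer with an LSTM layer in the textbook formulation (one bias vector per block), and the layer sizes are hypothetical. The exact parameterization of the ERNN is not given in this abstract, so it is not modelled here.

```python
def simple_rnn_params(input_dim: int, hidden_dim: int) -> int:
    """Parameters of one plain recurrent layer: input weights, recurrent weights, one bias."""
    return hidden_dim * (input_dim + hidden_dim) + hidden_dim


def lstm_params(input_dim: int, hidden_dim: int) -> int:
    """An LSTM layer holds four such blocks: input, forget, and output gates plus the cell candidate."""
    return 4 * simple_rnn_params(input_dim, hidden_dim)


if __name__ == "__main__":
    # Hypothetical sizes: 257 frequency bins per frame, 128 hidden units.
    input_dim, hidden_dim = 257, 128
    rnn = simple_rnn_params(input_dim, hidden_dim)
    lstm = lstm_params(input_dim, hidden_dim)
    print(f"plain RNN layer: {rnn} parameters")
    print(f"LSTM layer:      {lstm} parameters ({lstm // rnn}x)")
```

Under these assumptions, an LSTM layer carries four times the parameters of a plain recurrent layer of the same width, which is the overhead the abstract refers to. Some frameworks store two bias vectors per block, so their reported counts differ slightly.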


research
10/10/2020

A Model Compression Method with Matrix Product Operators for Speech Enhancement

The deep neural network (DNN) based speech enhancement approaches have a...
research
08/28/2019

Convolutional Recurrent Neural Network Based Progressive Learning for Monaural Speech Enhancement

Recently, progressive learning has shown its capacity of improving speec...
research
10/23/2020

Dual-path Self-Attention RNN for Real-Time Speech Enhancement

We propose a dual-path self-attention recurrent neural network (DP-SARNN...
research
01/28/2020

Weighted Speech Distortion Losses for Neural-network-based Real-time Speech Enhancement

This paper investigates several aspects of training a RNN (recurrent neu...
research
11/24/2013

A Primal-Dual Method for Training Recurrent Neural Networks Constrained by the Echo-State Property

We present an architecture of a recurrent neural network (RNN) with a fu...
research
05/21/2019

DNN-Based Multi-Frame MVDR Filtering for Single-Microphone Speech Enhancement

Multi-frame approaches for single-microphone speech enhancement, e.g., t...
research
07/02/2018

weight-importance sparse training in keyword spotting

Large size models are implemented in recently ASR system to deal with co...
