DNN-Free Low-Latency Adaptive Speech Enhancement Based on Frame-Online Beamforming Powered by Block-Online FastMNMF

07/22/2022
by Aditya Arie Nugraha, et al.

This paper describes a practical dual-process speech enhancement system that adapts environment-sensitive frame-online beamforming (front-end) with the help of environment-free block-online source separation (back-end). To use minimum variance distortionless response (MVDR) beamforming, one may train a deep neural network (DNN) that estimates the time-frequency masks used for computing the covariance matrices of the sources (speech and noise). Backpropagation-based run-time adaptation of the DNN has been proposed to deal with mismatched training and test conditions. Alternatively, one may try to estimate the source covariance matrices directly with a state-of-the-art blind source separation method called fast multichannel non-negative matrix factorization (FastMNMF). In practice, however, neither the DNN nor FastMNMF can be updated in a frame-online manner because of their computationally expensive iterative nature. Our DNN-free system leverages the posteriors of the latest source spectrograms given by block-online FastMNMF to derive the current source covariance matrices for frame-online beamforming. The evaluation shows that our frame-online system can quickly respond to scene changes caused by interfering speaker movements and that it outperforms an existing block-online system with DNN-based beamforming by 5.0 points in terms of the word error rate.
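
As a rough illustration of how MVDR beamforming consumes the speech and noise covariance matrices mentioned above, the following NumPy sketch computes the filter with a widely used reference-channel formulation, w = (R_n^{-1} R_s) u / tr(R_n^{-1} R_s), where u selects the reference microphone. This formulation, the function names `mvdr_weights` and `apply_beamformer`, the diagonal loading constant, and the array layouts are assumptions made for this example; the abstract does not specify which MVDR variant the system uses or how the covariance matrices are derived from the FastMNMF posteriors.

```python
import numpy as np


def mvdr_weights(speech_cov, noise_cov, ref_channel=0, loading=1e-8):
    """Reference-channel MVDR filter from speech and noise covariances.

    speech_cov, noise_cov: complex arrays of shape (..., M, M), e.g. one
    M x M matrix per frequency bin (assumed layout, hypothetical helper).
    Returns weights of shape (..., M).
    """
    n_channels = speech_cov.shape[-1]
    # Small diagonal loading keeps the noise covariance well conditioned.
    loaded = noise_cov + loading * np.eye(n_channels)
    # R_n^{-1} R_s, computed with a linear solve instead of an explicit inverse.
    numerator = np.linalg.solve(loaded, speech_cov)
    trace = np.trace(numerator, axis1=-2, axis2=-1)
    # Picking the reference column is the same as multiplying by a one-hot u.
    return numerator[..., ref_channel] / np.asarray(trace)[..., np.newaxis]


def apply_beamformer(weights, mixture):
    """Compute y = w^H x for a mixture of shape (..., M, T)."""
    return np.einsum("...m,...mt->...t", weights.conj(), mixture)
```

In a frame-online setting, one would presumably refresh `speech_cov` and `noise_cov` as new estimates arrive and recompute the weights per frequency bin; how often and from which statistics is a design detail that the abstract leaves open.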

research
07/15/2022

Direction-Aware Adaptive Online Neural Speech Enhancement with an Augmented Reality Headset in Real Noisy Conversational Environments

This paper describes the practical response- and performance-aware devel...
research
07/11/2019

Multichannel Loss Function for Supervised Speech Source Separation by Mask-based Beamforming

In this paper, we propose two mask-based beamforming methods using a dee...
research
11/08/2021

Learning Filterbanks for End-to-End Acoustic Beamforming

Recent work on monaural source separation has shown that performance can...
research
01/04/2021

Generalized RNN beamformer for target speech separation

Recently we proposed an all-deep-learning minimum variance distortionles...
research
05/07/2022

Mask-based Neural Beamforming for Moving Speakers with Self-Attention-based Tracking

Beamforming is a powerful tool designed to enhance speech signals from t...
research
05/23/2020

Exploring Optimal DNN Architecture for End-to-End Beamformers Based on Time-frequency References

Acoustic beamformers have been widely used to enhance audio signals. Cur...
research
05/09/2019

Block-Online Multi-Channel Speech Enhancement Using DNN-Supported Relative Transfer Function Estimates

This paper addresses the problem of block-online processing for multi-ch...
