Statistical Beamformer Exploiting Non-stationarity and Sparsity with Spatially Constrained ICA for Robust Speech Recognition

06/13/2023
by Ui-Hyeop Shin, et al.

In this paper, we present a statistical beamforming algorithm as a pre-processing step for robust automatic speech recognition (ASR). By modeling the target speech as a non-stationary Laplacian distribution, we propose a mask-based statistical beamforming algorithm that exploits both the output variance and the masked input variance for robust estimation of the beamformer. In addition, we present a method for steering vector estimation (SVE) based on a noise power ratio obtained from the target and noise outputs in independent component analysis (ICA). To update the beamformer in the same ICA framework, we derive ICA with distortionless and null constraints on the target speech, which yields beamformed speech at the target output and noises at the other outputs. The demixing weights for the target output result in a statistical beamformer with the weighted spatial covariance matrix (wSCM), using a weighting function characterized by a source model. To enhance the SVE, the strict null constraints imposed by the Lagrange multiplier method are relaxed by generalized penalties with weight parameters, while the strict distortionless constraints are maintained. Furthermore, we derive an online algorithm based on recursive least squares (RLS) optimization for practical applications. Experimental results in various environments using the CHiME-4 and LibriCSS datasets demonstrate the effectiveness of the presented algorithm compared to conventional beamforming and blind source extraction (BSE) based on ICA, in both batch and online processing.
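
As a rough illustration of the wSCM idea in the abstract, the following sketch computes a distortionless beamformer from a weighted spatial covariance matrix for a single frequency bin. It is not the authors' algorithm. The weighting 1/|y_t| is a Laplacian-style choice assumed here for illustration, the steering vector h is taken as given rather than estimated, and all function names and parameters (weighted_scm, distortionless_beamformer, estimate_beamformer, n_iter, eps) are hypothetical.

```python
import numpy as np


def weighted_scm(X, weights):
    """Weighted spatial covariance V = (1/T) * sum_t weights[t] * x_t x_t^H.

    X: (T, M) complex array, T frames of M microphones for one frequency bin.
    weights: (T,) nonnegative real array.
    """
    return (X.T * weights) @ X.conj() / X.shape[0]


def distortionless_beamformer(V, h):
    """Return w = V^{-1} h / (h^H V^{-1} h), which satisfies w^H h = 1."""
    v_inv_h = np.linalg.solve(V, h)
    return v_inv_h / (h.conj() @ v_inv_h)


def estimate_beamformer(X, h, n_iter=5, eps=1e-6):
    """Alternate between output-dependent weights and the beamformer update.

    Assumption: weights 1/|y_t| stand in for a source-model weighting function;
    the paper's actual weighting (with masks and variances) is not reproduced.
    """
    n_mics = X.shape[1]
    w = h / (h.conj() @ h)  # initial weights already satisfy w^H h = 1
    for _ in range(n_iter):
        y = X @ w.conj()  # beamformer output y_t = w^H x_t
        weights = 1.0 / np.maximum(np.abs(y), eps)
        # Small diagonal loading keeps V well conditioned (an added safeguard).
        V = weighted_scm(X, weights) + eps * np.eye(n_mics)
        w = distortionless_beamformer(V, h)
    return w


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.standard_normal((200, 4)) + 1j * rng.standard_normal((200, 4))
    h = np.exp(1j * rng.uniform(0.0, 2.0 * np.pi, size=4))
    w = estimate_beamformer(X, h)
    print(abs(np.vdot(w, h)))  # magnitude of w^H h, the distortionless response
```

The division by h^H V^{-1} h is what enforces the distortionless constraint. Changing the weights changes only which frames dominate the covariance estimate, not the response toward h.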

research
03/22/2019

Unsupervised Speech Enhancement Based on Multichannel NMF-Informed Beamforming for Noise-Robust Automatic Speech Recognition

This paper describes multichannel speech enhancement for improving autom...
research
04/12/2016

Noise Robust Speech Recognition Using Multi-Channel Based Channel Selection And Channel Weighting

In this paper, we study several microphone channel selection and weighti...
research
06/19/2018

Speaker Adapted Beamforming for Multi-Channel Automatic Speech Recognition

This paper presents, in the context of multi-channel ASR, a method to ad...
research
04/26/2022

Mask scalar prediction for improving robust automatic speech recognition

Using neural network based acoustic frontends for improving robustness o...
research
03/09/2015

Modeling State-Conditional Observation Distribution using Weighted Stereo Samples for Factorial Speech Processing Models

This paper investigates the effectiveness of factorial speech processing...
research
01/04/2021

Generalized RNN beamformer for target speech separation

Recently we proposed an all-deep-learning minimum variance distortionles...
