Block-Online Multi-Channel Speech Enhancement Using DNN-Supported Relative Transfer Function Estimates

05/09/2019
by Jiri Malek, et al.

This paper addresses the problem of block-online processing for multi-channel speech enhancement. We consider several variants of a system that performs beamforming supported by DNN-based Voice Activity Detection, followed by post-filtering. The speaker is targeted by estimating relative transfer functions between microphones. Each block of the input signals is processed independently, which makes the method applicable in highly dynamic environments. The performance loss caused by the short length of the processing block is studied and compared with the results achieved when each recording is processed as a single block (batch processing). The proposed method is evaluated experimentally on the large CHiME-4 datasets and on another dataset featuring a moving target speaker. The experiments are evaluated in terms of objective criteria and the Word Error Rate (WER) achieved by a baseline Automatic Speech Recognition system, for which the enhancement method serves as a front-end. The results indicate that the proposed method is robust with respect to the length of the processing block and yields a significant WER improvement even for a short block length of 250 ms.
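The block-wise pipeline described in the abstract can be illustrated with a minimal NumPy sketch for a single frequency bin of one processing block. It is not the authors' implementation, and several parts are assumptions: the voice activity mask `vad` is simply passed in (the DNN that would produce it is not modelled), the relative transfer function is estimated by covariance subtraction followed by a principal eigenvector, the beamformer is a plain MVDR, and the post-filter is omitted. The function names `estimate_rtf`, `mvdr_weights`, and `enhance_block` are hypothetical.

```python
import numpy as np


def estimate_rtf(X, vad, ref=0, eps=1e-6):
    """Estimate a relative transfer function (RTF) for one block and one bin.

    X   : (mics, frames) complex STFT coefficients of the block.
    vad : (frames,) boolean mask, True where speech is judged active
          (assumed to come from an external detector).
    Returns (h, R_n), or (None, None) if the block cannot be used.
    """
    n_mics = X.shape[0]
    speech, noise = X[:, vad], X[:, ~vad]
    # A block with no speech frames or no noise-only frames gives no estimate.
    if speech.shape[1] == 0 or noise.shape[1] == 0:
        return None, None
    # Sample covariances of speech-active and noise-only frames.
    R_x = speech @ speech.conj().T / speech.shape[1]
    R_n = noise @ noise.conj().T / noise.shape[1] + eps * np.eye(n_mics)
    # Covariance subtraction (an assumed estimator): the principal
    # eigenvector of R_x - R_n points along the target's steering vector.
    _, vecs = np.linalg.eigh(R_x - R_n)
    h = vecs[:, -1]
    if abs(h[ref]) < eps:
        return None, None
    # Normalise so the reference microphone has unit response.
    return h / h[ref], R_n


def mvdr_weights(h, R_n):
    """MVDR weights w = R_n^{-1} h / (h^H R_n^{-1} h)."""
    a = np.linalg.solve(R_n, h)
    return a / (h.conj() @ a)


def enhance_block(X, vad, ref=0):
    """Process one block independently of all other blocks."""
    h, R_n = estimate_rtf(X, vad, ref)
    if h is None:
        # Fall back to the unprocessed reference microphone.
        return X[ref].copy()
    w = mvdr_weights(h, R_n)
    return w.conj() @ X  # beamformer output y = w^H x for every frame
```

Because `enhance_block` uses only the statistics of the block it is given, a moving speaker affects at most the blocks in which the movement occurs. The cost, as the abstract notes, is that short blocks provide fewer frames for the covariance estimates.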

research
05/09/2022

Speaker Reinforcement Using Target Source Extraction for Robust Automatic Speech Recognition

Improving the accuracy of single-channel automatic speech recognition (A...
research
05/03/2022

On monoaural speech enhancement for automatic recognition of real noisy speech using mixture invariant training

In this paper, we explore an improved framework to train a monoaural neu...
research
02/24/2021

Speech Enhancement Using Multi-Stage Self-Attentive Temporal Convolutional Networks

Multi-stage learning is an effective technique to invoke multiple deep-l...
research
07/22/2022

DNN-Free Low-Latency Adaptive Speech Enhancement Based on Frame-Online Beamforming Powered by Block-Online FastMNMF

This paper describes a practical dual-process speech enhancement system ...
research
07/31/2020

Utterance-Wise Meeting Transcription System Using Asynchronous Distributed Microphones

A novel framework for meeting transcription using asynchronous microphon...
research
04/21/2022

STFT-Domain Neural Speech Enhancement with Very Low Algorithmic Latency

Deep learning based speech enhancement in the short-term Fourier transfo...
research
07/15/2022

Direction-Aware Adaptive Online Neural Speech Enhancement with an Augmented Reality Headset in Real Noisy Conversational Environments

This paper describes the practical response- and performance-aware devel...
