Improvement of Noise-Robust Single-Channel Voice Activity Detection with Spatial Pre-processing

04/12/2021
by   Max Væhrens, et al.

Voice activity detection (VAD) remains a challenge in noisy environments. With access to multiple microphones, prior studies have attempted to improve the noise robustness of VAD by creating multi-channel VAD (MVAD) methods. However, MVAD is relatively new compared to single-channel VAD (SVAD), which has been developed thoroughly. It may therefore be advantageous to improve SVAD methods with pre-processing to obtain superior VAD, an approach that remains under-explored. This paper improves SVAD through two pre-processing methods: a beamformer and a spatial target speaker detector. The spatial detector sets signal frames to zero when no potential speaker is present within a target direction. The detector may be implemented as a filter, meaning the input signal for the SVAD is filtered according to the detector's output, or as a spatial VAD whose output is combined with the SVAD output. The evaluation is made on a noisy, reverberant speech database that uses clean speech from the Aurora 2 database together with white and babble noise. The results show that SVAD algorithms are significantly improved by the presented pre-processing methods, especially the spatial detector, across all signal-to-noise ratios. The SVAD algorithms with pre-processing also significantly outperform a baseline MVAD in challenging noise conditions.
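The two ways of using the spatial detector can be illustrated with a minimal Python sketch. It is not the paper's implementation: `energy_svad` is a hypothetical stand-in for a real SVAD, the per-frame `target_present` flags are assumed to come from some spatial target speaker detector, and the logical AND used to combine decisions is an assumption, since the abstract does not state the combination rule.

```python
from typing import Callable, List, Sequence

Frame = Sequence[float]
Svad = Callable[[Sequence[Frame]], List[bool]]


def energy_svad(frames: Sequence[Frame], threshold: float = 0.01) -> List[bool]:
    """Hypothetical stand-in SVAD: a frame counts as speech when its
    mean energy exceeds `threshold` (an assumed value)."""
    return [
        (sum(x * x for x in frame) / len(frame) > threshold) if frame else False
        for frame in frames
    ]


def detector_as_filter(
    frames: Sequence[Frame],
    target_present: Sequence[bool],
    svad: Svad = energy_svad,
) -> List[bool]:
    """Filter variant: zero every frame in which the detector finds no
    potential speaker in the target direction, then run the SVAD."""
    filtered = [
        list(frame) if present else [0.0] * len(frame)
        for frame, present in zip(frames, target_present)
    ]
    return svad(filtered)


def detector_as_spatial_vad(
    frames: Sequence[Frame],
    target_present: Sequence[bool],
    svad: Svad = energy_svad,
) -> List[bool]:
    """Spatial-VAD variant: run the SVAD on the unmodified signal and
    combine its decisions with the detector's (assumed rule: logical AND)."""
    return [s and p for s, p in zip(svad(frames), target_present)]


# Toy input: three two-sample frames and made-up detector decisions.
frames = [[0.5, -0.5], [0.5, 0.5], [0.0, 0.0]]
target_present = [True, False, True]

filter_decisions = detector_as_filter(frames, target_present)
combined_decisions = detector_as_spatial_vad(frames, target_present)
```

With this memoryless energy SVAD the two variants agree on every frame. An SVAD that keeps state across frames (for example smoothing or hangover logic) could respond differently to zeroed input than to post-hoc masking of its decisions, so the two variants need not be equivalent in general.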


