SRIB-LEAP submission to Far-field Multi-Channel Speech Enhancement Challenge for Video Conferencing

06/24/2021
by R G Prithvi Raj, et al.

This paper presents the details of the SRIB-LEAP submission to the ConferencingSpeech challenge 2021. The challenge involved the task of multi-channel speech enhancement to improve the quality of far-field speech from microphone arrays in a video conferencing room. We propose a two-stage method involving a beamformer followed by single-channel enhancement. For the beamformer, we incorporated a self-attention mechanism as the inter-channel processing layer in the filter-and-sum network (FaSNet), an end-to-end time-domain beamforming system. The single-channel speech enhancement is done in the log-spectral domain using a convolutional neural network (CNN)-long short-term memory (LSTM) based architecture. On objective quality metrics, the proposed approach improved the perceptual evaluation of speech quality (PESQ) score by 0.5 over the noisy data. On subjective quality evaluation, it improved the mean opinion score (MOS) by an absolute 0.9 over the noisy audio.
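The two-stage structure described above can be illustrated with a minimal NumPy sketch. It is not the submitted system: the per-channel filters that FaSNet would estimate with a network (including the self-attention layer) are replaced here by fixed placeholder filters, the CNN-LSTM enhancer is omitted, and the function names, the 8-channel array, the filter length, and the STFT settings are all assumptions chosen for illustration. The sketch only shows the data flow: filter each channel, sum across channels, then move the single-channel result into the log-spectral domain.

```python
import numpy as np

def filter_and_sum(frames, filters):
    """Filter each channel with its own FIR filter, then sum across channels.

    frames:  (channels, samples) multi-channel time-domain signal
    filters: (channels, taps) one FIR filter per channel; in FaSNet these
             would be estimated by a network, here they are simply given
    """
    channels, samples = frames.shape
    out = np.zeros(samples)
    for c in range(channels):
        # Full convolution truncated to the input length (causal filtering)
        out += np.convolve(frames[c], filters[c])[:samples]
    return out

def log_spectrum(signal, frame_len=512, hop=256, eps=1e-8):
    """Log-magnitude STFT of a single-channel signal (hypothetical settings)."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([signal[i * hop:i * hop + frame_len] * window
                       for i in range(n_frames)])
    return np.log(np.abs(np.fft.rfft(frames, axis=1)) + eps)

# Placeholder input: 8 channels of random noise (assumed array size)
rng = np.random.default_rng(0)
mics = rng.standard_normal((8, 16000))

# Placeholder filters: a scaled unit impulse per channel, which reduces
# filter-and-sum to a plain average over channels
taps = np.zeros((8, 16))
taps[:, 0] = 1.0 / 8

beamformed = filter_and_sum(mics, taps)   # stage 1: beamforming
features = log_spectrum(beamformed)       # input domain for stage 2
```

With the placeholder filters the beamformer output equals the channel mean, which makes the sketch easy to check; a learned beamformer would instead produce filters that vary per frame and per channel.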


research
04/02/2021

INTERSPEECH 2021 ConferencingSpeech Challenge: Towards Far-field Multi-Channel Speech Enhancement for Video Conferencing

The ConferencingSpeech 2021 challenge is proposed to stimulate research ...
research
04/19/2022

Audio-Visual Wake Word Spotting System For MISP Challenge 2021

This paper presents the details of our system designed for the Task 1 of...
research
03/27/2018

Building state-of-the-art distant speech recognition using the CHiME-4 challenge with a setup of speech enhancement baseline

This paper describes a new baseline system for automatic speech recognit...
research
07/23/2021

Multi-channel Speech Enhancement with 2-D Convolutional Time-frequency Domain Features and a Pre-trained Acoustic Model

We propose a multi-channel speech enhancement approach with a novel two-...
research
05/31/2021

EchoFilter: End-to-End Neural Network for Acoustic Echo Cancellation

Acoustic Echo Cancellation (AEC) whose aim is to suppress the echo origi...
research
07/31/2018

Lip-Reading Driven Deep Learning Approach for Speech Enhancement

This paper proposes a novel lip-reading driven deep learning framework f...
research
10/23/2020

Dual-path Self-Attention RNN for Real-Time Speech Enhancement

We propose a dual-path self-attention recurrent neural network (DP-SARNN...
