Exploring Optimal DNN Architecture for End-to-End Beamformers Based on Time-frequency References

05/23/2020
by Yuichiro Koyama, et al.

Acoustic beamformers have been widely used to enhance audio signals. Currently, the best methods are the deep neural network (DNN)-powered variants of the generalized eigenvalue and minimum-variance distortionless response beamformers, and the DNN-based filter-estimation methods that directly compute beamforming filters. Both approaches are effective; however, each has blind spots in its generalizability. Therefore, we propose a novel approach that combines these two methods into a single framework and attempts to exploit the best features of both. The resulting model, called the W-Net beamformer, has two components: the first computes time-frequency references, which the second uses to estimate beamforming filters. Results on data covering a wide variety of room and noise conditions, including static and mobile noise sources, show that the proposed beamformer outperforms other methods on all tested evaluation metrics, which indicates that the proposed architecture allows for effective computation of the beamforming filters.
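As a rough illustration of what "estimating beamforming filters" leads to, the sketch below shows only the final filter-and-sum step in the time-frequency domain. It is not the W-Net architecture: the abstract gives no network details, so the filters here come from a hypothetical known steering vector rather than from a DNN, and the names `apply_beamformer`, `steering`, `N_CH`, `N_FRAMES`, and `N_BINS` are assumptions made for this example.

```python
import cmath
import random


def apply_beamformer(filters, mixture):
    """Filter-and-sum: y[t][f] = sum over c of conj(w[c][t][f]) * x[c][t][f].

    Both arguments are nested lists indexed as [channel][frame][bin].
    """
    n_ch = len(mixture)
    n_frames = len(mixture[0])
    n_bins = len(mixture[0][0])
    return [
        [
            sum(filters[c][t][f].conjugate() * mixture[c][t][f] for c in range(n_ch))
            for f in range(n_bins)
        ]
        for t in range(n_frames)
    ]


random.seed(0)
N_CH, N_FRAMES, N_BINS = 4, 6, 5  # hypothetical sizes for the toy example

# Toy single-source spectrogram, indexed [frame][bin].
source = [
    [complex(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(N_BINS)]
    for _ in range(N_FRAMES)
]

# Hypothetical unit-modulus steering vector, indexed [channel][bin].
steering = [
    [cmath.exp(1j * random.uniform(0, 2 * cmath.pi)) for _ in range(N_BINS)]
    for _ in range(N_CH)
]

# Noise-free multichannel observation: each channel is a phase-shifted source.
mixture = [
    [[steering[c][f] * source[t][f] for f in range(N_BINS)] for t in range(N_FRAMES)]
    for c in range(N_CH)
]

# Stand-in for estimated filters: the steering vector scaled by 1 / N_CH.
# In a learned beamformer these values would be predicted by a network.
filters = [
    [[steering[c][f] / N_CH for f in range(N_BINS)] for _ in range(N_FRAMES)]
    for c in range(N_CH)
]

enhanced = apply_beamformer(filters, mixture)
max_error = max(
    abs(enhanced[t][f] - source[t][f])
    for t in range(N_FRAMES)
    for f in range(N_BINS)
)
print(f"max reconstruction error: {max_error:.2e}")
```

In this noise-free toy case the filtered output matches the source up to floating-point error. With real recordings the filters would have to trade off noise suppression against distortion of the target, which is the part the abstract assigns to the DNN.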

research
10/31/2019

W-Net BF: DNN-based Beamformer Using Joint Training Approach

Acoustic beamformers have been widely used to enhance audio signals. The...
research
08/16/2021

Convolutive Prediction for Reverberant Speech Separation

We investigate the effectiveness of convolutive prediction, a novel form...
research
07/22/2022

DNN-Free Low-Latency Adaptive Speech Enhancement Based on Frame-Online Beamforming Powered by Block-Online FastMNMF

This paper describes a practical dual-process speech enhancement system ...
research
03/03/2022

Deep Learning-Based Joint Control of Acoustic Echo Cancellation, Beamforming and Postfiltering

We introduce a novel method for controlling the functionality of a hands...
research
04/17/2019

Deep Filtering: Signal Extraction Using Complex Time-Frequency Filters

Signal extraction from a single-channel mixture with additional undesire...
research
01/01/2019

Exploring spectro-temporal features in end-to-end convolutional neural networks

Triangular, overlapping Mel-scaled filters ("f-banks") are the current s...
research
05/07/2022

Mask-based Neural Beamforming for Moving Speakers with Self-Attention-based Tracking

Beamforming is a powerful tool designed to enhance speech signals from t...
