Unsupervised Multi-channel Separation and Adaptation

05/18/2023
by   Cong Han, et al.

A key challenge in machine learning is to generalize from training data to an application domain of interest. This work generalizes the recently proposed mixture invariant training (MixIT) algorithm to perform unsupervised learning in the multi-channel setting. We use MixIT to train a model on far-field microphone array recordings of overlapping reverberant and noisy speech from the AMI Corpus. The models are trained on both supervised and unsupervised training data, and are tested on real AMI recordings containing overlapping speech. To objectively evaluate our models, we also use a synthetic multi-channel AMI test set. Holding network architectures constant, we find that a fine-tuned semi-supervised model yields the largest improvement in scale-invariant signal-to-noise ratio (SI-SNR) and in human listening ratings across synthetic and real datasets, outperforming supervised models trained on well-matched synthetic data. Our results demonstrate that unsupervised learning through MixIT enables model adaptation on both single- and multi-channel real-world speech recordings.
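To make the two technical terms in the abstract concrete, here is a minimal NumPy sketch of an SI-SNR metric and of a MixIT-style loss on single-channel reference signals. It is an illustration under stated assumptions, not the authors' implementation: the function names `si_snr` and `mixit_loss` are hypothetical, negative SI-SNR is assumed as the per-mixture loss, the assignment search is brute force, and the multi-channel extension described in the abstract is not modeled.

```python
import itertools
import numpy as np


def si_snr(estimate, reference, eps=1e-8):
    """Scale-invariant SNR in dB between two 1-D signals."""
    estimate = estimate - estimate.mean()
    reference = reference - reference.mean()
    # Project the estimate onto the reference to obtain the scaled target.
    scale = np.dot(estimate, reference) / (np.dot(reference, reference) + eps)
    target = scale * reference
    noise = estimate - target
    return 10.0 * np.log10((np.dot(target, target) + eps) / (np.dot(noise, noise) + eps))


def mixit_loss(est_sources, mix1, mix2):
    """MixIT-style loss for estimated sources of shape (M, T).

    The model is assumed to have separated the sum mix1 + mix2 into M sources.
    Each source is assigned to exactly one of the two reference mixtures, and
    the assignment with the lowest loss is kept.
    Assumption: the per-mixture loss is negative SI-SNR.
    """
    num_sources = est_sources.shape[0]
    best_loss, best_assign = np.inf, None
    # Enumerate all 2**M binary assignments (fine for small M).
    for assign in itertools.product((0, 1), repeat=num_sources):
        mask = np.array(assign, dtype=bool)
        remix1 = est_sources[~mask].sum(axis=0)  # sources assigned to mix1
        remix2 = est_sources[mask].sum(axis=0)   # sources assigned to mix2
        loss = -(si_snr(remix1, mix1) + si_snr(remix2, mix2))
        if loss < best_loss:
            best_loss, best_assign = loss, assign
    return best_loss, best_assign


# Usage: four synthetic sources, two reference mixtures of two sources each.
rng = np.random.default_rng(0)
sources = rng.standard_normal((4, 1000))
mix1 = sources[0] + sources[2]
mix2 = sources[1] + sources[3]
loss, assignment = mixit_loss(sources, mix1, mix2)
```

No ground-truth sources enter the loss, only the two reference mixtures, which is what allows training on unlabeled recordings. The brute-force search grows as 2^M, so this form is only practical for a small number of output sources.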


research
10/20/2021

Adapting Speech Separation to Real-World Meetings Using Mixture Invariant Training

The recently-proposed mixture invariant training (MixIT) is an unsupervi...
research
04/07/2022

Leveraging Real Conversational Data for Multi-Channel Continuous Speech Separation

Existing multi-channel continuous speech separation (CSS) models are hea...
research
02/17/2022

Multi-Channel Speech Denoising for Machine Ears

This work describes a speech denoising system for machine ears that aims...
research
11/28/2019

Unsupervised Neural Mask Estimator For Generalized Eigen-Value Beamforming Based ASR

The state-of-art methods for acoustic beamforming in multi-channel ASR a...
research
10/26/2020

Robust Disentanglement of a Few Factors at a Time

Disentanglement is at the forefront of unsupervised learning, as disenta...
research
12/10/2020

Data-Efficient Framework for Real-world Multiple Sound Source 2D Localization

Deep neural networks have recently led to promising results for the task...
research
03/25/2020

Unsupervised Learning for security of Enterprise networks by micro-segmentation

Micro-segmentation is a network security technique that requires deliver...
