Semi-supervised source localization in reverberant environments with deep generative modeling

01/26/2021
by Michael J. Bianco, et al.

We propose a semi-supervised approach to acoustic source localization in reverberant environments, based on deep generative modeling. Localization in reverberant environments remains an open challenge: even when large volumes of data are available, the number of labels available for supervised learning is usually small. We address this issue by performing semi-supervised learning (SSL) with convolutional variational autoencoders (VAEs) on speech signals in reverberant environments. The VAE is trained to generate the phase of relative transfer functions (RTFs) between microphones, in parallel with a direction of arrival (DOA) classifier based on RTF-phase, using both labeled and unlabeled RTF samples. In learning to perform these tasks, the VAE-SSL explicitly learns to separate the physical causes of the RTF-phase (i.e., source location) from distracting signal characteristics such as noise and speech activity. Relative to existing semi-supervised localization methods in acoustics, VAE-SSL is effectively an end-to-end processing approach that relies on minimal preprocessing of RTF-phase features. We compare VAE-SSL with steered response power with phase transform (SRP-PHAT) and with fully supervised convolutional neural networks (CNNs). We find that VAE-SSL can outperform both SRP-PHAT and the CNNs in label-limited scenarios. Further, the trained VAE-SSL system can generate new RTF-phase samples, which shows that the approach learns the physics of the acoustic environment. The generative modeling in VAE-SSL thus provides a means of interpreting the learned representations.
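To make the RTF-phase input feature concrete, the following is a minimal sketch of how the phase of a relative transfer function between two microphones might be estimated from short-time spectra. It is not taken from the paper: the function names `dft` and `rtf_phase`, the frame length, the hop size, and the rectangular (no window) framing are all illustrative assumptions, and the frame-averaged cross-spectrum estimator used here is a common choice that may differ from the authors' preprocessing.

```python
import cmath
import math


def dft(frame):
    """Naive DFT of a real-valued frame, returning bins 0..N/2 (hypothetical helper)."""
    n = len(frame)
    return [
        sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(frame))
        for k in range(n // 2 + 1)
    ]


def rtf_phase(ref, other, frame_len=64, hop=32):
    """Estimate the RTF phase per frequency bin between two microphone signals.

    The RTF is approximated as the frame-averaged cross-spectrum divided by the
    reference auto-spectrum. The auto-spectrum is real and non-negative, so the
    phase of the RTF equals the phase of the accumulated cross-spectrum.
    """
    n_bins = frame_len // 2 + 1
    cross = [0j] * n_bins
    for start in range(0, len(ref) - frame_len + 1, hop):
        r = dft(ref[start:start + frame_len])
        o = dft(other[start:start + frame_len])
        for k in range(n_bins):
            cross[k] += o[k] * r[k].conjugate()
    return [cmath.phase(c) for c in cross]


# Toy example (assumed values): a tone centered on bin 4, with the second
# microphone receiving the same tone delayed by 2 samples.
N, K0, DELAY = 64, 4, 2
ref = [math.cos(2 * math.pi * K0 * n / N) for n in range(256)]
other = [math.cos(2 * math.pi * K0 * (n - DELAY) / N) for n in range(256)]
phases = rtf_phase(ref, other, frame_len=N, hop=N // 2)
print(phases[K0])  # phase at the tone's frequency bin, in radians
```

In this toy setup the phase at the tone's bin depends only on the inter-microphone delay, which is the kind of location-dependent cue a DOA classifier can exploit. In a reverberant room the phase pattern across frequency is far less regular, which is the difficulty the abstract describes.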

research
05/27/2020

Semi-supervised source localization with deep generative modeling

We develop a semi-supervised learning (SSL) approach for acoustic source...
research
09/16/2018

A Deep Generative Model for Semi-Supervised Classification with Noisy Labels

Class labels are often imperfectly observed, due to mistakes and to genu...
research
11/21/2020

SHOT-VAE: Semi-supervised Deep Generative Models With Label-aware ELBO Approximations

Semi-supervised variational autoencoders (VAEs) have obtained strong res...
research
08/29/2021

Deep Dive into Semi-Supervised ELBO for Improving Classification Performance

Decomposition of the evidence lower bound (ELBO) objective of VAE used f...
research
10/21/2020

Learning Disentangled Phone and Speaker Representations in a Semi-Supervised VQ-VAE Paradigm

We present a new approach to disentangle speaker voice and phone content...
research
06/12/2020

LaRVAE: Label Replacement VAE for Semi-Supervised Disentanglement Learning

Learning interpretable and disentangled representations is a crucial yet...
research
10/17/2018

PepCVAE: Semi-Supervised Targeted Design of Antimicrobial Peptide Sequences

Given the emerging global threat of antimicrobial resistance, new method...
