Self-Supervised Learning for Personalized Speech Enhancement

04/05/2021
by Aswin Sivaraman, et al.

Speech enhancement systems can perform better when the model is adapted to a single test-time speaker. In this personalization context, the test-time user might provide only a small amount of noise-free speech data, likely insufficient for traditional fully-supervised learning. One way to overcome the lack of personal data is to initialize the personalized model with the parameters of a speaker-agnostic model, and then to finetune it on the small amount of personal speech data. This baseline adapts only marginally given the scarce clean speech data. As an alternative, we propose self-supervised methods designed specifically to learn personalized and discriminative features from abundant in-the-wild recordings that are noisy but still personal. Our experiments show that the proposed self-supervised learning methods initialize personalized speech enhancement models better than the baseline fully-supervised methods, yielding superior speech enhancement performance. The proposed methods also produce a feature set that is more robust under real-world conditions, namely compressed model sizes and scarce labeled data.
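
The transfer-then-finetune baseline described in the abstract can be illustrated with a toy sketch. Everything in it is an assumption made for illustration and is not the authors' implementation: the "model" is a hypothetical per-band gain vector rather than a neural network, the "pretrained" speaker-agnostic weights are placeholder constants, and the personal data is synthetic. It shows only the mechanics: copy the generalist parameters, then run a few gradient steps on a small clean personal set.

```python
import random


def mse(weights, pairs):
    """Mean squared error of a per-band gain model on (noisy, clean) frames."""
    total, count = 0.0, 0
    for noisy, clean in pairs:
        for w, x, y in zip(weights, noisy, clean):
            total += (w * x - y) ** 2
            count += 1
    return total / count


def finetune(init_weights, pairs, lr=0.05, steps=200):
    """Gradient descent on the MSE, starting from transferred weights."""
    weights = list(init_weights)  # copy: the generalist model is left untouched
    n = len(pairs)
    for _ in range(steps):
        grads = [0.0] * len(weights)
        for noisy, clean in pairs:
            for i, (x, y) in enumerate(zip(noisy, clean)):
                grads[i] += 2.0 * (weights[i] * x - y) * x / n
        weights = [w - lr * g for w, g in zip(weights, grads)]
    return weights


def make_pairs(true_gains, n, rng):
    """Synthetic (noisy, clean) frames; a stand-in for real speech features."""
    pairs = []
    for _ in range(n):
        noisy = [rng.uniform(0.5, 1.5) for _ in true_gains]
        clean = [g * x for g, x in zip(true_gains, noisy)]
        pairs.append((noisy, clean))
    return pairs


rng = random.Random(0)
generalist_weights = [0.5, 0.5, 0.5, 0.5]  # placeholder for a speaker-agnostic model
personal_gains = [0.8, 0.3, 0.6, 0.4]      # hypothetical speaker-specific target
personal_pairs = make_pairs(personal_gains, 8, rng)  # small clean personal set

loss_before = mse(generalist_weights, personal_pairs)
personal_weights = finetune(generalist_weights, personal_pairs)
loss_after = mse(personal_weights, personal_pairs)
print(f"loss before finetuning: {loss_before:.6f}, after: {loss_after:.6f}")
```

In this toy setting a handful of clean frames is enough, because the model has only four parameters. The abstract's point is that a realistic model has far more, so the same small clean set adapts it only marginally; that gap is what the proposed self-supervised pretraining on noisy personal recordings is meant to address.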

research
04/05/2021

Personalized Speech Enhancement through Self-Supervised Data Augmentation and Purification

Training personalized speech enhancement models is innately a no-shot le...
research
11/06/2020

Self-Supervised Learning from Contrastive Mixtures for Personalized Speech Enhancement

This work explores how self-supervised learning can be universally used ...
research
11/14/2022

The Potential of Neural Speech Synthesis-based Data Augmentation for Personalized Speech Enhancement

With the advances in deep learning, speech enhancement systems benefited...
research
07/05/2023

Self-supervised learning with diffusion-based multichannel speech enhancement for speaker verification under noisy conditions

The paper introduces Diff-Filter, a multichannel speech enhancement appr...
research
12/10/2021

Learning-based personal speech enhancement for teleconferencing by exploiting spatial-spectral features

Teleconferencing is becoming essential during the COVID-19 pandemic. How...
research
06/18/2020

Self-supervised Learning for Speech Enhancement

Supervised learning for single-channel speech enhancement requires caref...
research
05/08/2021

Test-Time Adaptation Toward Personalized Speech Enhancement: Zero-Shot Learning with Knowledge Distillation

In realistic speech enhancement settings for end-user devices, we often ...
