OSSEM: one-shot speaker adaptive speech enhancement using meta learning

11/10/2021
by Cheng Yu, et al.

Although deep learning (DL) has achieved notable progress in speech enhancement (SE), further research is still required for a DL-based SE system to adapt effectively and efficiently to particular speakers. In this study, we propose a novel meta-learning-based speaker-adaptive SE approach (called OSSEM) that aims to achieve SE model adaptation in a one-shot manner. OSSEM consists of a modified transformer SE network and a speaker-specific masking (SSM) network. In practice, the SSM network takes an enrolled speaker embedding, extracted using ECAPA-TDNN, and uses it to adjust the input noisy feature through masking. To evaluate OSSEM, we designed a modified Voice Bank-DEMAND dataset in which one utterance from the testing set was used for model adaptation and the remaining utterances were used for evaluation. Moreover, we imposed restrictions that allow the enhancement process to be conducted in real time, and thus designed OSSEM to be a causal SE system. Experimental results show that OSSEM can effectively adapt a pretrained SE model to a particular speaker with only one utterance, thus yielding improved SE results. In addition, OSSEM performs competitively with state-of-the-art causal SE systems.
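The masking step described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the single linear layer with a sigmoid, the function names `speaker_mask` and `apply_mask`, and all dimensions and random values are assumptions made purely for illustration. In the real system the embedding would come from ECAPA-TDNN and the masked feature would be passed to the transformer SE network; here a random vector stands in for the embedding.

```python
import math
import random


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def speaker_mask(embedding, weights, bias):
    """Map a speaker embedding to one mask value in (0, 1) per feature bin.

    Hypothetical stand-in for an SSM network: one linear layer plus sigmoid.
    """
    return [
        sigmoid(sum(w * e for w, e in zip(row, embedding)) + b)
        for row, b in zip(weights, bias)
    ]


def apply_mask(noisy_frames, mask):
    """Scale every frame of the noisy feature elementwise by the mask."""
    return [[m * x for m, x in zip(mask, frame)] for frame in noisy_frames]


rng = random.Random(0)
EMB_DIM, FEAT_DIM, NUM_FRAMES = 4, 3, 2  # toy sizes, not from the paper

# Random placeholder for an enrolled speaker embedding (assumption).
embedding = [rng.uniform(-1, 1) for _ in range(EMB_DIM)]
# Random placeholder parameters; a real model would learn these.
weights = [[rng.uniform(-1, 1) for _ in range(EMB_DIM)] for _ in range(FEAT_DIM)]
bias = [0.0] * FEAT_DIM
# Random placeholder for a non-negative noisy spectral feature.
noisy = [[rng.uniform(0, 1) for _ in range(FEAT_DIM)] for _ in range(NUM_FRAMES)]

mask = speaker_mask(embedding, weights, bias)
masked = apply_mask(noisy, mask)  # same shape as `noisy`, rescaled per bin
```

Because the mask depends only on the enrolled embedding and not on future frames, a step of this kind is compatible with the causal, real-time constraint mentioned in the abstract.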


