Feature Normalization for Fine-tuning Self-Supervised Models in Speech Enhancement

06/14/2023
by   Hejung Yang, et al.
0

Large, pre-trained representation models trained using self-supervised learning have gained popularity in various fields of machine learning because they are able to extract high-quality salient features from input data. As such, they have been frequently used as base networks for various pattern classification tasks such as speech recognition. However, not much research has been conducted on applying these types of models to the field of speech signal generation. In this paper, we investigate the feasibility of using pre-trained speech representation models for a downstream speech enhancement task. To alleviate mismatches between the input features of the pre-trained model and the target enhancement model, we adopt a novel feature normalization technique to smoothly link these modules together. Our proposed method enables significant improvements in speech quality compared to baselines when combined with various types of pre-trained speech models.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
11/18/2022

Exploring WavLM on Speech Enhancement

There is a surge in interest in self-supervised learning approaches for ...
research
08/28/2023

Rep2wav: Noise Robust text-to-speech Using self-supervised representations

Benefiting from the development of deep learning, text-to-speech (TTS) t...
research
06/22/2023

Toward Leveraging Pre-Trained Self-Supervised Frontends for Automatic Singing Voice Understanding Tasks: Three Case Studies

Automatic singing voice understanding tasks, such as singer identificati...
research
06/22/2022

A Systematic Comparison of Phonetic Aware Techniques for Speech Enhancement

Speech enhancement has seen great improvement in recent years using end-...
research
09/28/2022

Speech Enhancement Using Self-Supervised Pre-Trained Model and Vector Quantization

With the development of deep learning, neural network-based speech enhan...
research
02/16/2023

Speech Enhancement with Multi-granularity Vector Quantization

With advances in deep learning, neural network based speech enhancement ...
research
11/09/2022

Speech separation with large-scale self-supervised learning

Self-supervised learning (SSL) methods such as WavLM have shown promisin...

Please sign up or login with your details

Forgot password? Click here to reset