Language and Noise Transfer in Speech Enhancement Generative Adversarial Network

12/18/2017
by Santiago Pascual, et al.

Deep learning speech enhancement systems usually require large amounts of training data to operate in broad conditions or real applications. This makes the adaptability of those systems to new, low-resource environments an important topic. In this work, we present the results of adapting a speech enhancement generative adversarial network by fine-tuning the generator with small amounts of data. We investigate the minimum requirements to obtain stable behavior in terms of several objective metrics in two very different languages: Catalan and Korean. We also study the variability of test performance on unseen noise as a function of the number of noise types available for training. Results show that adapting a pre-trained English model with 10 min of data already achieves performance comparable to that obtained with two orders of magnitude more data. They also demonstrate the relative stability of test performance with respect to the number of training noise types.


