Semi-Supervised Learning Based on Reference Model for Low-resource TTS

10/25/2022
by Xulong Zhang, et al.

Most previous neural text-to-speech (TTS) methods rely on supervised learning, so they depend on a large training dataset and struggle to achieve comparable performance under low-resource conditions. To address this issue, we propose a semi-supervised learning method for neural TTS when labeled target data is limited; the method can also resolve the exposure-bias problem of previous auto-regressive models. Specifically, we pre-train a reference model based on FastSpeech 2 on a large amount of source data and then fine-tune it on a limited target dataset. Meanwhile, pseudo labels generated by the original reference model further guide the training of the fine-tuned model. This acts as a regularizer and reduces overfitting of the fine-tuned model to the limited target data. Experimental results show that the proposed semi-supervised learning scheme, trained with limited target data, significantly improves voice quality on test data and yields more natural and robust synthesized speech.
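
The regularization idea described above can be illustrated with a minimal sketch. The abstract does not give the loss function, the weighting, or which inputs receive pseudo labels, so everything here is an assumption made for illustration: the L1 distance, the names `l1_loss` and `semi_supervised_loss`, and the `pseudo_weight` parameter are hypothetical, and plain lists of floats stand in for predicted acoustic features.

```python
from typing import Sequence


def l1_loss(pred: Sequence[float], target: Sequence[float]) -> float:
    """Mean absolute error between two equal-length feature sequences."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(pred)


def semi_supervised_loss(
    student_out: Sequence[float],
    ground_truth: Sequence[float],
    reference_out: Sequence[float],
    pseudo_weight: float = 0.5,  # hypothetical value, not taken from the paper
) -> float:
    """Combine a supervised term with a pseudo-label regularizer.

    student_out   -- output of the model being fine-tuned on target data
    ground_truth  -- labeled target features
    reference_out -- pseudo label from the frozen, pre-trained reference model
    """
    # Supervised term: fit the limited labeled target data.
    supervised = l1_loss(student_out, ground_truth)
    # Regularizer: stay close to the reference model's prediction,
    # which discourages overfitting to the small target set.
    regularizer = l1_loss(student_out, reference_out)
    return supervised + pseudo_weight * regularizer


if __name__ == "__main__":
    print(semi_supervised_loss([1.0, 2.0], [1.0, 4.0], [2.0, 2.0]))
```

With `pseudo_weight` set to 0 the sketch reduces to ordinary supervised fine-tuning; larger values pull the fine-tuned model more strongly toward the reference model's outputs.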

