ADAPTERMIX: Exploring the Efficacy of Mixture of Adapters for Low-Resource TTS Adaptation

05/29/2023
by Ambuj Mehrish, et al.

Speaker adaptation in text-to-speech is challenging for languages that are not widely spoken and for speakers whose accents or dialects are poorly represented in the training data. To address this, we propose the "mixture of adapters" method, which adds multiple adapters within a backbone-model layer to learn the unique characteristics of different speakers. Our approach outperforms the baseline, with a noticeable improvement of 5% using only one minute of data for each new speaker. Moreover, following the adapter paradigm, we fine-tune only the adapter parameters (11% of the model parameters). This is a significant step in parameter-efficient speaker adaptation, and one of the first models of its kind. Overall, our approach offers a promising solution for speech synthesis, particularly for adapting to speakers from diverse backgrounds.
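
As a rough illustration of the idea of placing several adapters inside one backbone layer, the following minimal sketch may help. It is not the authors' implementation: the bottleneck adapter shape, the uniform averaging of adapter outputs, the residual connection, and all names (`Adapter`, `MixtureOfAdapters`, `trainable_fraction`) are assumptions made for illustration, since the abstract does not specify how the adapters are built or combined.

```python
import random


def make_matrix(rows, cols, rng, scale=0.1):
    """Small random weight matrix stored as a list of rows."""
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vec):
    """Plain matrix-vector product."""
    return [sum(w * x for w, x in zip(row, vec)) for row in matrix]


class Adapter:
    """Hypothetical bottleneck adapter: down-project, ReLU, up-project."""

    def __init__(self, dim, bottleneck, rng):
        self.down = make_matrix(bottleneck, dim, rng)
        self.up = make_matrix(dim, bottleneck, rng)

    def __call__(self, hidden):
        z = [max(0.0, v) for v in matvec(self.down, hidden)]
        return matvec(self.up, z)

    def num_params(self):
        return sum(len(r) for r in self.down) + sum(len(r) for r in self.up)


class MixtureOfAdapters:
    """Several adapters attached to one (frozen) backbone layer output.

    Assumption: adapter outputs are averaged uniformly and added back to
    the hidden state through a residual connection.
    """

    def __init__(self, dim, bottleneck, num_adapters, seed=0):
        rng = random.Random(seed)
        self.adapters = [Adapter(dim, bottleneck, rng) for _ in range(num_adapters)]

    def __call__(self, hidden):
        outputs = [adapter(hidden) for adapter in self.adapters]
        mixed = [sum(col) / len(outputs) for col in zip(*outputs)]
        return [h + m for h, m in zip(hidden, mixed)]

    def num_params(self):
        return sum(a.num_params() for a in self.adapters)


def trainable_fraction(adapter_params, backbone_params):
    """Share of all parameters that is updated when only adapters are tuned."""
    return adapter_params / (adapter_params + backbone_params)


if __name__ == "__main__":
    moa = MixtureOfAdapters(dim=8, bottleneck=2, num_adapters=4)
    adapted = moa([0.5] * 8)
    print(len(adapted), moa.num_params())
    # Illustrative backbone size only, not a figure from the paper.
    print(trainable_fraction(moa.num_params(), backbone_params=1024))
```

During adaptation only the adapter weights would be updated while the backbone stays frozen, which is what keeps the trainable share of parameters small.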

research
10/28/2022

Residual Adapters for Few-Shot Text-to-Speech Speaker Adaptation

Adapting a neural text-to-speech (TTS) model to a target speaker typical...
research
11/01/2022

Adapter-Based Extension of Multi-Speaker Text-to-Speech Model for New Speakers

Fine-tuning is a popular method for adapting text-to-speech (TTS) models...
research
06/12/2023

Parameter-efficient Dysarthric Speech Recognition Using Adapter Fusion and Householder Transformation

In dysarthric speech recognition, data scarcity and the vast diversity b...
research
06/01/2022

AdaVITS: Tiny VITS for Low Computing Resource Speaker Adaptation

Speaker adaptation in text-to-speech synthesis (TTS) is to finetune a pr...
research
04/06/2021

Optimal Transport-based Adaptation in Dysarthric Speech Tasks

In many real-world applications, the mismatch between distributions of t...
research
05/24/2022

TDASS: Target Domain Adaptation Speech Synthesis Framework for Multi-speaker Low-Resource TTS

Recently, synthesizing personalized speech by text-to-speech (TTS) appli...
research
08/07/2020

Multi-speaker Text-to-speech Synthesis Using Deep Gaussian Processes

Multi-speaker speech synthesis is a technique for modeling multiple spea...
