The AS-NU System for the M2VoC Challenge

by Cheng-Hung Hu, et al.

This paper describes the AS-NU systems for the two tracks of the Multi-Speaker Multi-Style Voice Cloning Challenge (M2VoC). The first track focuses on voice cloning with a small set of 100 target utterances, while the second track uses only 5 target utterances. Because data in the second track is so scarce, we selected the speaker most similar to the target speaker from the training data of the TTS system, and used that speaker's utterances together with the 5 given target utterances to fine-tune our model. The evaluation results show that our systems on the two tracks perform similarly in terms of quality, but a clear gap remains between the similarity scores of the second track and those of the first track.
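
The speaker-selection step for the second track can be illustrated with a small sketch. The abstract does not say how similarity between speakers is measured, so everything below is an assumption made for illustration: each utterance is represented by a fixed-length speaker embedding, a speaker is summarized by the normalized mean of those embeddings, and similarity is cosine similarity. The names `mean_unit_vector` and `most_similar_speaker` and the toy vectors are hypothetical and not taken from the paper.

```python
import numpy as np


def mean_unit_vector(embeddings):
    """Average utterance-level embeddings and L2-normalize the result."""
    centroid = np.asarray(embeddings, dtype=float).mean(axis=0)
    return centroid / np.linalg.norm(centroid)


def most_similar_speaker(target_utt_embeddings, train_speaker_embeddings):
    """Return the training speaker closest to the target, plus all scores.

    Assumption: cosine similarity between speaker centroids stands in for
    whatever similarity measure the actual system used.
    """
    target = mean_unit_vector(target_utt_embeddings)
    scores = {
        speaker: float(mean_unit_vector(embs) @ target)
        for speaker, embs in train_speaker_embeddings.items()
    }
    best = max(scores, key=scores.get)
    return best, scores


# Toy 3-dimensional embeddings (hypothetical values, for illustration only).
target_embeddings = [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0]]
train_embeddings = {
    "spk_a": [[0.0, 1.0, 0.0], [0.1, 0.9, 0.0]],
    "spk_b": [[1.0, 0.05, 0.0], [0.95, 0.0, 0.05]],
}

best_speaker, similarity_scores = most_similar_speaker(
    target_embeddings, train_embeddings
)
print(best_speaker)
```

In a pipeline of this shape, the utterances of the selected speaker would then be pooled with the 5 target utterances to form the fine-tuning set, as the abstract describes.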



CUHK-EE Voice Cloning System for ICASSP 2021 M2VoC Challenge

This paper presents the CUHK-EE voice cloning system for ICASSP 2021 M2V...

XMUSPEECH System for VoxCeleb Speaker Recognition Challenge 2021

This paper describes the XMUSPEECH speaker recognition and diarisation s...

Improving Voice Trigger Detection with Metric Learning

Voice trigger detection is an important task, which enables activating a...

Automatic Speaker Independent Dysarthric Speech Intelligibility Assessment System

Dysarthria is a condition which hampers the ability of an individual to ...

Deep Autotuner: A Data-Driven Approach to Natural-Sounding Pitch Correction for Singing Voice in Karaoke Performances

We describe a machine-learning approach to pitch correcting a solo singi...

Personalizing ASR with limited data using targeted subset selection

We study the task of personalizing ASR models to a target non-native spe...

Third DIHARD Challenge Evaluation Plan

This paper introduces the third DIHARD challenge, the third in a series ...