Improve few-shot voice cloning using multi-modal learning

03/18/2022
by Haitong Zhang, et al.

Few-shot voice cloning has recently improved significantly. However, most few-shot voice cloning models are single-modal, and multi-modal few-shot voice cloning remains understudied. In this paper, we propose using multi-modal learning to improve few-shot voice cloning performance. Inspired by recent work on unsupervised speech representation, we build the proposed multi-modal system by extending Tacotron2 with an unsupervised speech representation module. We evaluate the proposed system in two few-shot voice cloning scenarios: few-shot text-to-speech (TTS) and voice conversion (VC). Experimental results demonstrate that the proposed multi-modal learning significantly improves few-shot voice cloning performance over the corresponding single-modal systems.

research
06/11/2023

Audio-Visual Mandarin Electrolaryngeal Speech Voice Conversion

Electrolarynx is a commonly used assistive device to help patients with ...
research
02/21/2022

AVQVC: One-shot Voice Conversion by Vector Quantization with applying contrastive learning

Voice Conversion(VC) refers to changing the timbre of a speech while ret...
research
10/26/2021

ViDA-MAN: Visual Dialog with Digital Humans

We demonstrate ViDA-MAN, a digital-human agent for multi-modal interacti...
research
05/22/2018

A scene perception system for visually impaired based on object detection and classification using multi-modal DCNN

This paper represents a cost-effective scene perception system aimed tow...
research
01/06/2022

Multi-modal data fusion of Voice and EMG data for Robotic Control

Wearable electronic equipment is constantly evolving and is increasing t...
research
09/18/2019

Efficient Computation of Multi-Modal Public Transit Traffic Assignments using ULTRA

We study the problem of computing public transit traffic assignments in ...
research
08/24/2020

Multi-Modal End-User Programming of Web-Based Virtual Assistant Skills

While Alexa can perform over 100,000 skills on paper, its capability cov...
