AGAIN-VC: A One-shot Voice Conversion using Activation Guidance and Adaptive Instance Normalization

10/31/2020
by   Yen-Hao Chen, et al.

Voice conversion (VC) has been widely studied in recent years. Many VC systems use disentanglement-based learning to separate the speaker information from the linguistic content of a speech signal, then convert the voice by replacing the speaker information with that of the target speaker. To prevent speaker information from leaking into the content embeddings, previous works impose a strong information bottleneck by either reducing the dimension of the content embedding or quantizing it. These mechanisms tend to hurt synthesis quality. In this work, we propose AGAIN-VC, an innovative VC system using Activation Guidance and Adaptive Instance Normalization. AGAIN-VC is an auto-encoder-based model consisting of a single encoder and a decoder. With a proper activation acting as an information bottleneck on the content embeddings, the trade-off between the synthesis quality and the speaker similarity of the converted speech improves drastically. This one-shot VC system obtains the best performance in both subjective and objective evaluations.
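The two ingredients named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the `[channels][time]` list layout, the `EPS` constant, and the choice of a sigmoid as the bottleneck activation are all assumptions made for illustration (the abstract only says "a proper activation"). The sketch shows instance normalization removing per-channel statistics from the content features, an activation squashing the result, and adaptive instance normalization re-imposing the target speaker's per-channel mean and standard deviation.

```python
import math
from statistics import fmean, pstdev

EPS = 1e-5  # assumed small constant to avoid division by zero


def channel_stats(features):
    """Per-channel (mean, std) over time for a [channels][time] nested list."""
    return [(fmean(ch), pstdev(ch)) for ch in features]


def instance_norm(features):
    """Normalize each channel to roughly zero mean and unit variance over time."""
    out = []
    for ch in features:
        mu, sigma = fmean(ch), pstdev(ch)
        out.append([(x - mu) / (sigma + EPS) for x in ch])
    return out


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def content_embedding(features):
    """Instance-normalize, then apply the (assumed) sigmoid bottleneck activation."""
    return [[sigmoid(x) for x in ch] for ch in instance_norm(features)]


def adain(content, target_stats):
    """Adaptive instance normalization: re-normalize the content features,
    then scale and shift each channel with the target speaker's statistics."""
    normed = instance_norm(content)
    return [
        [x * sigma + mu for x in ch]
        for ch, (mu, sigma) in zip(normed, target_stats)
    ]


# Toy features standing in for encoder activations (hypothetical values).
source = [[0.0, 1.0, 2.0, 3.0], [10.0, 10.5, 11.0, 12.0]]
target = [[5.0, 7.0, 9.0, 11.0], [-1.0, 0.0, 1.0, 2.0]]

converted = adain(content_embedding(source), channel_stats(target))
```

After this step each channel of `converted` carries approximately the target's mean and standard deviation, while its temporal pattern still comes from the source. In a real system these operations would act on learned encoder and decoder activations rather than on raw toy lists.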

research
06/07/2020

VQVC+: One-Shot Voice Conversion by Vector Quantization and U-Net architecture

Voice conversion (VC) is a task that transforms the source speaker's tim...
research
03/16/2023

TriAAN-VC: Triple Adaptive Attention Normalization for Any-to-Any Voice Conversion

Voice Conversion (VC) must be achieved while maintaining the content of ...
research
02/21/2022

AVQVC: One-shot Voice Conversion by Vector Quantization with applying contrastive learning

Voice Conversion (VC) refers to changing the timbre of a speech while ret...
research
06/15/2022

End-to-End Voice Conversion with Information Perturbation

The ideal goal of voice conversion is to convert the source speaker's sp...
research
02/16/2020

Speech-to-Singing Conversion in an Encoder-Decoder Framework

In this paper our goal is to convert a set of spoken lines into sung one...
research
06/28/2023

UnitSpeech: Speaker-adaptive Speech Synthesis with Untranscribed Data

We propose UnitSpeech, a speaker-adaptive speech synthesis method that f...
research
10/31/2022

VoicePrivacy 2022 System Description: Speaker Anonymization with Feature-matched F0 Trajectories

We introduce a novel method to improve the performance of the VoicePriva...
