The IQIYI System for Voice Conversion Challenge 2020

10/29/2020
by   Wendong Gan, et al.

This paper presents the IQIYI voice conversion system (T24) for the Voice Conversion Challenge 2020. In the competition, each target speaker has 70 sentences. We built an end-to-end voice conversion system based on phonetic posteriorgrams (PPG). First, an ASR acoustic model computes bottleneck (BN) features, which represent the content-related information in the speech. An improved prosody Tacotron model then predicts the Mel spectrum from these features. Finally, an improved LPCNet converts the Mel spectrum to a waveform. The evaluation results show that the system achieves good voice conversion quality: although it uses audio sampled at 16 kHz rather than 24 kHz, the converted speech is relatively good in both naturalness and similarity. Our best results are in the similarity evaluation of Task 2, where the system ranked 2nd in the ASV-based objective evaluation and 5th in the subjective evaluation.
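The three-stage structure described in the abstract can be sketched as a simple composition of functions. This is only an illustrative sketch, not the authors' implementation: `convert_voice`, the `toy_*` stages, and the hop size `HOP` are hypothetical names and values introduced here, and the toy stages are trivial stand-ins for the ASR acoustic model, the prosody Tacotron model, and LPCNet.

```python
from typing import Callable, List, Sequence

Frames = List[List[float]]  # one feature vector per frame

HOP = 160  # hypothetical hop size (10 ms at 16 kHz), an assumption for this sketch


def convert_voice(
    source_wav: Sequence[float],
    extract_bn: Callable[[Sequence[float]], Frames],
    predict_mel: Callable[[Frames], Frames],
    vocode: Callable[[Frames], List[float]],
) -> List[float]:
    """Chain the three stages: waveform -> BN features -> Mel -> waveform."""
    bn = extract_bn(source_wav)   # content-related features from the ASR model
    mel = predict_mel(bn)         # target-speaker Mel spectrum
    return vocode(mel)            # waveform synthesis


def toy_extract_bn(wav: Sequence[float]) -> Frames:
    # Stand-in for the ASR acoustic model: one 1-dim "feature" per full frame.
    n_frames = len(wav) // HOP
    return [
        [sum(abs(s) for s in wav[i * HOP:(i + 1) * HOP]) / HOP]
        for i in range(n_frames)
    ]


def toy_predict_mel(bn: Frames) -> Frames:
    # Stand-in for the acoustic model: a fixed scaling per frame.
    return [[2.0 * v for v in frame] for frame in bn]


def toy_vocode(mel: Frames) -> List[float]:
    # Stand-in for the vocoder: hold each frame value for HOP samples.
    out: List[float] = []
    for frame in mel:
        out.extend([frame[0]] * HOP)
    return out


source = [0.5] * (HOP * 3 + 7)  # three full frames plus a partial one
converted = convert_voice(source, toy_extract_bn, toy_predict_mel, toy_vocode)
print(len(converted))
```

Passing the stages in as callables keeps the sketch independent of any particular model; in a real system each stand-in would be replaced by a trained network with its own feature dimensions and frame rate.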
