WeSinger: Data-augmented Singing Voice Synthesis with Auxiliary Losses

03/21/2022
by Zewang Zhang, et al.

In this paper, we develop a new multi-singer Chinese neural singing voice synthesis (SVS) system named WeSinger. To improve the accuracy and naturalness of the synthesized singing voice, we design several specific modules and techniques: 1) a deep bi-directional LSTM-based duration model with a multi-scale rhythm loss and a post-processing step; 2) a Transformer-like acoustic model with a progressive pitch-weighted decoder loss; 3) a 24 kHz pitch-aware LPCNet neural vocoder to produce high-quality singing waveforms; 4) a novel data augmentation method with multi-singer pre-training for stronger robustness and naturalness. To our knowledge, WeSinger is the first SVS system to adopt 24 kHz LPCNet and multi-singer pre-training simultaneously. Both quantitative and qualitative evaluation results demonstrate the effectiveness of WeSinger in terms of accuracy and naturalness, and WeSinger achieves state-of-the-art performance on the recently released public Chinese singing corpus Opencpop. Some synthesized singing samples are available online.


