DeepAI AI Chat
Log In Sign Up

Adversarially Trained Multi-Singer Sequence-To-Sequence Singing Synthesizer

by   Jie Wu, et al.

This paper presents a high quality singing synthesizer that is able to model a voice with limited available recordings. Based on the sequence-to-sequence singing model, we design a multi-singer framework to leverage all the existing singing data of different singers. To attenuate the issue of musical score unbalance among singers, we incorporate an adversarial task of singer classification to make encoder output less singer dependent. Furthermore, we apply multiple random window discriminators (MRWDs) on the generated acoustic features to make the network be a GAN. Both objective and subjective evaluations indicate that the proposed synthesizer can generate higher quality singing voice than baseline (4.12 vs 3.53 in MOS). Especially, the articulation of high-pitched vowels is significantly enhanced.


Sequence-to-sequence Singing Voice Synthesis with Perceptual Entropy Loss

The neural network (NN) based singing voice synthesis (SVS) systems requ...

Target-Speaker Voice Activity Detection via Sequence-to-Sequence Prediction

Target-speaker voice activity detection is currently a promising approac...

Neural Melody Composition from Lyrics

In this paper, we study a novel task that learns to compose music from n...

Voice Reconstruction from Silent Speech with a Sequence-to-Sequence Model

Silent Speech Decoding (SSD) based on Surface electromyography (sEMG) ha...

Improving Sequence-to-Sequence Acoustic Modeling by Adding Text-Supervision

This paper presents methods of making using of text supervision to impro...

Talking Drums: Generating drum grooves with neural networks

Presented is a method of generating a full drum kit part for a provided ...

Singing voice synthesis based on frame-level sequence-to-sequence models considering vocal timing deviation

This paper proposes singing voice synthesis (SVS) based on frame-level s...