Based on principal-agent theory and optimal contract theory, companies u...
Blended learning is generally defined as the combination of traditional
...
Cross-modal retrieval (CMR) has been extensively applied in various doma...
Most existing sandstorm image enhancement methods are based on tradition...
This paper integrates graph-to-sequence into an end-to-end text-to-speec...
Generating realistic talking faces is a complex and widely discussed tas...
The rise of the phenomenon of the "right to be forgotten" has prompted
r...
Voice conversion is a method that allows for the transformation of speak...
Music Emotion Recognition involves the automatic identification of emoti...
This survey paper provides a comprehensive overview of the recent
advanc...
Voice conversion, as the style transfer task applied to speech, refers to...
There has been significant progress in emotional Text-To-Speech (TTS)
sy...
In recent Text-to-Speech (TTS) systems, a neural vocoder often generates...
By predicting all the target tokens in parallel, the
non-autoreg...
Recent expressive text to speech (TTS) models focus on synthesizing emot...
Music genre classification has been widely studied in the past few years for...
The recent emergence of the joint CTC-Attention model shows significant
impr...
Recent advances in pre-trained language models have improved the perform...
Most previous neural text-to-speech (TTS) methods are mainly based on
su...
Metaverse expands the physical world to a new dimension, and the physica...
Recovering the masked speech frames is widely applied in speech
represen...
In this paper, we propose Adapitch, a multi-speaker TTS method that mak...
Since the beginning of the COVID-19 pandemic, remote conferencing and
sc...
Nonparallel multi-domain voice conversion methods such as the StarGAN-VC...
Non-parallel many-to-many voice conversion remains an interesting but
ch...
Time-domain Transformer neural networks have proven their superiority in...
Singing voice synthesis is a generative task that involves multi-dimensi...
Recently, synthesizing personalized speech by text-to-speech (TTS)
appli...
Metaverse has stretched the real world into unlimited space. There will ...
Metaverse is an interactive world that combines reality and virtuality, ...
Any-to-any voice conversion problem aims to convert voices for source an...
Multi-speaker text-to-speech (TTS) using a small amount of adaptation data is a challe...
Voice Conversion (VC) refers to changing the timbre of speech while
ret...
In this paper, we study the issue of automatic singer identification (SI...
Models for music artist classification have usually operated in the freq...
Singing voice detection is the task of identifying the frames that contain...