We propose HumanDiffusion, a diffusion model trained from humans' percep...
Identifying the relationship between healthcare attributes, lifestyles, ...
We propose ChatGPT-EDSS, an empathetic dialogue speech synthesis (EDSS) ...
We present CALLS, a Japanese speech corpus that considers phone calls in...
Pause insertion, also known as phrase break prediction and phrasing, is ...
In this paper, we propose a method for intermediating multiple speakers'...
We propose a novel training algorithm for a multi-speaker neural text-to...
This paper proposes a human-in-the-loop speaker-adaptation method for mu...
We propose an end-to-end empathetic dialogue speech synthesis (DSS) mode...
We present STUDIES, a new speech corpus for developing a voice agent tha...
Many machine learning algorithms assume that the training data and the t...
We propose a conditional generative adversarial network (GAN) incorporat...
In this paper, we propose computationally efficient and high-quality met...
Matching two sets of items, called the set-to-set matching problem, is being...
We propose the HumanGAN, a generative adversarial network (GAN) incorpor...
Thanks to improvements in machine learning techniques, including deep le...
This paper presents a new voice impersonation attack using voice convers...
This paper proposes novel algorithms for speaker embedding using subject...
This paper proposes a generative moment matching network (GMMN)-based po...
This paper presents a deep neural network (DNN)-based phase reconstructi...
A method for statistical parametric speech synthesis incorporating gener...
Voice conversion (VC) using sequence-to-sequence learning of context pos...