Yuki Saito


Featured Co-authors

Hiroshi Saruwatari: 76 publications
Dong Yang: 63 publications
Shinnosuke Takamichi: 50 publications
Kenji Fukumizu: 45 publications
Shin-ichi Maeda: 23 publications
Nontawat Charoenphakdee: 19 publications
Daichi Kitamura: 19 publications
Kohei Hayashi: 19 publications
Takaaki Saeki: 18 publications
Norihiro Takamune: 17 publications
Masanari Kimura: 16 publications

research

∙ 06/21/2023

HumanDiffusion: diffusion model using perceptual gradients

We propose HumanDiffusion, a diffusion model trained from humans' percep...

0 Yota Ueda, et al. ∙

research

∙ 06/19/2023

Virtual Human Generative Model: Masked Modeling Approach for Learning Human Characteristics

Identifying the relationship between healthcare attributes, lifestyles, ...

0 Kenta Oono, et al. ∙

research

∙ 05/23/2023

ChatGPT-EDSS: Empathetic Dialogue Speech Synthesis Trained from ChatGPT-derived Context Word Embeddings

We propose ChatGPT-EDSS, an empathetic dialogue speech synthesis (EDSS) ...

0 Yuki Saito, et al. ∙

research

∙ 05/23/2023

CALLS: Japanese Empathetic Dialogue Speech Corpus of Complaint Handling and Attentive Listening in Customer Center

We present CALLS, a Japanese speech corpus that considers phone calls in...

0 Yuki Saito, et al. ∙

research

∙ 02/27/2023

Duration-aware pause insertion using pre-trained language model for multi-speaker text-to-speech

Pause insertion, also known as phrase break prediction and phrasing, is ...

0 Dong Yang, et al. ∙

research

∙ 10/18/2022

Mid-attribute speaker generation using optimal-transport-based interpolation of Gaussian mixture models

In this paper, we propose a method for intermediating multiple speakers'...

0 Aya Watanabe, et al. ∙

research

∙ 09/26/2022

Multi-Task Adversarial Training Algorithm for Multi-Speaker Neural Text-to-Speech

We propose a novel training algorithm for a multi-speaker neural text-to...

0 Yusuke Nakai, et al. ∙

research

∙ 06/21/2022

Human-in-the-loop Speaker Adaptation for DNN-based Multi-speaker TTS

This paper proposes a human-in-the-loop speaker-adaptation method for mu...

0 Kenta Udagawa, et al. ∙

research

∙ 06/16/2022

Acoustic Modeling for End-to-End Empathetic Dialogue Speech Synthesis Using Linguistic and Prosodic Contexts of Dialogue History

We propose an end-to-end empathetic dialogue speech synthesis (DSS) mode...

0 Yuto Nishimura, et al. ∙

research

∙ 03/28/2022

STUDIES: Corpus of Japanese Empathetic Dialogue Speech Towards Friendly Voice Agent

We present STUDIES, a new speech corpus for developing a voice agent tha...

0 Yuki Saito, et al. ∙

research

∙ 08/30/2021

SHIFT15M: Multiobjective Large-Scale Fashion Dataset with Distributional Shifts

Many machine learning algorithms assume that the training data and the t...

0 Masanari Kimura, et al. ∙

research

∙ 02/08/2021

HumanACGAN: conditional generative adversarial network with human-based auxiliary classifier and its evaluation in phoneme perception

We propose a conditional generative adversarial network (GAN) incorporat...

0 Yota Ueda, et al. ∙

research

∙ 02/17/2020

Lifter Training and Sub-band Modeling for Computationally Efficient and High-Quality Voice Conversion Using Spectral Differentials

In this paper, we propose computationally efficient and high-quality met...

0 Takaaki Saeki, et al. ∙

research

∙ 10/22/2019

Deep Set-to-Set Matching and Learning

Matching two sets of items, called set-to-set matching problem, is being...

0 Yuki Saito, et al. ∙

research

∙ 09/25/2019

HumanGAN: generative adversarial network with human-based discriminator and its evaluation in speech perception modeling

We propose the HumanGAN, a generative adversarial network (GAN) incorpor...

0 Kazuki Fujii, et al. ∙

research

∙ 08/17/2019

JVS corpus: free Japanese multi-speaker voice corpus

Thanks to improvements in machine learning techniques, including deep le...

0 Shinnosuke Takamichi, et al. ∙

research

∙ 08/05/2019

V2S attack: building DNN-based voice conversion from automatic speaker verification

This paper presents a new voice impersonation attack using voice convers...

0 Taiki Nakamura, et al. ∙

research

∙ 07/19/2019

DNN-based Speaker Embedding Using Subjective Inter-speaker Similarity for Multi-speaker Modeling in Speech Synthesis

This paper proposes novel algorithms for speaker embedding using subject...

4 Yuki Saito, et al. ∙

research

∙ 02/09/2019

Generative Moment Matching Network-based Random Modulation Post-filter for DNN-based Singing Voice Synthesis and Neural Double-tracking

This paper proposes a generative moment matching network (GMMN)-based po...

0 Hiroki Tamaru, et al. ∙

research

∙ 07/10/2018

Phase reconstruction from amplitude spectrograms based on von-Mises-distribution deep neural network

This paper presents a deep neural network (DNN)-based phase reconstructi...

0 Shinnosuke Takamichi, et al. ∙

research

∙ 09/23/2017

Statistical Parametric Speech Synthesis Incorporating Generative Adversarial Networks

A method for statistical parametric speech synthesis incorporating gener...

0 Yuki Saito, et al. ∙

research

∙ 04/10/2017

Voice Conversion Using Sequence-to-Sequence Learning of Context Posterior Probabilities

Voice conversion (VC) using sequence-to-sequence learning of context pos...

0 Hiroyuki Miyoshi, et al. ∙
