The inverse short-time Fourier transform network (iSTFTNet) has garnered...
In speech synthesis, a generative adversarial network (GAN), training a ...
Voice conversion is a task to convert a non-linguistic feature of a give...
This paper proposes a new voice conversion (VC) task from human speech t...
In recent text-to-speech synthesis and voice conversion systems, a mel-s...
This paper proposes a new source model and training scheme to improve th...
Preserving the linguistic content of input speech is essential during vo...
This paper proposes a non-autoregressive extension of our previously pro...
This paper shows that StarGAN-VC, a spectral envelope transformation met...
Non-parallel voice conversion (VC) is a technique for training voice con...
Non-parallel voice conversion (VC) is a technique for learning mappings ...
In this paper, we propose a non-parallel any-to-many voice conversion (V...
Deep neural networks (DNNs) have achieved substantial predictive perform...
We have previously proposed a method that allows for non-parallel voice ...
Sequence-to-sequence (seq2seq) voice conversion (VC) models are attracti...
This paper proposes a voice conversion (VC) method based on a sequence-t...
We introduce a novel sequence-to-sequence (seq2seq) voice conversion (VC...
Automatic speaker verification (ASV) is one of the most natural and conv...
Non-parallel multi-domain voice conversion (VC) is a technique for learn...
Non-parallel voice conversion (VC) is a technique for learning the mappi...
Humans are able to imagine a person's voice from the person's appearance...
WaveCycleGAN has recently been proposed to bridge the gap between natura...
Recently, we proposed short-time Fourier transform (STFT)-based loss fun...
This paper proposes an alternative algorithm for multichannel variationa...
This paper describes a method based on a sequence-to-sequence learning (...
This paper proposes a voice conversion method based on fully convolution...
This paper deals with a multichannel audio source separation problem und...
We propose a learning-based filter that allows us to directly modify a s...
This paper proposes a non-parallel many-to-many voice conversion (VC) me...
This paper proposes a multichannel source separation method called the m...
This paper proposes a method that allows for non-parallel many-to-many v...
In this paper, we address the problem of reconstructing a time-domain si...
This paper proposes a method for generating speech from filterbank mel f...
We propose a parallel-data-free voice conversion (VC) method that can le...
This paper provides a generic framework of component analysis (CA) metho...