Ron J. Weiss

research

∙ 10/19/2022

G-Augment: Searching for the Meta-Structure of Data Augmentation Policies for ASR

Data augmentation is a ubiquitous technique used to provide robustness t...

0 Gary Wang, et al. ∙

research

∙ 06/17/2021

WaveGrad 2: Iterative Refinement for Text-to-Speech Synthesis

This paper introduces WaveGrad 2, a non-autoregressive generative model ...

0 Nanxin Chen, et al. ∙

research

∙ 06/01/2021

Sparse, Efficient, and Semantic Mixture Invariant Training: Taming In-the-Wild Unsupervised Sound Separation

Supervised neural network training has led to significant progress on si...

0 Scott Wisdom, et al. ∙

research

∙ 11/06/2020

Wave-Tacotron: Spectrogram-free end-to-end text-to-speech synthesis

We describe a sequence-to-sequence neural network which can directly gen...

0 Ron J. Weiss, et al. ∙

research

∙ 10/27/2020

Multitask Training with Text Data for End-to-End Speech Recognition

We propose a multitask training method for attention-based end-to-end sp...

0 Peidong Wang, et al. ∙

research

∙ 09/02/2020

WaveGrad: Estimating Gradients for Waveform Generation

This paper introduces WaveGrad, a conditional model for waveform generat...

5 Nanxin Chen, et al. ∙

research

∙ 06/23/2020

Unsupervised Sound Separation Using Mixtures of Mixtures

In recent years, rapid progress has been made on the problem of single-c...

0 Scott Wisdom, et al. ∙

research

∙ 02/06/2020

Fully-hierarchical fine-grained prosody modeling for interpretable speech synthesis

This paper proposes a hierarchical, fine-grained and interpretable laten...

0 Guangzhi Sun, et al. ∙

research

∙ 02/06/2020

Generating diverse and natural text-to-speech samples using a quantized fine-grained VAE and auto-regressive prosody prior

Recent neural text-to-speech (TTS) models with fine-grained latent featu...

0 Guangzhi Sun, et al. ∙

research

∙ 07/09/2019

Learning to Speak Fluently in a Foreign Language: Multilingual Speech Synthesis and Cross-Language Voice Cloning

We present a multispeaker, multilingual text-to-speech (TTS) synthesis m...

0 Yu Zhang, et al. ∙

research

∙ 04/12/2019

Direct speech-to-speech translation with a sequence-to-sequence model

We present an attention-based sequence-to-sequence neural network which ...

0 Ye Jia, et al. ∙

research

∙ 04/08/2019

Parrotron: An End-to-End Speech-to-Speech Conversion Model and its Applications to Hearing-Impaired Speech and Speech Separation

We describe Parrotron, an end-to-end-trained speech-to-speech conversion...

0 Fadi Biadsy, et al. ∙

research

∙ 04/05/2019

LibriTTS: A Corpus Derived from LibriSpeech for Text-to-Speech

This paper introduces a new speech corpus called "LibriTTS" designed for...

0 Heiga Zen, et al. ∙

research

∙ 02/21/2019

Lingvo: a Modular and Scalable Framework for Sequence-to-Sequence Modeling

Lingvo is a Tensorflow framework offering a complete solution for collab...

13 Jonathan Shen, et al. ∙

research

∙ 02/19/2019

A spelling correction model for end-to-end speech recognition

Attention-based sequence-to-sequence models for speech recognition joint...

0 Jinxi Guo, et al. ∙

research

∙ 01/25/2019

Unsupervised speech representation learning using WaveNet autoencoders

We consider the task of unsupervised extraction of meaningful latent rep...

12 Jan Chorowski, et al. ∙

research

∙ 11/05/2018

Leveraging Weakly Supervised Data to Improve End-to-End Speech-to-Text Translation

End-to-end Speech Translation (ST) models have many potential advantages...

0 Ye Jia, et al. ∙

research

∙ 10/16/2018

Hierarchical Generative Modeling for Controllable Speech Synthesis

This paper proposes a neural end-to-end text-to-speech (TTS) model which...

0 Wei-Ning Hsu, et al. ∙

research

∙ 10/11/2018

VoiceFilter: Targeted Voice Separation by Speaker-Conditioned Spectrogram Masking

In this paper, we present a novel system that separates the voice of a t...

0 Quan Wang, et al. ∙

research

∙ 06/20/2018

Synthesizing Diverse, High-Quality Audio Textures

Texture synthesis techniques based on matching the Gram matrix of featur...

0 Joseph Antognini, et al. ∙

research

∙ 06/12/2018

Transfer Learning from Speaker Verification to Multispeaker Text-To-Speech Synthesis

We describe a neural network-based system for text-to-speech (TTS) synth...

0 Ye Jia, et al. ∙

research

∙ 03/24/2018

Towards End-to-End Prosody Transfer for Expressive Speech Synthesis with Tacotron

We present an extension to the Tacotron speech synthesis architecture th...

0 RJ Skerry-Ryan, et al. ∙

research

∙ 12/22/2017

On Using Backpropagation for Speech Texture Generation and Voice Conversion

Inspired by recent work on neural network image generation which rely on...

0 Jan Chorowski, et al. ∙

research

∙ 12/16/2017

Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions

This paper describes Tacotron 2, a neural network architecture for speec...

0 Jonathan Shen, et al. ∙

research

∙ 12/05/2017

State-of-the-art Speech Recognition With Sequence-to-Sequence Models

Attention-based encoder-decoder architectures such as Listen, Attend, an...

0 Chung-Cheng Chiu, et al. ∙

research

∙ 11/06/2017

Multilingual Speech Recognition With A Single End-To-End Model

Training a conventional automatic speech recognition (ASR) system to sup...

0 Shubham Toshniwal, et al. ∙

research

∙ 04/03/2017

Online and Linear-Time Attention by Enforcing Monotonic Alignments

Recurrent neural network models with an attention mechanism have proven ...

0 Colin Raffel, et al. ∙

research

∙ 03/29/2017

Tacotron: Towards End-to-End Speech Synthesis

A text-to-speech synthesis system typically consists of multiple stages,...

0 Yuxuan Wang, et al. ∙

research

∙ 03/24/2017

Sequence-to-Sequence Models Can Directly Translate Foreign Speech

We present a recurrent encoder-decoder deep neural network architecture ...

0 Ron J. Weiss, et al. ∙

research

∙ 09/29/2016

CNN Architectures for Large-Scale Audio Classification

Convolutional Neural Networks (CNNs) have proven very effective in image...

0 Shawn Hershey, et al. ∙

