Robust Speech Recognition Using Generative Adversarial Networks

11/05/2017
by Anuroop Sriram, et al.

This paper describes a general, scalable, end-to-end framework that uses the generative adversarial network (GAN) objective to enable robust speech recognition. Encoders trained with the proposed approach gain improved invariance by learning to map noisy audio to the same embedding space as clean audio. Unlike previous methods, the new framework does not rely on the domain expertise or simplifying assumptions often needed in signal processing, and it directly encourages robustness in a data-driven way. We show that the new approach improves simulated far-field speech recognition of vanilla sequence-to-sequence models without specialized front-ends or preprocessing.
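The idea of pushing noisy-audio embeddings toward the clean-audio embedding space can be illustrated with a toy example. The sketch below is not the paper's implementation: it assumes the standard logistic GAN loss, a single tanh layer standing in for the sequence-to-sequence encoder, and a linear discriminator. All names (`encode`, `discriminate`, `discriminator_loss`, `encoder_adversarial_loss`) and all sizes are hypothetical and chosen only for illustration.

```python
import math
import random


def encode(frame, weights):
    """Toy encoder (assumption): one tanh layer mapping a feature frame to an embedding."""
    return [math.tanh(sum(f * w for f, w in zip(frame, row))) for row in weights]


def discriminate(embedding, v):
    """Toy discriminator (assumption): probability that an embedding came from clean audio."""
    logit = sum(z * w for z, w in zip(embedding, v))
    return 1.0 / (1.0 + math.exp(-logit))


def discriminator_loss(clean_embs, noisy_embs, v):
    """Standard GAN discriminator loss: label clean embeddings 1, noisy embeddings 0."""
    clean_term = sum(math.log(discriminate(z, v)) for z in clean_embs) / len(clean_embs)
    noisy_term = sum(math.log(1.0 - discriminate(z, v)) for z in noisy_embs) / len(noisy_embs)
    return -(clean_term + noisy_term)


def encoder_adversarial_loss(noisy_embs, v):
    """Encoder-side loss: make noisy embeddings look clean to the discriminator."""
    return -sum(math.log(discriminate(z, v)) for z in noisy_embs) / len(noisy_embs)


random.seed(0)
DIM_IN, DIM_EMB, BATCH = 6, 4, 5  # hypothetical sizes

# Paired data: each noisy frame is its clean frame plus additive noise.
clean = [[random.gauss(0, 1) for _ in range(DIM_IN)] for _ in range(BATCH)]
noisy = [[c + 0.3 * random.gauss(0, 1) for c in frame] for frame in clean]

weights = [[random.gauss(0, 0.1) for _ in range(DIM_IN)] for _ in range(DIM_EMB)]
v = [0.0] * DIM_EMB  # an untrained discriminator that outputs 0.5 for every input

clean_embs = [encode(frame, weights) for frame in clean]
noisy_embs = [encode(frame, weights) for frame in noisy]

d_loss = discriminator_loss(clean_embs, noisy_embs, v)
e_loss = encoder_adversarial_loss(noisy_embs, v)
print(d_loss, e_loss)
```

In a full system, an adversarial term of this kind would presumably be added to the recognition loss with some weighting, with the encoder and discriminator updated in alternation. The abstract gives neither detail, so both are assumptions here.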


research
11/13/2018

Modality Attention for End-to-End Audio-visual Speech Recognition

Audio-visual speech recognition (AVSR) system is thought to be one of th...
research
01/13/2022

The Effectiveness of Time Stretching for Enhancing Dysarthric Speech for Improved Dysarthric Speech Recognition

In this paper, we investigate several existing and a new state-of-the-ar...
research
11/27/2019

AIPNet: Generative Adversarial Pre-training of Accent-invariant Networks for End-to-end Speech Recognition

As one of the major sources in speech variability, accents have posed a ...
research
06/05/2019

Investigating the Lombard Effect Influence on End-to-End Audio-Visual Speech Recognition

Several audio-visual speech recognition models have been recently propos...
research
05/29/2020

Improving EEG based continuous speech recognition using GAN

In this paper we demonstrate that it is possible to generate more meanin...
research
06/16/2017

An online sequence-to-sequence model for noisy speech recognition

Generative models have long been the dominant approach for speech recogn...
research
04/02/2020

Temporarily-Aware Context Modelling using Generative Adversarial Networks for Speech Activity Detection

This paper presents a novel framework for Speech Activity Detection (SAD...
