Deep Learning for Environmentally Robust Speech Recognition: An Overview of Recent Developments

05/30/2017
by   Zixing Zhang, et al.

Eliminating the negative effect of non-stationary environmental noise is a long-standing research topic in automatic speech recognition and remains an important challenge. Data-driven supervised approaches, especially those based on deep neural networks, have recently emerged as potential alternatives to traditional unsupervised approaches and, with sufficient training, can alleviate the shortcomings of the unsupervised methods in various real-life acoustic environments. In this light, we review recently developed, representative deep learning approaches for tackling non-stationary additive and convolutional degradation of speech, with the aim of providing guidelines for those involved in the development of environmentally robust speech recognition systems. We separately discuss single- and multi-channel techniques developed for the front-end and back-end of speech recognition systems, as well as joint front-end and back-end training frameworks.

research
04/30/2018

Investigations on End-to-End Audiovisual Fusion

Audiovisual speech recognition (AVSR) is a method to alleviate the adver...
research
06/18/2019

Deep Xi as a Front-End for Robust Automatic Speech Recognition

Front-end techniques for robust automatic speech recognition (ASR) have ...
research
09/13/2017

Analyzing Hidden Representations in End-to-End Automatic Speech Recognition Systems

Neural models have become ubiquitous in automatic speech recognition sys...
research
02/19/2021

End-to-End Neural Systems for Automatic Children Speech Recognition: An Empirical Study

A key desiderata for inclusive and accessible speech recognition technol...
research
12/17/2014

Deep Speech: Scaling up end-to-end speech recognition

We present a state-of-the-art speech recognition system developed using ...
research
09/04/2023

SememeASR: Boosting Performance of End-to-End Speech Recognition against Domain and Long-Tailed Data Shift with Sememe Semantic Knowledge

Recently, excellent progress has been made in speech recognition. Howeve...
research
04/02/2019

End-to-End Visual Speech Recognition for Small-Scale Datasets

Traditional visual speech recognition systems consist of two stages, fea...
