Deep Learning for Distant Speech Recognition

12/17/2017
by Mirco Ravanelli, et al.

Deep learning is an emerging technology that is considered one of the most promising directions for reaching higher levels of artificial intelligence. Among other achievements, building computers that understand speech represents a crucial leap towards intelligent machines. Despite the great efforts of the past decades, however, natural and robust human-machine speech interaction still appears to be out of reach, especially when users interact with a distant microphone in noisy and reverberant environments. Noise and reverberation severely hamper the intelligibility of a speech signal, making Distant Speech Recognition (DSR) one of the major open challenges in the field. This thesis addresses this scenario and proposes novel techniques, architectures, and algorithms to improve the robustness of distant-talking acoustic models. We first elaborate on methodologies for realistic data contamination, with a particular emphasis on DNN training with simulated data. We then investigate approaches for better exploiting speech contexts, proposing original methodologies for both feed-forward and recurrent neural networks. Lastly, inspired by the idea that cooperation across different DNNs could be the key to counteracting the harmful effects of noise and reverberation, we propose a novel deep learning paradigm called network of deep neural networks. The analysis of the original concepts was based on extensive experimental validations conducted on both real and simulated data, considering different corpora, microphone configurations, environments, noise conditions, and ASR tasks.

research
05/26/2018

Automatic context window composition for distant speech recognition

Distant speech recognition is being revolutionized by deep learning, tha...
research
11/26/2017

Realistic multi-microphone data simulation for distant speech recognition

The availability of realistic simulated corpora is of key importance for...
research
10/10/2017

Contaminated speech training methods for robust DNN-HMM distant speech recognition

Despite the significant progress made in the last years, state-of-the-ar...
research
10/15/2019

Analyzing Large Receptive Field Convolutional Networks for Distant Speech Recognition

Despite significant efforts over the last few years to build a robust au...
research
03/23/2017

A network of deep neural networks for distant speech recognition

Despite the remarkable progress recently made in distant speech recognit...
research
04/06/2021

Learning to Rank Microphones for Distant Speech Recognition

Fully exploiting ad-hoc microphone networks for distant speech recogniti...
research
03/26/2018

Light Gated Recurrent Units for Speech Recognition

A field that has directly benefited from the recent advances in deep lea...
