A Study of Enhancement, Augmentation, and Autoencoder Methods for Domain Adaptation in Distant Speech Recognition

06/13/2018
by Hao Tang, et al.

Speech recognizers trained on close-talking speech do not generalize to distant speech, and the word error rate degradation can be as large as 40% absolute. Most studies tackle distant speech recognition as a separate problem, devoting little effort to adapting close-talking speech recognizers to distant speech. In this work, we review several approaches from a domain adaptation perspective. These approaches, including speech enhancement, multi-condition training, data augmentation, and autoencoders, all involve a transformation of the data between domains. We conduct experiments on the AMI data set, where these approaches can be realized under the same controlled setting. The approaches lead to different amounts of improvement under their respective assumptions. The purpose of this paper is to quantify and characterize the performance gap between the two domains, setting up the basis for studying the adaptation of speech recognizers from close-talking speech to distant speech. Our results also have implications for improving distant speech recognition.
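
To make the "absolute" degradation figure concrete, the following is a minimal sketch of how word error rate (WER) and an absolute WER gap between two conditions could be computed. The function names and the toy sentences are illustrative assumptions, not taken from the paper, and this is not the scoring tool used in the experiments.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j]: edit distance between the reference prefix seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            curr.append(min(
                prev[j] + 1,             # deletion (reference word missing)
                curr[j - 1] + 1,         # insertion (extra hypothesis word)
                prev[j - 1] + (r != h),  # substitution, or match at no cost
            ))
        prev = curr
    return prev[-1] / len(ref)


def absolute_degradation(wer_source: float, wer_target: float) -> float:
    """WER gap in percentage points (absolute, not relative)."""
    return 100.0 * (wer_target - wer_source)


# Hypothetical toy transcripts, invented for illustration only.
reference = "the meeting starts at noon"
close_talking_output = "the meeting starts at noon"
distant_output = "the meeting start at"

wer_close = word_error_rate(reference, close_talking_output)
wer_distant = word_error_rate(reference, distant_output)
gap = absolute_degradation(wer_close, wer_distant)
```

In this toy case the distant hypothesis has one substitution and one deletion against five reference words, so the gap is measured in percentage points of WER rather than as a ratio of the two error rates.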


