Large-Scale Mixed-Bandwidth Deep Neural Network Acoustic Modeling for Automatic Speech Recognition

07/10/2019
by Khoi-Nguyen C. Mac, et al.

In automatic speech recognition (ASR), wideband (WB) and narrowband (NB) speech signals, which have different sampling rates, are typically handled by separate acoustic models. Mixed-bandwidth (MB) acoustic modeling therefore has important practical value for ASR system deployment. In this paper, we extensively investigate large-scale MB deep neural network acoustic modeling for ASR using 1,150 hours of WB data and 2,300 hours of NB data. We study various MB strategies, including downsampling, upsampling, and bandwidth extension, and evaluate their performance on 8 diverse WB and NB test sets from various application domains. To handle the large amount of training data, distributed training is carried out on multiple GPUs using synchronous data parallelism.
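
The abstract does not describe how the synchronous data-parallel training is implemented, so the following is only a toy sketch of the general idea, not the authors' setup. It assumes a hypothetical 1-D linear model and simulates the workers with a plain Python loop. Every name in it (`shard_gradient`, `all_reduce_mean`, `sync_sgd_step`, `train`) is illustrative. In a real system each worker would be a GPU and the averaging would be a collective all-reduce.

```python
# Toy sketch of synchronous data-parallel SGD for a 1-D linear model y = w * x.
# All names are hypothetical; real systems run the workers on separate GPUs
# and replace the Python loop with a collective all-reduce.


def shard_gradient(w, shard):
    """Mean gradient of 0.5 * (w*x - y)^2 over one worker's data shard."""
    return sum((w * x - y) * x for x, y in shard) / len(shard)


def all_reduce_mean(values):
    """Stand-in for an all-reduce: every worker would receive this average."""
    return sum(values) / len(values)


def sync_sgd_step(w, shards, lr):
    """One synchronous step: all workers report before the shared update."""
    grads = [shard_gradient(w, shard) for shard in shards]  # one per worker
    return w - lr * all_reduce_mean(grads)


def train(data, num_workers=4, lr=0.05, steps=200):
    """Split the data into one shard per worker and run synchronous SGD."""
    shards = [data[i::num_workers] for i in range(num_workers)]
    w = 0.0
    for _ in range(steps):
        w = sync_sgd_step(w, shards, lr)
    return w


if __name__ == "__main__":
    # Synthetic data generated from y = 3x, so w should approach 3.
    data = [(float(x), 3.0 * x) for x in range(1, 9)]
    print(train(data))
```

Because the shards here are equally sized, the averaged gradient equals the full-batch gradient, so the synchronous update matches what a single worker would compute on all the data.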

research
02/24/2020

Distributed Training of Deep Neural Network Acoustic Models for Automatic Speech Recognition

The past decade has witnessed great progress in Automatic Speech Recogni...
research
12/12/2019

On Neural Phone Recognition of Mixed-Source ECoG Signals

The emerging field of neural speech recognition (NSR) using electrocorti...
research
03/25/2022

Impact of Dataset on Acoustic Models for Automatic Speech Recognition

In Automatic Speech Recognition, GMM-HMM had been widely used for acoust...
research
05/25/2022

An Investigation on Applying Acoustic Feature Conversion to ASR of Adult and Child Speech

The performance of child speech recognition is generally less satisfacto...
research
09/05/2019

Bandwidth Embeddings for Mixed-bandwidth Speech Recognition

In this paper, we tackle the problem of handling narrowband and wideband...
research
12/11/2019

SpecAugment on Large Scale Datasets

Recently, SpecAugment, an augmentation scheme for automatic speech recog...
research
11/30/2022

Preliminary Study on SSCF-derived Polar Coordinate for ASR

The transition angles are defined to describe the vowel-to-vowel transit...
