Learning discriminative features in sequence training without requiring framewise labelled data

05/16/2019
by Jun Wang, et al.

In this work, we try to answer two questions: Can deeply learned features with discriminative power improve an ASR system's robustness to acoustic variability? And how can such features be learned without framewise labelled sequence training data? Because existing methods usually require knowing where the labels occur in the input sequence, their applicability to many real-world sequence learning tasks has so far been limited. We propose a novel method that models sequence discriminative training and feature discriminative learning simultaneously within a single network architecture, so that discriminative deep features can be learned during sequence training without presegmented training data. Our experiment on a realistic industrial ASR task shows that, without any specific fine-tuning or additional complexity, the proposed models consistently outperform state-of-the-art models and significantly reduce Word Error Rate (WER) under all test conditions. The largest improvements occur under unseen noise conditions (a relative 12.94%), suggesting that the proposed models can generalize better to acoustic variability.
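The abstract reports its gains as relative WER reductions. As a minimal sketch of how such a figure is usually computed, the snippet below uses the standard definition, (baseline WER - new WER) / baseline WER, expressed in percent. The function name and the WER values are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
def relative_wer_reduction(baseline_wer: float, new_wer: float) -> float:
    """Return the relative WER reduction in percent.

    Both arguments are absolute WERs in the same unit (e.g. percent).
    """
    if baseline_wer <= 0:
        raise ValueError("baseline_wer must be positive")
    return 100.0 * (baseline_wer - new_wer) / baseline_wer


# Hypothetical WER values for illustration only (not results from the paper).
baseline = 20.0   # absolute WER of a baseline system, in percent
improved = 17.5   # absolute WER of an improved system, in percent

print(f"{relative_wer_reduction(baseline, improved):.2f}% relative reduction")
# prints "12.50% relative reduction"
```

Note that a relative reduction is not the same as the absolute difference: in the example, the absolute WER drops by 2.5 points while the relative reduction is 12.5%.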
