A Comparison of Adaptation Techniques and Recurrent Neural Network Architectures

07/12/2018
by Jan Vanek, et al.

Recently, recurrent neural networks (RNNs) have become the state of the art in acoustic modeling for automatic speech recognition, with long short-term memory (LSTM) units being the most popular. However, alternative units such as the gated recurrent unit (GRU) and its modifications have outperformed LSTM in some publications. In this paper, we compare five neural network (NN) architectures combined with various adaptation and feature normalization techniques. We evaluated feature-space maximum likelihood linear regression (fMLLR), five variants of i-vector adaptation, and two variants of cepstral mean normalization (CMN). Most adaptation and normalization techniques were developed for feed-forward NNs and, according to the results in this paper, not all of them also work with RNNs. For the experiments, we chose the well-known and freely available TIMIT phone recognition task; phone recognition is much more sensitive to the quality of the acoustic model (AM) than a large-vocabulary task with a complex language model. We also publish open-source scripts that make it easy to replicate the results and to continue the development.
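
The abstract does not spell out the architectures beyond naming LSTM and GRU cells. As a minimal illustrative sketch (not the paper's actual models), the PyTorch snippet below shows how an LSTM-based and a GRU-based acoustic model can differ only in the choice of recurrent cell; all names and hyperparameters here are assumptions for illustration.

```python
import torch
import torch.nn as nn

class RecurrentAM(nn.Module):
    """Minimal recurrent acoustic model: stacked bidirectional RNN
    followed by a per-frame classifier over phone classes."""

    def __init__(self, num_feats=40, hidden=320, num_phones=48, cell="lstm"):
        # num_phones=48 corresponds to the usual TIMIT training phone set;
        # the other values are illustrative assumptions.
        super().__init__()
        rnn_cls = {"lstm": nn.LSTM, "gru": nn.GRU}[cell]
        self.rnn = rnn_cls(num_feats, hidden, num_layers=2,
                           batch_first=True, bidirectional=True)
        self.out = nn.Linear(2 * hidden, num_phones)

    def forward(self, x):          # x: (batch, frames, num_feats)
        h, _ = self.rnn(x)         # h: (batch, frames, 2 * hidden)
        return self.out(h)         # per-frame phone logits

lstm_am = RecurrentAM(cell="lstm")
gru_am = RecurrentAM(cell="gru")
```

Likewise, the abstract does not say which two CMN variants were compared; per-utterance and per-speaker normalization are the two most common choices, and the NumPy sketch below assumes that pair.

```python
import numpy as np

def cmn_per_utterance(feats):
    """Per-utterance CMN: subtract each coefficient's mean over the utterance.
    feats: (num_frames, num_coeffs) array of cepstral features (e.g. MFCCs)."""
    return feats - feats.mean(axis=0, keepdims=True)

def cmn_per_speaker(utterances):
    """Per-speaker CMN: subtract a single mean computed over all of one
    speaker's utterances, preserving per-utterance offsets from that mean."""
    mean = np.concatenate(utterances, axis=0).mean(axis=0, keepdims=True)
    return [u - mean for u in utterances]
```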

