Full-Sum Decoding for Hybrid HMM based Speech Recognition using LSTM Language Model

04/02/2020
by Wei Zhou et al.

In hybrid HMM-based speech recognition, LSTM language models have been widely applied and have yielded large improvements. Their theoretical ability to model unlimited context suggests that no recombination should be applied during decoding. This motivates reconsidering full summation over the HMM-state sequences, instead of the Viterbi approximation, in decoding. We explore the potential gain from more accurate probabilities in decision making and apply full-sum decoding within a modified prefix-tree search framework. The proposed full-sum decoding is evaluated on both the Switchboard and LibriSpeech corpora, using models trained with the cross-entropy (CE) and sMBR criteria. Additionally, both maximum a posteriori (MAP) and confusion network decoding, as approximate variants of the general Bayes decision rule, are evaluated. Consistent improvements over strong baselines are achieved in almost all cases without extra cost. We also discuss the tuning effort, efficiency, and some limitations of full-sum decoding.
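To make the contrast concrete, below is a minimal sketch of the scoring difference the abstract describes: Viterbi keeps only the best HMM-state sequence of a hypothesis (max over paths), while full-sum marginalizes over all state sequences (log-sum-exp over paths). This is an illustrative toy forward recursion only, not the authors' implementation; the function name, the uniform start over states, and the flat state topology are assumptions, and the paper's actual system applies full summation inside a modified prefix-tree search, which this sketch does not reproduce.

    import numpy as np
    from scipy.special import logsumexp

    def hypothesis_score(log_emit, log_trans, viterbi=True):
        """Score one word hypothesis over its HMM states (toy example).

        log_emit:  (T, S) array of log p(x_t | s) acoustic scores.
        log_trans: (S, S) array of log p(s' | s) transition scores.
        viterbi=True keeps only the best state sequence (max);
        viterbi=False sums over all state sequences (logsumexp),
        i.e. the full-sum score.
        """
        combine = np.max if viterbi else logsumexp
        alpha = log_emit[0].copy()  # assumed uniform start over states
        for frame in log_emit[1:]:
            # alpha'[s'] = combine_s(alpha[s] + log_trans[s, s']) + log p(x_t | s')
            alpha = combine(alpha[:, None] + log_trans, axis=0) + frame
        return combine(alpha)  # total log score of the hypothesis

Since log-sum-exp upper-bounds max, the full-sum score of a hypothesis is never lower than its Viterbi score; what matters for decoding is that the two can rank competing word sequences differently, which is where the reported improvements come from.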


