LSTM-LM with Long-Term History for First-Pass Decoding in Conversational Speech Recognition

10/21/2020
by Xie Chen, et al.

LSTM language models (LSTM-LMs) have proven powerful and have yielded significant performance improvements over count-based n-gram LMs in modern speech recognition systems. Because of their unbounded history states and their computational cost, most previous studies apply LSTM-LMs only in the second pass, for rescoring. Recent work shows that it is feasible and computationally affordable to use LSTM-LMs in first-pass decoding within a dynamic (or tree-based) decoder framework. In this work, the LSTM-LM is composed with a WFST decoder on the fly for first-pass decoding. Furthermore, motivated by the long-term history that LSTM-LMs can model, the use of context beyond the current utterance is explored for first-pass decoding in conversational speech recognition. The context information is captured by the hidden states of the LSTM-LM, which are carried across utterances and can be used to guide the first-pass search effectively. Experimental results on our internal meeting transcription system show that incorporating the contextual information with LSTM-LMs in first-pass decoding yields significant performance improvements compared to applying the contextual information in second-pass rescoring.
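The idea of carrying LSTM-LM hidden states across utterances can be illustrated with a minimal sketch. It is not the authors' decoder: `TinyLSTMLM` and `score_conversation` are hypothetical names, the weights are random and untrained, and the word sequence of each earlier utterance is assumed to be already known (for example, taken from a 1-best hypothesis), which the abstract does not specify. The sketch only shows the difference between resetting the recurrent state at every utterance boundary and reusing the final state of the previous utterance as the initial state of the next one.

```python
import math
import random


def _sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class TinyLSTMLM:
    """Toy single-layer LSTM LM with random, untrained weights (illustration only)."""

    def __init__(self, vocab, hidden=8, seed=0):
        rng = random.Random(seed)
        self.vocab = list(vocab)
        self.index = {w: i for i, w in enumerate(self.vocab)}
        self.hidden = hidden
        n_in = len(self.vocab) + hidden  # one-hot word concatenated with h

        def mat(rows, cols):
            return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

        # One weight matrix per gate: input, forget, candidate cell, output.
        self.gates = {name: mat(hidden, n_in) for name in ("i", "f", "g", "o")}
        self.out = mat(len(self.vocab), hidden)

    def initial_state(self):
        # Zero (h, c): the state used when no earlier context is available.
        return [0.0] * self.hidden, [0.0] * self.hidden

    def step(self, word, state):
        """Consume one word and return the updated (h, c) state."""
        h, c = state
        x = [0.0] * len(self.vocab)
        x[self.index[word]] = 1.0
        z = x + h

        def gate(name, act):
            return [act(sum(w * v for w, v in zip(row, z))) for row in self.gates[name]]

        i, f, o = gate("i", _sigmoid), gate("f", _sigmoid), gate("o", _sigmoid)
        g = gate("g", math.tanh)
        c_new = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c, i, g)]
        h_new = [ok * math.tanh(ck) for ok, ck in zip(o, c_new)]
        return h_new, c_new

    def log_probs(self, state):
        """Log-softmax over the vocabulary given the current hidden state."""
        h, _ = state
        logits = [sum(w * v for w, v in zip(row, h)) for row in self.out]
        m = max(logits)
        lse = m + math.log(sum(math.exp(l - m) for l in logits))
        return {w: l - lse for w, l in zip(self.vocab, logits)}

    def score(self, words, state):
        """Return (log-probability of words, final state), starting from state."""
        total = 0.0
        for w in words:
            total += self.log_probs(state)[w]
            state = self.step(w, state)
        return total, state


def score_conversation(lm, utterances, carry_context=True):
    """Score each utterance, optionally reusing the previous utterance's final state."""
    state = lm.initial_state()
    scores = []
    for utt in utterances:
        if not carry_context:
            state = lm.initial_state()  # utterance-independent baseline
        s, state = lm.score(utt, state)
        scores.append(s)
    return scores


if __name__ == "__main__":
    lm = TinyLSTMLM(["hello", "how", "are", "you", "fine", "thanks"])
    conversation = [["hello", "how", "are", "you"], ["fine", "thanks"]]
    print(score_conversation(lm, conversation, carry_context=True))
    print(score_conversation(lm, conversation, carry_context=False))
```

In an actual first-pass decoder each partial hypothesis would hold its own LM state, and the carried-over state would only set the starting point of the search for the next utterance; the sketch leaves that search out entirely.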

research
06/29/2023

Leveraging Cross-Utterance Context For ASR Decoding

While external language models (LMs) are often incorporated into the dec...
research
11/18/2020

Context-aware RNNLM Rescoring for Conversational Speech Recognition

Conversational speech recognition is regarded as a challenging task due ...
research
07/01/2019

LSTM Language Models for LVCSR in First-Pass Decoding and Lattice-Rescoring

LSTM based language models are an important part of modern LVCSR systems...
research
04/02/2020

Full-Sum Decoding for Hybrid HMM based Speech Recognition using LSTM Language Model

In hybrid HMM based speech recognition, LSTM language models have been w...
research
10/03/2017

Decoding visemes: improving machine lipreading

To undertake machine lip-reading, we try to recognise speech from a visu...
research
01/27/2021

Transformer Based Deliberation for Two-Pass Speech Recognition

Interactive speech recognition systems must generate words quickly while...
research
06/16/2021

On the long-term learning ability of LSTM LMs

We inspect the long-term learning ability of Long Short-Term Memory lang...