Neural-FST Class Language Model for End-to-End Speech Recognition

01/28/2022
by   Antoine Bruguier, et al.

We propose the Neural-FST Class Language Model (NFCLM) for end-to-end speech recognition, a novel method that combines neural network language models (NNLMs) and finite state transducers (FSTs) in a mathematically consistent framework. Our method uses a background NNLM, which models generic background text, together with a collection of domain-specific entities modeled as individual FSTs. Each output token is generated by a mixture of these components; the mixture weights are estimated with a separately trained neural decider. We show that NFCLM significantly outperforms NNLM by 15.8% relative in terms of Word Error Rate. NFCLM achieves performance similar to traditional NNLM and FST shallow fusion while being less prone to overbiasing and 12 times more compact, making it more suitable for on-device usage.
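
To make the mixture idea concrete, the following is a minimal sketch rather than the authors' implementation. It assumes each component (the background NNLM and each entity FST) exposes a next-token distribution as a plain dictionary, and that the decider emits one unnormalized score per component, turned into weights with a softmax. The names `mixture_token_probs`, `background`, and `contacts_fst` are hypothetical, and FST state tracking across tokens is left out.

```python
import math


def softmax(scores):
    """Turn unnormalized scores into weights that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def mixture_token_probs(component_probs, decider_scores):
    """Mix per-component next-token distributions into one distribution.

    component_probs: list of dicts (token -> probability), one per component.
    decider_scores: list of unnormalized scores, one per component
                    (assumed here to come from a separately trained decider).
    """
    weights = softmax(decider_scores)
    vocab = set()
    for dist in component_probs:
        vocab.update(dist)
    # P(token) = sum_k weight_k * P_k(token); tokens a component
    # does not know get probability 0 from that component.
    return {
        tok: sum(w * dist.get(tok, 0.0) for w, dist in zip(weights, component_probs))
        for tok in vocab
    }


# Toy distributions (made-up numbers, for illustration only).
background = {"play": 0.5, "call": 0.3, "the": 0.2}   # stands in for the NNLM
contacts_fst = {"alice": 0.6, "bob": 0.4}             # stands in for one entity FST

mixed = mixture_token_probs([background, contacts_fst], [0.0, 0.0])
```

Because the weights sum to one and each component is itself a normalized distribution, the mixed result is also a normalized distribution, which is one way to read the abstract's "mathematically consistent" claim.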
