Phrase-Level Class based Language Model for Mandarin Smart Speaker Query Recognition

09/02/2019
by Yiheng Huang, et al.

The success of speech assistants requires precise recognition of a number of entities in particular contexts. A common solution is to train a class-based n-gram language model and then expand the classes into specific words or phrases. However, when a class has a huge list, e.g., more than 20 million songs, a full expansion causes memory explosion. Worse still, the list items in a class need to be updated frequently, which requires a dynamic model-updating technique. In this work, we propose to train pruned language models for the word classes to replace the slots in the root n-gram. We further propose a novel technique, named Difference Language Model (DLM), to correct the bias introduced by the pruned language models. Once the decoding graph is built, we only need to recalculate the DLM when the entities in the word classes are updated. Results show that the proposed method consistently and significantly outperforms the conventional approaches on all datasets, especially for large lists, which the conventional approaches cannot handle.
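As a rough illustration of the conventional approach the abstract starts from, the sketch below scores a query with a toy class-based bigram model, where a slot symbol in the root model is multiplied by the probability of the phrase within its class, and then counts how many entries a full expansion of the slots would create. Everything here is an assumption made for illustration: the `$SONG` slot name, the probabilities, the function names, and the simplification that each phrase is treated as a single unit. It does not implement the pruned class language models or the DLM, whose details are not given in the abstract.

```python
import math

# Hypothetical root bigram model over words and class slots (log-probabilities).
ROOT_BIGRAM = {
    ("<s>", "play"): math.log(0.5),
    ("play", "$SONG"): math.log(0.8),
    ("$SONG", "</s>"): math.log(0.9),
}

# Hypothetical class membership model: P(phrase | class).
CLASS_MEMBERS = {
    "$SONG": {"yesterday": 0.6, "let it be": 0.4},
}


def score_sentence(tokens):
    """Return the log-probability of a tagged query.

    tokens is a list of (root_symbol, phrase) pairs; phrase is None for
    ordinary words and the surface phrase for class slots.
    """
    logp = 0.0
    prev = "<s>"
    for symbol, phrase in tokens:
        # Probability of the word or slot given the previous root symbol.
        logp += ROOT_BIGRAM[(prev, symbol)]
        if phrase is not None:
            # Probability of the concrete phrase inside its class.
            logp += math.log(CLASS_MEMBERS[symbol][phrase])
        prev = symbol
    # Close the sentence.
    logp += ROOT_BIGRAM[(prev, "</s>")]
    return logp


def count_fully_expanded(root, members):
    """Count entries after substituting every class member into every slot."""
    total = 0
    for ngram in root:
        combos = 1
        for symbol in ngram:
            # A slot multiplies the entry count by the size of its class list.
            combos *= len(members.get(symbol, {None: 1.0}))
        total += combos
    return total


query = [("play", None), ("$SONG", "yesterday")]
query_logp = score_sentence(query)
expanded_size = count_fully_expanded(ROOT_BIGRAM, CLASS_MEMBERS)
```

In this toy setup every root entry that mentions the slot is duplicated once per list item, so the expanded size grows in proportion to the list length. With a list of more than 20 million songs, that growth is the memory explosion the abstract refers to.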


