Efficient MDI Adaptation for n-gram Language Models

08/05/2020
by Ruizhe Huang et al.

This paper presents an efficient algorithm for n-gram language model adaptation under the minimum discrimination information (MDI) principle, in which an out-of-domain language model is adapted to satisfy marginal probability constraints estimated from in-domain data. The main challenge for MDI language model adaptation is its computational complexity. By taking advantage of the backoff structure of the n-gram model and the idea of the hierarchical training method originally proposed for maximum entropy (ME) language models, we show that each iteration of MDI adaptation can be computed in time linear in the size of the inputs. This complexity matches that of ME models, even though MDI is more general than ME, and makes MDI adaptation practical for large corpora and vocabularies. Experimental results confirm the scalability of our algorithm on very large datasets; compared to simple linear interpolation, MDI adaptation yields slightly worse perplexity but better word error rate.
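For context, a minimal sketch of the general MDI formulation (standard in the literature, not a verbatim restatement of this paper's notation): the adapted model minimizes the KL divergence to the out-of-domain model $p_{\text{out}}$ subject to marginal constraints whose target values $\hat{d}_i$ are estimated from in-domain data, and the solution reweights the out-of-domain model exponentially:

\begin{aligned}
p_{\text{adapt}} &= \operatorname*{arg\,min}_{p}\; D_{\mathrm{KL}}\!\left(p \,\middle\|\, p_{\text{out}}\right)
\quad \text{subject to} \quad \mathbb{E}_{p}[f_i] = \hat{d}_i,\; i = 1,\dots,K, \\
p_{\text{adapt}}(x) &= \frac{1}{Z(\lambda)}\, p_{\text{out}}(x)\,
\exp\!\Big(\sum_{i=1}^{K} \lambda_i f_i(x)\Big),
\end{aligned}

where the $f_i$ are feature (indicator) functions of the constrained marginals and the $\lambda_i$ are multipliers fitted iteratively. Taking $p_{\text{out}}$ to be uniform recovers a maximum entropy model, which is why MDI is described as more general than ME.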

Related research

12/05/2022 - Fast and accurate factorized neural transducer for text adaption of end-to-end speech recognition models
Neural transducer is now the most popular end-to-end model for speech re...

05/26/2023 - External Language Model Integration for Factorized Neural Transducers
We propose an adaptation method for factorized neural transducers (FNT) ...

09/21/2021 - The Trade-offs of Domain Adaptation for Neural Language Models
In this paper, we connect language model adaptation with concepts of mac...

10/26/2022 - Residual Learning of Neural Text Generation with n-gram Language Model
N-gram language models (LM) have been largely superseded by neural LMs a...

12/11/2018 - Scalable language model adaptation for spoken dialogue systems
Language models (LM) for interactive speech recognition systems are trai...

08/25/2022 - Training a T5 Using Lab-sized Resources
Training large neural language models on large datasets is resource- and...

09/06/2021 - You should evaluate your language model on marginal likelihood over tokenisations
Neural language models typically tokenise input text into sub-word units...
