Space-Efficient Representation of Entity-centric Query Language Models

06/29/2022
by   Christophe Van Gysel, et al.

Virtual assistants make use of automatic speech recognition (ASR) to help users answer entity-centric queries. However, spoken entity recognition is a difficult problem, due to the large number of frequently changing named entities. In addition, resources available for recognition are constrained when ASR is performed on-device. In this work, we investigate the use of probabilistic grammars as language models within the finite-state transducer (FST) framework. We introduce a deterministic approximation to probabilistic grammars that avoids the explicit expansion of non-terminals at model creation time, integrates directly with the FST framework, and is complementary to n-gram models. We obtain a 10% relative word error rate improvement on long-tail entity queries compared to when a similarly-sized n-gram model is used without our method.
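To make the space argument concrete, the following is a minimal Python sketch of a probabilistic grammar with one non-terminal, scored without expanding that non-terminal into every template. It is not the paper's deterministic FST approximation; the templates, entities, probabilities, and function names (`TEMPLATES`, `ENTITIES`, `query_log_prob`, `expanded_size`, `factored_size`) are all hypothetical and chosen only for illustration.

```python
import math

SLOT = "$ENTITY"  # hypothetical name for the single non-terminal

# Hypothetical query templates with prior probabilities (sum to 1).
TEMPLATES = {
    ("play", SLOT): 0.6,
    ("who", "is", SLOT): 0.4,
}

# Hypothetical distribution over expansions of the non-terminal (sums to 1).
ENTITIES = {
    ("taylor", "swift"): 0.5,
    ("miles", "davis"): 0.3,
    ("the", "who"): 0.2,
}


def query_log_prob(tokens):
    """Log-probability of a query, summed over all matching templates.

    The non-terminal is resolved by a lookup at scoring time, so the
    template/entity cross product is never materialized.
    """
    tokens = tuple(tokens)
    total = 0.0
    for template, p_template in TEMPLATES.items():
        slot_at = template.index(SLOT)
        prefix, suffix = template[:slot_at], template[slot_at + 1:]
        # The query must leave at least one token for the entity.
        if len(tokens) < len(prefix) + len(suffix) + 1:
            continue
        if tokens[:len(prefix)] != prefix:
            continue
        if suffix and tokens[len(tokens) - len(suffix):] != suffix:
            continue
        entity = tokens[len(prefix):len(tokens) - len(suffix)]
        p_entity = ENTITIES.get(entity)
        if p_entity is not None:
            total += p_template * p_entity
    return math.log(total) if total > 0 else float("-inf")


def expanded_size():
    """Entries needed if every template were expanded with every entity."""
    return len(TEMPLATES) * len(ENTITIES)


def factored_size():
    """Entries needed when templates and entities are stored separately."""
    return len(TEMPLATES) + len(ENTITIES)


if __name__ == "__main__":
    print(query_log_prob("play miles davis".split()))
    print(expanded_size(), factored_size())
```

With realistic inventories (many templates, a very large and frequently updated entity list), the expanded representation grows multiplicatively while the factored one grows additively, which is the kind of saving that motivates avoiding explicit expansion at model creation time.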


