Acoustic data-driven lexicon learning based on a greedy pronunciation selection framework

06/12/2017
by Xiaohui Zhang, et al.

Speech recognition systems for irregularly spelled languages like English normally require hand-written pronunciations. In this paper, we describe a system for automatically obtaining pronunciations for words that have no available pronunciation but do have transcribed data. Our method integrates information from the letter sequence and from the acoustic evidence. The novel aspect of our work is how to prune entries from such a lexicon, since, empirically, lexicons with too many entries tend not to be good for ASR performance. Experiments on various ASR tasks show that, with the proposed framework, starting from an initial lexicon of several thousand words, we are able to learn a lexicon that performs close to a full expert lexicon in terms of WER on test data, and better than lexicons built using G2P alone or with a pruning criterion based on pronunciation probability.
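To make the pruning idea concrete, here is a minimal Python sketch of one possible greedy selection loop for a single word. It is an illustration under stated assumptions, not the paper's actual criterion: the names `greedy_select`, `data_log_likelihood`, `max_prons`, `min_gain`, and `FLOOR` are hypothetical, and each acoustic example of the word is assumed to come with a precomputed likelihood for every candidate pronunciation (from G2P or from phonetic decoding).

```python
import math

FLOOR = 1e-10  # assumed floor so an unexplained example does not yield log(0)


def data_log_likelihood(examples, prons):
    """Sum over examples of the log-likelihood under the best selected pronunciation.

    examples: list of dicts, one per acoustic example of the word, mapping a
    candidate pronunciation string to an (assumed, precomputed) likelihood.
    """
    total = 0.0
    for ex in examples:
        best = max((ex.get(p, 0.0) for p in prons), default=0.0)
        total += math.log(max(best, FLOOR))
    return total


def greedy_select(examples, max_prons=2, min_gain=0.1):
    """Greedily add the candidate that most improves the data log-likelihood.

    Stops when max_prons is reached or the best gain falls below min_gain
    (both thresholds are illustrative assumptions).
    """
    candidates = sorted({p for ex in examples for p in ex})
    selected = []
    current = data_log_likelihood(examples, selected)
    while len(selected) < max_prons:
        best_pron, best_score = None, current
        for p in candidates:
            if p in selected:
                continue
            score = data_log_likelihood(examples, selected + [p])
            if score > best_score:
                best_pron, best_score = p, score
        if best_pron is None or best_score - current < min_gain:
            break
        selected.append(best_pron)
        current = best_score
    return selected
```

In this sketch a candidate is kept only if it explains acoustic examples that the already selected pronunciations explain poorly, so a near-duplicate variant with a high average score adds no gain and is dropped. Ranking candidates by an average score alone would not account for that redundancy.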

research
10/22/2019

G2G: TTS-Driven Pronunciation Learning for Graphemic Hybrid ASR

Grapheme-based acoustic modeling has recently been shown to outperform p...
research
08/13/2020

LSTM Acoustic Models Learn to Align and Pronounce with Graphemes

Automated speech recognition coverage of the world's languages continues...
research
06/09/2023

A Theory of Unsupervised Speech Recognition

Unsupervised speech recognition (ASR-U) is the problem of learning autom...
research
07/27/2022

Knowledge-driven Subword Grammar Modeling for Automatic Speech Recognition in Tamil and Kannada

In this paper, we present specially designed automatic speech recognitio...
research
09/22/2019

Improving OOV Detection and Resolution with External Language Models in Acoustic-to-Word ASR

Acoustic-to-word (A2W) end-to-end automatic speech recognition (ASR) sys...
research
07/09/2018

Foreign English Accent Adjustment by Learning Phonetic Patterns

State-of-the-art automatic speech recognition (ASR) systems struggle wit...
