Generation and Pruning of Pronunciation Variants to Improve ASR Accuracy

06/28/2016
by Zhenhao Ge, et al.

Speech recognition, especially name recognition, is widely used in phone services such as company directory dialers, stock quote providers, or location finders. It is usually challenging because of pronunciation variations. This paper proposes an efficient and robust data-driven technique that automatically learns acceptable word pronunciations and updates the pronunciation dictionary to build a better lexicon without affecting recognition of other words similar to the target word. It generalizes well on datasets of various sizes, and reduces the error rate on a database with 13000+ human names by 42%, compared to a baseline with regular dictionaries already covering canonical pronunciations of 97%+ words in names, plus a well-trained spelling-to-pronunciation (STP) engine.

