Retraining-free Customized ASR for Enharmonic Words Based on a Named-Entity-Aware Model and Phoneme Similarity Estimation

05/29/2023
by Yui Sudo, et al.

End-to-end automatic speech recognition (E2E-ASR) has the potential to improve performance, but it has difficulty handling enharmonic words: named entities (NEs) with the same pronunciation and part of speech that are spelled differently. This often occurs with Japanese personal names that share a pronunciation but are written with different Kanji characters. Since such NE words tend to be important keywords, an ASR system easily loses user trust if it misrecognizes them. To solve this problem, this paper proposes a novel retraining-free customization method for E2E-ASR based on a named-entity-aware E2E-ASR model and phoneme similarity estimation. Experimental results show that the proposed method improves the target NE character error rate by 35.7% on average relative to the conventional E2E-ASR model when selecting personal names as the target NE.
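
The abstract does not say how phoneme similarity is estimated, so the following is only an illustrative sketch, not the authors' method. It assumes that a recognized NE and each entry of a user-supplied name list are available as phoneme sequences, and it scores them with a normalized Levenshtein distance. The function names, the example dictionary, its romanized phonemes, and the 0.8 threshold are all hypothetical.

```python
from typing import Dict, List, Optional, Tuple


def edit_distance(a: List[str], b: List[str]) -> int:
    """Levenshtein distance between two phoneme sequences."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def phoneme_similarity(a: List[str], b: List[str]) -> float:
    """Similarity in [0, 1]; 1.0 means identical phoneme sequences."""
    if not a and not b:
        return 1.0
    return 1.0 - edit_distance(a, b) / max(len(a), len(b))


def best_match(recognized: List[str],
               user_dictionary: Dict[str, List[str]],
               threshold: float = 0.8) -> Optional[Tuple[str, float]]:
    """Return the registered spelling whose phonemes are closest to the
    recognized NE, or None if no entry reaches the (assumed) threshold."""
    best: Optional[Tuple[str, float]] = None
    for spelling, phonemes in user_dictionary.items():
        score = phoneme_similarity(recognized, phonemes)
        if best is None or score > best[1]:
            best = (spelling, score)
    if best is not None and best[1] >= threshold:
        return best
    return None


# Hypothetical user dictionary: preferred spelling -> phoneme sequence.
USER_DICTIONARY = {
    "佐藤": ["s", "a", "t", "o", "u"],
    "斉藤": ["s", "a", "i", "t", "o", "u"],
}

# Phonemes of an NE span detected in the ASR output (hypothetical).
recognized_phonemes = ["s", "a", "t", "o", "u"]
match = best_match(recognized_phonemes, USER_DICTIONARY)
```

Under these assumptions, the spelling produced by the recognizer for the NE span could be replaced with the registered spelling in `match` without retraining the model, which is the kind of customization the abstract describes.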


