Lenient Evaluation of Japanese Speech Recognition: Modeling Naturally Occurring Spelling Inconsistency

06/07/2023
by Shigeki Karita, et al.

Word error rate (WER) and character error rate (CER) are standard metrics in automatic speech recognition (ASR), but alternative spellings have always been a problem: if a system transcribes “adviser” where the ground truth has “advisor”, this counts as an error even though the two spellings represent the same word. Japanese is notorious for “lacking orthography”: most words can be spelled in multiple ways, which makes accurate ASR evaluation difficult. In this paper we propose a new lenient evaluation metric as a more defensible CER measure for Japanese ASR. We create a lattice of plausible respellings of the reference transcription, using a combination of lexical resources, a Japanese text-processing system, and a neural machine translation model for reconstructing kanji from hiragana or katakana. In a manual evaluation, raters rated 95.4% of the proposed spelling variants as plausible. ASR results show that our method, which does not penalize the system for choosing a valid alternate spelling of a word, affords a 2.4% to 3.1% absolute reduction in CER depending on the task.
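The idea of scoring against the closest valid respelling can be illustrated with a minimal Python sketch. This is not the authors' implementation. It assumes a simplified "sausage" lattice (a sequence of segments, each with a hand-written list of alternative spellings, original spelling first), enumerates every path by brute force, and normalizes by the length of the original reference. The function names `edit_distance`, `strict_cer`, and `lenient_cer` and the example spellings are hypothetical and chosen only for illustration.

```python
from itertools import product
from typing import Sequence


def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance between two strings, counted in characters."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            curr.append(min(
                prev[j] + 1,             # delete a reference character
                curr[j - 1] + 1,         # insert a hypothesis character
                prev[j - 1] + (r != h),  # substitute, or match at no cost
            ))
        prev = curr
    return prev[-1]


def strict_cer(segments: Sequence[Sequence[str]], hyp: str) -> float:
    """Conventional CER: only the original spelling (first alternative) counts."""
    original = "".join(alts[0] for alts in segments)
    return edit_distance(original, hyp) / len(original)


def lenient_cer(segments: Sequence[Sequence[str]], hyp: str) -> float:
    """CER against whichever respelling of the reference is closest to hyp.

    Assumption: errors are normalized by the original reference length.
    Brute-force enumeration is exponential in the number of segments, so
    this is only suitable for short illustrative inputs.
    """
    original = "".join(alts[0] for alts in segments)
    best = min(edit_distance("".join(path), hyp) for path in product(*segments))
    return best / len(original)


# Hand-written toy lattice: each segment lists the original spelling first,
# followed by alternatives assumed to be valid respellings.
segments = [["寿司", "すし", "鮨"], ["を"], ["食べる", "たべる"]]
hypothesis = "すしを食べる"

print("strict CER: ", strict_cer(segments, hypothesis))
print("lenient CER:", lenient_cer(segments, hypothesis))
```

In this toy case the hypothesis differs from the original reference only in how one word is spelled, so the strict score penalizes two characters while the lenient score finds an exact match among the respellings. A practical system would replace the brute-force enumeration with dynamic programming over the lattice.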


