WERd: Using Social Text Spelling Variants for Evaluating Dialectal Speech Recognition

09/21/2017
by Ahmed Ali, et al.

We study the problem of evaluating automatic speech recognition (ASR) systems that target dialectal speech input. A major challenge here is that the orthography of dialects is typically not standardized. From an ASR evaluation perspective, this means that there is no clear gold standard for the expected output: different human annotators may consider several different outputs correct, which makes the standard word error rate (WER) inadequate as an evaluation metric. Such a situation is typical for machine translation (MT), and thus we borrow ideas from an MT evaluation metric, namely TERp, an extension of translation error rate, which is closely related to WER. In particular, when comparing a hypothesis to a reference, we make use of spelling variants for words and phrases, which we mine from Twitter in an unsupervised fashion. Our experiments on evaluating ASR output for Egyptian Arabic, together with further manual analysis, show that the resulting WERd (i.e., WER for dialects) metric, a variant of TERp, is more adequate than WER for evaluating dialectal ASR.
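To make the idea of matching through spelling variants concrete, here is a minimal sketch in Python. It is not the WERd or TERp implementation from the paper: it is a plain word-level edit distance in which a substitution costs nothing when the two words appear together in a variant table. The function name `variant_aware_wer`, the `variants` dictionary, and the transliterated toy words are all assumptions made for illustration. Phrase-level variants and the unsupervised mining from Twitter described in the abstract are not covered.

```python
from typing import Dict, List, Set


def variant_aware_wer(
    reference: List[str],
    hypothesis: List[str],
    variants: Dict[str, Set[str]],
) -> float:
    """Word error rate in which listed spelling variants count as matches.

    Hypothetical helper for illustration only; `variants` maps a word to
    the set of spellings treated as equivalent to it.
    """
    if not reference:
        raise ValueError("reference must contain at least one word")

    def same(ref_word: str, hyp_word: str) -> bool:
        # Exact match, or the pair is listed as variants in either direction.
        if ref_word == hyp_word:
            return True
        return (hyp_word in variants.get(ref_word, set())
                or ref_word in variants.get(hyp_word, set()))

    n, m = len(reference), len(hypothesis)
    # dist[i][j] = edits needed to turn reference[:i] into hypothesis[:j]
    dist = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dist[i][0] = i  # deletions only
    for j in range(1, m + 1):
        dist[0][j] = j  # insertions only

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub_cost = 0 if same(reference[i - 1], hypothesis[j - 1]) else 1
            dist[i][j] = min(
                dist[i - 1][j] + 1,             # deletion
                dist[i][j - 1] + 1,             # insertion
                dist[i - 1][j - 1] + sub_cost,  # match or substitution
            )
    return dist[n][m] / n


# Invented toy data in Latin transliteration, not taken from the paper.
reference = "ana mesh 3aref".split()
hypothesis = "ana mish 3arif".split()
toy_variants = {"mesh": {"mish"}, "3aref": {"3arif"}}

plain = variant_aware_wer(reference, hypothesis, {})
lenient = variant_aware_wer(reference, hypothesis, toy_variants)
```

With an empty variant table the function reduces to ordinary WER, so the two spelling differences are charged as substitutions. With the toy table they are treated as matches, which is the behavior the abstract motivates for dialects without a standard orthography.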


