Is Word Error Rate a good evaluation metric for Speech Recognition in Indic Languages?

03/30/2022
by Priyanshi Shah, et al.

We propose a new method for calculating error rates in Automatic Speech Recognition (ASR). The metric targets languages that contain half characters and in which the same character can be written in different forms. We implement our methodology for Hindi, one of the major Indic languages, and we believe the approach scales to other similar languages with large character sets. We call our metrics Alternate Word Error Rate (AWER) and Alternate Character Error Rate (ACER). We train our ASR models for Indic languages using wav2vec 2.0, and we additionally use language models to improve model performance. Our results show a significant improvement in the analysis of error rates at the word and character level: the interpretability of the ASR system improves by up to 3% in AWER and 7% in ACER for Hindi. Our experiments suggest that languages with complex pronunciation allow multiple ways of writing a word without changing its meaning. In such cases, AWER and ACER are more useful metrics than WER and CER. Furthermore, we open-source a new 21-hour benchmarking dataset for Hindi together with the scripts for the new metrics.
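The abstract does not spell out how AWER is computed, so the following is only a minimal sketch of the general idea under stated assumptions: standard WER is the word-level edit distance divided by the reference length, and an "alternate" variant treats a hypothesis word as correct when it is a listed alternate spelling of the reference word. The function names (`edit_distance`, `wer`, `alternate_wer`) and the `variants` dictionary are hypothetical and are not taken from the authors' released scripts.

```python
from typing import Callable, Dict, List, Set


def edit_distance(ref: List[str], hyp: List[str],
                  equal: Callable[[str, str], bool]) -> int:
    """Levenshtein distance over tokens, with a pluggable equality test."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        curr = [i]
        for j, h in enumerate(hyp, 1):
            cost = 0 if equal(r, h) else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def wer(reference: str, hypothesis: str) -> float:
    """Standard word error rate: edit distance / number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref, hyp, lambda a, b: a == b) / len(ref)


def alternate_wer(reference: str, hypothesis: str,
                  variants: Dict[str, Set[str]]) -> float:
    """Assumed AWER-style score: a hypothesis word also counts as correct
    when it is a listed alternate spelling of the reference word."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    def equal(r: str, h: str) -> bool:
        return r == h or h in variants.get(r, set())

    return edit_distance(ref, hyp, equal) / len(ref)


# "हिंदी" and "हिन्दी" are two accepted spellings of the same word.
reference = "मुझे हिंदी पसंद है"
hypothesis = "मुझे हिन्दी पसंद है"
variants = {"हिंदी": {"हिन्दी"}}  # illustrative variant table (assumption)

print(wer(reference, hypothesis))                      # 0.25
print(alternate_wer(reference, hypothesis, variants))  # 0.0
```

In this toy example, plain WER penalizes a valid alternate spelling as a substitution, while the variant-aware score does not. A character-level analogue (in the spirit of ACER) could apply the same routine to character sequences instead of word lists.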

research
03/30/2022

Improving Speech Recognition for Indic Languages using Language Model

We study the effect of applying a language model (LM) on the output of A...
research
06/07/2023

Lenient Evaluation of Japanese Speech Recognition: Modeling Naturally Occurring Spelling Inconsistency

Word error rate (WER) and character error rate (CER) are standard metric...
research
09/21/2022

Assessing ASR Model Quality on Disordered Speech using BERTScore

Word Error Rate (WER) is the primary metric used to assess automatic spe...
research
02/26/2023

User-Centric Evaluation of OCR Systems for Kwak'wala

There has been recent interest in improving optical character recognitio...
research
03/09/2023

Unsupervised Language agnostic WER Standardization

Word error rate (WER) is a standard metric for the evaluation of Automat...
research
05/05/2020

Phonetic and Visual Priors for Decipherment of Informal Romanization

Informal romanization is an idiosyncratic process used by humans in info...
research
11/28/2018

On the Inductive Bias of Word-Character-Level Multi-Task Learning for Speech Recognition

End-to-end automatic speech recognition (ASR) commonly transcribes audio...
