Svarah: Evaluating English ASR Systems on Indian Accents

05/25/2023
by Tahir Javed, et al.

India is the second-largest English-speaking country in the world, with a speaker base of roughly 130 million. It is therefore imperative that automatic speech recognition (ASR) systems for English be evaluated on Indian accents. Unfortunately, Indian speakers are very poorly represented in existing English ASR benchmarks such as LibriSpeech, Switchboard, and the Speech Accent Archive. In this work, we address this gap by creating Svarah, a benchmark that contains 9.6 hours of transcribed English audio from 117 speakers across 65 geographic locations throughout India, resulting in a diverse range of accents. Svarah comprises both read speech and spontaneous conversational data covering domains such as history, culture, and tourism, ensuring a diverse vocabulary. We evaluate 6 open-source ASR models and 2 commercial ASR systems on Svarah and show that there is clear scope for improvement on Indian accents. Svarah, as well as all our code, will be publicly available.

research
06/27/2022

TALCS: An Open-Source Mandarin-English Code-Switching Corpus and a Speech Recognition Baseline

This paper introduces a new corpus of Mandarin-English code-switching sp...
research
08/01/2022

Performance Disparities Between Accents in Automatic Speech Recognition

Automatic speech recognition (ASR) services are ubiquitous, transforming...
research
05/24/2023

Vistaar: Diverse Benchmarks and Training Sets for Indian Language ASR

Improving ASR systems is necessary to make new LLM-based use-cases acces...
research
03/31/2023

The Edinburgh International Accents of English Corpus: Towards the Democratization of English ASR

English is the most widely spoken language in the world, used daily by m...
research
10/19/2021

AequeVox: Automated Fairness Testing of Speech Recognition Systems

Automatic Speech Recognition (ASR) systems have become ubiquitous. They ...
research
08/02/2019

A Speech Test Set of Practice Business Presentations with Additional Relevant Texts

We present a test corpus of audio recordings and transcriptions of prese...
research
07/20/2023

A Deep Dive into the Disparity of Word Error Rates Across Thousands of NPTEL MOOC Videos

Automatic speech recognition (ASR) systems are designed to transcribe sp...
