Indonesian Automatic Speech Recognition with XLSR-53

08/20/2023
by Panji Arisaputra, et al.

This study focuses on the development of Indonesian Automatic Speech Recognition (ASR) using the XLSR-53 pre-trained model, where XLSR stands for cross-lingual speech representations. The XLSR-53 pre-trained model is used to significantly reduce the amount of non-English training data required to achieve a competitive Word Error Rate (WER). The total amount of data used in this study is 24 hours, 18 minutes, and 1 second: (1) TITML-IDN, 14 hours and 31 minutes; (2) Magic Data, 3 hours and 33 minutes; and (3) Common Voice, 6 hours, 14 minutes, and 1 second. With a WER of 20%, the model in this study can compete with similar models on the Common Voice test split. The WER can be decreased by around 8% from 20%, improving on previous research and contributing to the creation of a better Indonesian ASR with a smaller amount of data.
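
For readers unfamiliar with the metric, the following is a minimal sketch of how WER is conventionally computed: the word-level edit distance between a reference transcript and the recognizer's output, divided by the number of reference words. It is not the authors' evaluation code. The function name `word_error_rate` and the two Indonesian sentences are made up for illustration, and the sketch assumes plain whitespace tokenization with no text normalization. The last lines simply re-add the three corpus durations quoted in the abstract.

```python
from datetime import timedelta


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return (substitutions + deletions + insertions) / reference word count."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] is the edit distance between the reference words processed
    # so far (all but the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical sentence pair, not taken from any of the corpora in the study.
reference = "saya pergi ke pasar pagi ini"
hypothesis = "saya pergi pasar pagi itu"
wer = word_error_rate(reference, hypothesis)
print(f"WER: {wer:.2%}")

# Durations as stated in the abstract; the sum should match the reported total.
durations = {
    "TITML-IDN": timedelta(hours=14, minutes=31),
    "Magic Data": timedelta(hours=3, minutes=33),
    "Common Voice": timedelta(hours=6, minutes=14, seconds=1),
}
total = sum(durations.values(), timedelta())
print(f"Total training audio: {total}")
```

In the example pair, one reference word is dropped and one is substituted, so two errors over six reference words are counted. Published WER figures usually also depend on the text normalization applied before scoring (casing, punctuation, number formatting), which this sketch leaves out.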


