What shall we do with an hour of data? Speech recognition for the un- and under-served languages of Common Voice

05/10/2021
by Francis M. Tyers, et al.

This technical report describes the methods and results of a three-week sprint to produce deployable speech recognition models for 31 under-served languages of the Common Voice project. We outline the preprocessing steps, hyperparameter selection, and resulting accuracy on the official test sets. In addition, we evaluate the models on multiple tasks: closed-vocabulary speech recognition, pre-transcription, forced alignment, and keyword spotting. The experiments use Coqui STT, a toolkit for training and deploying neural speech-to-text models.


