Unsupervised Automatic Speech Recognition: A Review

06/09/2021
by Hanan Aldarmaki, et al.

Automatic Speech Recognition (ASR) systems can be trained to achieve remarkable performance given large amounts of manually transcribed speech, but large labeled data sets can be difficult or expensive to acquire for all languages of interest. In this paper, we review the research literature to identify models and ideas that could lead to fully unsupervised ASR, including unsupervised segmentation of the speech signal, unsupervised mapping from speech segments to text, and semi-supervised models with nominal amounts of labeled examples. The objective of the study is to identify the limitations of what can be learned from speech data alone and to understand the minimum requirements for speech recognition. Identifying these limitations would help optimize the resources and efforts in ASR development for low-resource languages.


