Towards End-to-end Unsupervised Speech Recognition

04/05/2022
by Alexander H. Liu, et al.

Unsupervised speech recognition has shown great potential to make Automatic Speech Recognition (ASR) systems accessible to every language. However, existing methods still rely heavily on hand-crafted pre-processing. Similar to the trend of making supervised speech recognition end-to-end, we introduce wav2vec-U 2.0, which does away with all audio-side pre-processing and improves accuracy through a better architecture. In addition, we introduce an auxiliary self-supervised objective that ties model predictions back to the input. Experiments show that wav2vec-U 2.0 improves unsupervised recognition results across different languages while being conceptually simpler.
