
Improved Language Identification Through Cross-Lingual Self-Supervised Learning

by Andros Tjandra, et al.

Language identification greatly impacts the success of downstream tasks such as automatic speech recognition. Recently, self-supervised speech representations learned by wav2vec 2.0 have been shown to be very effective for a range of speech tasks. We extend previous self-supervised work on language identification by experimenting with pre-trained models which were learned on real-world unconstrained speech in multiple languages and not just on English. We show that models pre-trained on many languages perform better and enable language identification systems that require very little labeled data to perform well. Results on a 25-language setup show that with only 10 minutes of labeled data per language, a cross-lingually pre-trained model can achieve over 93% accuracy.
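The setup the abstract describes — a frozen self-supervised encoder whose frame-level representations are pooled and fed to a lightweight language-ID classifier — can be sketched as follows. This is an illustrative reconstruction, not the authors' code: the encoder output is stubbed with random features, and the pooling and linear-head shapes (25 languages, 512-dim features) are assumptions chosen to match the 25-language setup.

```python
import numpy as np

rng = np.random.default_rng(0)

def mean_pool(frame_features):
    # frame_features: (T, D) frame-level representations, standing in for
    # the output of a frozen self-supervised encoder such as wav2vec 2.0.
    return frame_features.mean(axis=0)

def softmax(z):
    z = z - z.max()  # numerical stability
    e = np.exp(z)
    return e / e.sum()

def classify(frame_features, W, b):
    # Lightweight language-ID head: pooled embedding -> linear layer -> softmax.
    pooled = mean_pool(frame_features)
    return softmax(W @ pooled + b)

# Toy dimensions (assumptions): 25 languages, 512-dim features, 100 frames.
num_langs, dim, T = 25, 512, 100
W = rng.normal(size=(num_langs, dim)) * 0.01  # classifier weights (untrained here)
b = np.zeros(num_langs)

probs = classify(rng.normal(size=(T, dim)), W, b)  # per-language probabilities
```

Only `W` and `b` would be trained on the small labeled set (e.g. the 10 minutes per language mentioned above), which is what makes the approach viable with so little supervision.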

