Accidental Learners: Spoken Language Identification in Multilingual Self-Supervised Models

11/09/2022
by Travis M. Bartley, et al.

In this paper, we extend previous self-supervised approaches for language identification by experimenting with a Conformer-based architecture in a multilingual pre-training paradigm. We find that pre-trained speech models optimally encode language-discriminatory information in lower layers. Further, we demonstrate that the embeddings obtained from these layers are robust enough to classify unseen languages and to handle different acoustic environments without additional training. After fine-tuning a pre-trained Conformer model on the VoxLingua107 dataset, we achieve results similar to current state-of-the-art systems for language identification. Moreover, our model accomplishes this with 5x fewer parameters. We open-source the model through the NVIDIA NeMo toolkit.
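The layer-wise finding above can be illustrated with a small probing sketch. This is not the authors' method or the NeMo API: it assumes frame-level embeddings have already been extracted from each encoder layer by some other means, and it swaps in a nearest-centroid classifier as a stand-in for whatever probe the paper uses. All function names (`mean_pool`, `fit_centroids`, `predict`, `probe_accuracy`, `rank_layers`) are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

Vector = List[float]


def mean_pool(frames: Sequence[Sequence[float]]) -> Vector:
    """Average a (time, dim) sequence of vectors into a single (dim,) vector."""
    n = len(frames)
    return [sum(column) / n for column in zip(*frames)]


def fit_centroids(X: Sequence[Vector], y: Sequence[str]) -> Dict[str, Vector]:
    """Compute one mean embedding (centroid) per language label."""
    by_label: Dict[str, List[Vector]] = {}
    for vec, label in zip(X, y):
        by_label.setdefault(label, []).append(vec)
    return {label: mean_pool(vecs) for label, vecs in by_label.items()}


def predict(x: Vector, centroids: Dict[str, Vector]) -> str:
    """Return the label whose centroid is closest in squared Euclidean distance."""
    def sq_dist(a: Vector, b: Vector) -> float:
        return sum((p - q) ** 2 for p, q in zip(a, b))
    return min(centroids, key=lambda label: sq_dist(x, centroids[label]))


def probe_accuracy(train_X, train_y, test_X, test_y) -> float:
    """Fit the stand-in probe on one layer's embeddings and score it on held-out data."""
    centroids = fit_centroids(train_X, train_y)
    hits = sum(predict(x, centroids) == t for x, t in zip(test_X, test_y))
    return hits / len(test_y)


def rank_layers(train_by_layer, train_y, test_by_layer, test_y) -> Tuple[int, List[float]]:
    """Probe every layer separately; return the best layer index and all scores."""
    scores = [
        probe_accuracy(tr, train_y, te, test_y)
        for tr, te in zip(train_by_layer, test_by_layer)
    ]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores
```

Under these assumptions, each utterance would first be reduced with `mean_pool` over its frames, and `rank_layers` would then be called with one list of utterance vectors per layer. The layer with the highest held-out accuracy is the one whose embeddings separate languages best for this simple probe; a stronger probe could rank layers differently.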

