Label Aware Speech Representation Learning For Language Identification

06/07/2023
by Shikhar Vashishth, et al.

Speech representation learning approaches for non-semantic tasks such as language recognition have explored either supervised embedding extraction using a classifier model or self-supervised representation learning from raw data. In this paper, we propose a novel framework that combines self-supervised representation learning with language label information during pre-training. This framework, termed Label Aware Speech Representation (LASR) learning, uses a triplet-based objective function to incorporate language labels alongside the self-supervised loss function. The speech representations are then fine-tuned for the downstream task. Language recognition experiments are performed on two public datasets, FLEURS and Dhwani. In these experiments, we show that the proposed LASR framework improves over state-of-the-art systems on language identification. We also analyze the robustness of the LASR approach to noisy and missing labels, as well as its application to multilingual speech recognition tasks.
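The abstract does not give the form of the triplet-based objective, so the following is only a minimal sketch of how a label-driven triplet term could be added to a self-supervised loss. It assumes Euclidean distance, a hinge margin, a weighted sum of the two terms, and that utterances without a label are simply skipped when forming triplets. All function names and parameters (`triplet_loss`, `mine_triplets`, `label_aware_loss`, `weight`, `margin`) are hypothetical and are not taken from the paper.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge triplet loss: zero once the negative is at least `margin`
    farther from the anchor than the positive is."""
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)


def mine_triplets(labels):
    """Yield (anchor, positive, negative) index triples from language labels.

    Assumption: an utterance whose label is None (missing) never takes part
    in a triplet, so it contributes only to the self-supervised term.
    """
    n = len(labels)
    for a in range(n):
        if labels[a] is None:
            continue
        for p in range(n):
            if p == a or labels[p] != labels[a]:
                continue
            for q in range(n):
                if labels[q] is not None and labels[q] != labels[a]:
                    yield a, p, q


def label_aware_loss(embeddings, labels, ssl_loss, weight=0.5, margin=1.0):
    """Combine a precomputed self-supervised loss with the mean triplet loss.

    `weight` is a hypothetical mixing coefficient, not a value from the paper.
    """
    triplets = list(mine_triplets(labels))
    if not triplets:
        return ssl_loss  # no usable labels in this batch
    total = sum(
        triplet_loss(embeddings[a], embeddings[p], embeddings[q], margin)
        for a, p, q in triplets
    )
    return ssl_loss + weight * total / len(triplets)


# Toy batch: two utterances labeled "en", one labeled "hi", one unlabeled.
batch = [[0.0, 0.0], [0.0, 1.0], [0.0, 2.0], [5.0, 5.0]]
langs = ["en", "en", "hi", None]
loss = label_aware_loss(batch, langs, ssl_loss=2.0, weight=0.5, margin=2.0)
```

In this toy batch only the labeled utterances form triplets, which is one simple way a batch with missing labels could still be trained on; in a real system the embeddings would come from the speech encoder and the self-supervised loss from its pre-training objective.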


