Conditional independence for pretext task selection in Self-supervised speech representation learning

04/15/2021
by Salah Zaiem, et al.

By solving pretext tasks, self-supervised learning (SSL) leverages unlabeled data to extract useful latent representations that replace traditional input features in the downstream task. A common pretext task consists of pretraining an SSL model on pseudo-labels derived from the original signal. This technique is particularly relevant for speech data, where various meaningful signal processing features may serve as pseudo-labels. However, the process of selecting pseudo-labels, for speech or other types of data, remains mostly unexplored and currently relies on observing the results on the final downstream task. This methodology is not sustainable at scale because of its substantial computational (and hence carbon) costs. This paper therefore introduces a practical and theoretical framework for selecting relevant pseudo-labels with respect to a given downstream task. More precisely, we propose a functional estimator of pseudo-label utility, grounded in conditional independence theory, that does not require any training. Experiments on speaker recognition and automatic speech recognition validate our estimator: they show a significant correlation between the performance observed on the downstream task and the utility estimates obtained with our approach, which facilitates the search for relevant pseudo-labels for self-supervised speech representation learning.
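
To make the idea of a training-free utility estimate concrete, here is a minimal sketch. The abstract does not specify the estimator, so everything in it is an assumption for illustration: it uses the Hilbert-Schmidt Independence Criterion (HSIC) with Gaussian kernels as the dependence measure, averages it within groups of samples that share a downstream label, and assumes that a lower score (features and pseudo-label closer to conditionally independent given the downstream label) suggests a more useful pseudo-label. The names `rbf_gram`, `hsic`, and `conditional_dependence` are hypothetical and are not taken from the paper.

```python
import numpy as np


def rbf_gram(a, sigma=None):
    """Gaussian (RBF) Gram matrix of the rows of `a`.

    Assumption: when `sigma` is not given, use the median pairwise
    distance (a common heuristic, not something the abstract states).
    """
    a = np.asarray(a, dtype=float)
    if a.ndim == 1:
        a = a[:, None]
    sq_dist = np.sum((a[:, None, :] - a[None, :, :]) ** 2, axis=-1)
    if sigma is None:
        off_diag = sq_dist[~np.eye(len(a), dtype=bool)]
        median = np.median(off_diag)
        sigma = np.sqrt(median) if median > 0 else 1.0
    return np.exp(-sq_dist / (2.0 * sigma ** 2))


def hsic(x, z):
    """Biased empirical HSIC: trace(K H L H) / (n - 1)^2."""
    n = len(x)
    k = rbf_gram(x)
    l = rbf_gram(z)
    h = np.eye(n) - np.ones((n, n)) / n  # centering matrix
    return float(np.trace(k @ h @ l @ h)) / (n - 1) ** 2


def conditional_dependence(x, z, y, min_group=5):
    """Average HSIC(x, z) over groups of samples sharing the same label y.

    x: (n, d) input features, z: (n,) or (n, p) pseudo-label values,
    y: (n,) downstream labels. Groups smaller than `min_group` are skipped.
    Assumption: a lower value suggests a more useful pseudo-label.
    """
    x = np.asarray(x, dtype=float)
    z = np.asarray(z, dtype=float)
    y = np.asarray(y)
    scores = []
    for label in np.unique(y):
        mask = y == label
        if mask.sum() >= min_group:
            scores.append(hsic(x[mask], z[mask]))
    if not scores:
        raise ValueError("no label group has at least `min_group` samples")
    return float(np.mean(scores))
```

Under these assumptions, candidate pseudo-labels could be ranked by calling `conditional_dependence(features, pseudo_label, downstream_labels)` for each candidate and sorting the scores in ascending order, without training any model.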


