Speech Self-Supervised Representations Benchmarking: a Case for Larger Probing Heads

08/28/2023
by Salah Zaiem, et al.

Self-supervised learning (SSL) leverages large datasets of unlabeled speech to reach impressive performance with reduced amounts of annotated data. The growing number of proposed approaches has fostered the emergence of comprehensive benchmarks that evaluate their performance on a set of downstream tasks exploring various aspects of the speech signal. However, while the number of considered tasks has been growing, most proposals rely on a single downstream architecture that maps the frozen SSL representations to the task labels. This study examines how benchmarking results are affected by changes in the architecture of the probing head. Interestingly, we find that altering the downstream architecture leads to significant fluctuations in the performance ranking of the evaluated models. Going against common practice in speech SSL benchmarking, we evaluate larger-capacity probing heads and show their impact on performance, inference costs, generalization, and multi-level feature exploitation.
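To make the probing setup concrete, below is a minimal PyTorch sketch of the kind of pipeline the abstract describes: a frozen SSL encoder exposes its hidden layers, a learned softmax-weighted sum combines them (the SUPERB-style "multi-level feature exploitation"), and only the probing head is trained. DummySSLEncoder, WeightedSum, and the two probe heads are hypothetical stand-ins, not the paper's actual code; a real setup would use a pretrained model such as HuBERT or wav2vec 2.0 consuming raw audio.

    import torch
    import torch.nn as nn

    class DummySSLEncoder(nn.Module):
        """Stand-in for a frozen SSL model that exposes all hidden layers."""
        def __init__(self, n_layers=12, dim=768):
            super().__init__()
            self.layers = nn.ModuleList(nn.Linear(dim, dim) for _ in range(n_layers))
        def forward(self, x):
            feats = []
            for layer in self.layers:
                x = torch.tanh(layer(x))
                feats.append(x)
            return torch.stack(feats)  # (n_layers, batch, time, dim)

    class WeightedSum(nn.Module):
        """Learned softmax weights over layers: multi-level feature exploitation."""
        def __init__(self, n_layers):
            super().__init__()
            self.w = nn.Parameter(torch.zeros(n_layers))
        def forward(self, layer_feats):
            w = torch.softmax(self.w, dim=0)
            return (w[:, None, None, None] * layer_feats).sum(dim=0)

    # Small probe (common benchmarking practice) vs. a larger-capacity head
    # (the kind of probe this paper argues for). The 10-class output is arbitrary.
    linear_probe = nn.Linear(768, 10)
    bilstm_probe = nn.LSTM(768, 512, num_layers=2, batch_first=True,
                           bidirectional=True)

    encoder = DummySSLEncoder().eval()
    for p in encoder.parameters():
        p.requires_grad = False            # the SSL representations stay frozen

    dummy_input = torch.randn(4, 50, 768)  # (batch, time, dim) placeholder input
    with torch.no_grad():
        layer_feats = encoder(dummy_input)
    pooled = WeightedSum(n_layers=12)(layer_feats)   # (batch, time, dim)
    logits_small = linear_probe(pooled.mean(dim=1))  # utterance-level linear probe
    lstm_out, _ = bilstm_probe(pooled)               # frame-level, larger head
    print(logits_small.shape, lstm_out.shape)

Only the weighted-sum parameters and the probing head receive gradients here; swapping linear_probe for bilstm_probe is exactly the kind of downstream-architecture change whose effect on model rankings the paper measures.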


Related research

- Speech Self-Supervised Representation Benchmarking: Are We Doing it Right? (06/01/2023)
- Understanding Self-Supervised Learning of Speech Representation via Invariance and Redundancy Reduction (09/07/2023)
- Characterizing the adversarial vulnerability of speech self-supervised learning (11/08/2021)
- Pretext Tasks selection for multitask self-supervised speech representation learning (07/01/2021)
- Model Extraction Attack against Self-supervised Speech Models (11/29/2022)
- Conditional independence for pretext task selection in Self-supervised speech representation learning (04/15/2021)
- MelHuBERT: A simplified HuBERT on Mel spectrogram (11/17/2022)
