Layer-wise Analysis of a Self-supervised Speech Representation Model

07/10/2021
by Ankita Pasad, et al.

Recently proposed self-supervised learning approaches have been successful for pre-training speech representation models. The utility of these learned representations has been observed empirically, but little is known about the type or extent of information encoded in the pre-trained representations themselves. Developing such insights can help us understand the capabilities and limits of these models and enable the research community to deploy them more efficiently for downstream applications. In this work, we begin to fill this gap by examining one recent and successful pre-trained model (wav2vec 2.0), via its intermediate representation vectors, using a suite of analysis tools. We use the metrics of canonical correlation, mutual information, and performance on simple downstream tasks with non-parametric probes, in order to (i) query for acoustic and linguistic information content, (ii) characterize the evolution of information across model layers, and (iii) understand how fine-tuning the model for automatic speech recognition (ASR) affects these observations. Our findings motivate modifying the fine-tuning protocol for ASR, which produces improved word error rates in a low-resource setting.
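The canonical-correlation metric mentioned above compares a layer's frame-level activations against a reference representation (for example, mel filterbank features or word embeddings) and summarizes the result as a single similarity score. The sketch below is a minimal plain linear CCA similarity in NumPy, with `cca_similarity` as a hypothetical helper name; the paper's actual tooling uses CCA variants (e.g., projection-weighted forms), so this is an illustration of the general technique, not the authors' implementation.

```python
import numpy as np

def cca_similarity(X, Y, eps=1e-8):
    """Mean canonical correlation between two views of the same n samples.

    X: (n, dx) array, e.g., activations from one wav2vec 2.0 layer.
    Y: (n, dy) array, e.g., reference features for the same frames.
    Returns a score in [0, 1]; higher means the views are more linearly related.
    """
    # Center each view.
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    n = X.shape[0]

    def inv_sqrt(C):
        # Inverse square root of a symmetric PSD matrix, eigenvalue-clipped
        # for numerical stability.
        w, V = np.linalg.eigh(C)
        w = np.maximum(w, eps)
        return V @ np.diag(w ** -0.5) @ V.T

    # Covariances of and between the two views.
    Cxx = X.T @ X / (n - 1)
    Cyy = Y.T @ Y / (n - 1)
    Cxy = X.T @ Y / (n - 1)

    # Singular values of the whitened cross-covariance are the
    # canonical correlations.
    T = inv_sqrt(Cxx) @ Cxy @ inv_sqrt(Cyy)
    rho = np.linalg.svd(T, compute_uv=False)
    return float(np.mean(np.clip(rho, 0.0, 1.0)))
```

Computing this score for every layer against a fixed reference traces how acoustic or linguistic content rises and falls across the network, which is the kind of layer-wise curve the analysis relies on.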


Related research

05/20/2023
Self-supervised representations in speech-based depression detection
This paper proposes handling training data sparsity in speech-based auto...

04/04/2022
A Study of Gender Impact in Self-supervised Models for Speech-to-Text Systems
Self-supervised models for speech processing emerged recently as popular...

11/08/2022
Comparative layer-wise analysis of self-supervised speech models
Many self-supervised speech models, varying in their pre-training object...

03/12/2023
Fine-tuning Strategies for Faster Inference using Speech Self-Supervised Models: A Comparative Study
Self-supervised learning (SSL) has allowed substantial progress in Autom...

03/13/2023
Analysing the Masked predictive coding training criterion for pre-training a Speech Representation Model
Recent developments in pre-trained speech representation utilizing self-...

09/18/2023
Are Soft Prompts Good Zero-shot Learners for Speech Recognition?
Large self-supervised pre-trained speech models require computationally ...

12/16/2022
Context-aware Fine-tuning of Self-supervised Speech Models
Self-supervised pre-trained transformers have improved the state of the ...
