Analysing the Masked predictive coding training criterion for pre-training a Speech Representation Model

03/13/2023
by   Hemant Yadav, et al.
0

Recent developments in pre-trained speech representation utilizing self-supervised learning (SSL) have yielded exceptional results on a variety of downstream tasks. One such technique, known as masked predictive coding (MPC), has been employed by some of the most high-performing models. In this study, we investigate the impact of MPC loss on the type of information learnt at various layers in the HuBERT model, using nine probing tasks. Our findings indicate that the amount of content information learned at various layers of the HuBERT model has a positive correlation to the MPC loss. Additionally, it is also observed that any speaker-related information learned at intermediate layers of the model, is an indirect consequence of the learning process, and therefore cannot be controlled using the MPC loss. These findings may serve as inspiration for further research in the speech community, specifically in the development of new pre-training tasks or the exploration of new pre-training criterion's that directly preserves both speaker and content information at various layers of a learnt model.

READ FULL TEXT
research
05/20/2020

A Further Study of Unsupervised Pre-training for Transformer Based Speech Recognition

Building a good speech recognition system usually requires large amounts...
research
10/15/2021

Don't speak too fast: The impact of data bias on self-supervised speech models

Self-supervised Speech Models (S3Ms) have been proven successful in many...
research
11/08/2022

Comparative layer-wise analysis of self-supervised speech models

Many self-supervised speech models, varying in their pre-training object...
research
07/10/2021

Layer-wise Analysis of a Self-supervised Speech Representation Model

Recently proposed self-supervised learning approaches have been successf...
research
06/03/2021

MPC-BERT: A Pre-Trained Language Model for Multi-Party Conversation Understanding

Recently, various neural models for multi-party conversation (MPC) have ...
research
10/08/2021

A study on the efficacy of model pre-training in developing neural text-to-speech system

In the development of neural text-to-speech systems, model pre-training ...
research
10/28/2022

The Importance of Speech Stimuli for Pathologic Speech Classification

Current findings show that pre-trained wav2vec 2.0 models can be success...

Please sign up or login with your details

Forgot password? Click here to reset