Don't speak too fast: The impact of data bias on self-supervised speech models

10/15/2021
by   Yen Meng, et al.
0

Self-supervised Speech Models (S3Ms) have been proven successful in many speech downstream tasks, like ASR. However, how pre-training data affects S3Ms' downstream behavior remains an unexplored issue. In this paper, we study how pre-training data affects S3Ms by pre-training models on biased datasets targeting different factors of speech, including gender, content, and prosody, and evaluate these pre-trained S3Ms on selected downstream tasks in SUPERB Benchmark. Our experiments show that S3Ms have tolerance toward gender bias. Moreover, we find that the content of speech has little impact on the performance of S3Ms across downstream tasks, but S3Ms do show a preference toward a slower speech rate.

READ FULL TEXT
research
04/04/2022

A Study of Gender Impact in Self-supervised Models for Speech-to-Text Systems

Self-supervised models for speech processing emerged recently as popular...
research
11/08/2022

Comparative layer-wise analysis of self-supervised speech models

Many self-supervised speech models, varying in their pre-training object...
research
09/30/2022

Match to Win: Analysing Sequences Lengths for Efficient Self-supervised Learning in Speech and Audio

Self-supervised learning (SSL) has proven vital in speech and audio-rela...
research
09/02/2023

Studying the impacts of pre-training using ChatGPT-generated text on downstream tasks

In recent times, significant advancements have been witnessed in the fie...
research
03/13/2023

Analysing the Masked predictive coding training criterion for pre-training a Speech Representation Model

Recent developments in pre-trained speech representation utilizing self-...
research
05/24/2023

Delving Deeper into Data Scaling in Masked Image Modeling

Understanding whether self-supervised learning methods can scale with un...
research
10/29/2020

Self-supervised Pre-training Reduces Label Permutation Instability of Speech Separation

Speech separation has been well-developed while there are still problems...

Please sign up or login with your details

Forgot password? Click here to reset