Self-supervised Fine-tuning for Improved Content Representations by Speaker-invariant Clustering

05/18/2023
by Heng-Jui Chang, et al.

Self-supervised speech representation models have succeeded in various tasks, but improving them for content-related problems using unlabeled data is challenging. We propose speaker-invariant clustering (Spin), a novel self-supervised learning method that clusters speech representations and performs swapped prediction between the original and speaker-perturbed utterances. Spin disentangles speaker information and preserves content representations with just 45 minutes of fine-tuning on a single GPU. Spin improves pre-trained networks and outperforms prior methods in speech recognition and acoustic unit discovery.
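
The swapped prediction idea mentioned above can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how cluster targets are computed, so the sketch assumes a simple stand-in (a lower-temperature softmax over cosine similarities to a codebook). The function names, the temperatures, and the toy two-dimensional data are all hypothetical. Each view's cluster distribution is trained to predict the target derived from the other view, which encourages frames from the original and the speaker-perturbed utterance to fall into the same cluster.

```python
import math

def softmax(logits):
    # Numerically stable softmax over a list of floats.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cosine(u, v):
    # Cosine similarity between two non-zero vectors.
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

def cluster_probs(frame, codebook, temperature):
    # Distribution over clusters for one frame-level representation.
    return softmax([cosine(frame, c) / temperature for c in codebook])

def cross_entropy(target, pred):
    return -sum(t * math.log(p) for t, p in zip(target, pred))

def swapped_prediction_loss(view_a, view_b, codebook,
                            pred_temp=0.1, target_temp=0.05):
    # view_a: frames of the original utterance (hypothetical input).
    # view_b: time-aligned frames of the speaker-perturbed utterance.
    # ASSUMPTION: targets are a sharpened softmax, used here only as a
    # stand-in because the abstract does not specify the target computation.
    total = 0.0
    for fa, fb in zip(view_a, view_b):
        pa = cluster_probs(fa, codebook, pred_temp)
        pb = cluster_probs(fb, codebook, pred_temp)
        qa = cluster_probs(fa, codebook, target_temp)
        qb = cluster_probs(fb, codebook, target_temp)
        # Swap: view A predicts B's target, and view B predicts A's target.
        total += 0.5 * (cross_entropy(qb, pa) + cross_entropy(qa, pb))
    return total / len(view_a)

# Toy data: two clusters, two frames per utterance.
codebook = [[1.0, 0.0], [0.0, 1.0]]
original = [[0.9, 0.1], [0.2, 0.8]]
perturbed_consistent = [[0.8, 0.2], [0.1, 0.9]]    # same clusters per frame
perturbed_inconsistent = [[0.1, 0.9], [0.9, 0.1]]  # clusters flipped

loss_consistent = swapped_prediction_loss(original, perturbed_consistent, codebook)
loss_inconsistent = swapped_prediction_loss(original, perturbed_inconsistent, codebook)
```

In this toy setup the loss is smaller when both views assign each frame to the same cluster than when the assignments disagree, which is the behavior a speaker-invariant clustering objective is meant to reward.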

research
04/27/2022

Why does Self-Supervised Learning for Speech Recognition Benefit Speaker Recognition?

Recently, self-supervised learning (SSL) has demonstrated strong perform...
research
09/07/2023

Understanding Self-Supervised Learning of Speech Representation via Invariance and Redundancy Reduction

The choice of the objective function is crucial in emerging high-quality...
research
03/20/2023

Cocktail HuBERT: Generalized Self-Supervised Pre-training for Mixture and Single-Source Speech

Self-supervised learning leverages unlabeled data effectively, improving...
research
12/16/2021

Self-Supervised Learning for speech recognition with Intermediate layer supervision

Recently, pioneer work finds that speech pre-trained models can solve fu...
research
11/14/2022

Improving Children's Speech Recognition by Fine-tuning Self-supervised Adult Speech Representations

Children's speech recognition is a vital, yet largely overlooked domain ...
research
12/06/2022

Parameter Efficient Transfer Learning for Various Speech Processing Tasks

Fine-tuning of self-supervised models is a powerful transfer learning me...
research
06/01/2023

Automatic Data Augmentation for Domain Adapted Fine-Tuning of Self-Supervised Speech Representations

Self-Supervised Learning (SSL) has allowed leveraging large amounts of u...
