Exploring Effective Fusion Algorithms for Speech Based Self-Supervised Learning Models

12/20/2022
by Changli Tang, et al.

Self-supervised learning (SSL) has achieved great success in various areas, including speech processing. Recent results on the SUPERB benchmark show that speech-based SSL models extract universal representations that outperform traditional hand-crafted features (e.g. FBank, MFCC) across a range of downstream tasks. However, different types of SSL models may exhibit distinct strengths on different downstream tasks. To make better use of these complementary strengths, in this work we explore effective fusion of multiple SSL models. A series of model fusion algorithms is investigated and compared by combining two types of SSL models, HuBERT and data2vec, on two representative tasks from the SUPERB benchmark: speaker identification (SID) and automatic speech recognition (ASR). The experimental results demonstrate that the proposed fusion algorithms yield significant improvements over each individual model.
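
The abstract does not say which fusion algorithms the paper investigates, so the following is only a generic illustration of what fusing two SSL models can mean at the feature level, not the authors' method. It sketches two common baselines, frame-wise concatenation and a softmax-weighted sum, on toy frame-level features. The function names, the `logits` parameter, and the toy values are all hypothetical, and the sketch assumes both models emit the same number of frames (and, for the weighted sum, the same feature dimension).

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def concat_fusion(feats_a, feats_b):
    """Frame-wise concatenation: (T, Da) and (T, Db) -> (T, Da + Db)."""
    if len(feats_a) != len(feats_b):
        raise ValueError("both models must yield the same number of frames")
    return [fa + fb for fa, fb in zip(feats_a, feats_b)]


def weighted_sum_fusion(feats_a, feats_b, logits=(0.0, 0.0)):
    """Frame-wise weighted sum; weights are the softmax of `logits`.

    In a trainable setup the logits would be learned with the downstream
    model; here they are fixed inputs (an assumption of this sketch).
    """
    if len(feats_a) != len(feats_b):
        raise ValueError("both models must yield the same number of frames")
    w_a, w_b = softmax(list(logits))
    fused = []
    for fa, fb in zip(feats_a, feats_b):
        if len(fa) != len(fb):
            raise ValueError("weighted sum needs equal feature dimensions")
        fused.append([w_a * x + w_b * y for x, y in zip(fa, fb)])
    return fused


# Toy stand-ins for features from two SSL models: 2 frames, 2 dims each.
model_a_feats = [[1.0, 2.0], [3.0, 4.0]]
model_b_feats = [[5.0, 6.0], [7.0, 8.0]]

concatenated = concat_fusion(model_a_feats, model_b_feats)
averaged = weighted_sum_fusion(model_a_feats, model_b_feats)
```

With equal logits the weighted sum reduces to a plain average of the two feature streams. Concatenation keeps both streams intact at the cost of a wider input to the downstream model.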
