DistilHuBERT: Speech Representation Learning by Layer-wise Distillation of Hidden-unit BERT

10/05/2021
by Heng-Jui Chang, et al.

Self-supervised speech representation learning methods like wav2vec 2.0 and Hidden-unit BERT (HuBERT) leverage unlabeled speech data for pre-training and offer good representations for numerous speech processing tasks. Despite their success, these methods require large memory and high pre-training costs, making them inaccessible to researchers in academia and small companies. This paper therefore introduces DistilHuBERT, a novel multi-task learning framework that distills hidden representations directly from a HuBERT model. The method reduces HuBERT's size by 75% and speeds it up by 73%, while retaining most of its performance across ten different tasks. Moreover, DistilHuBERT requires little training time and data, opening the possibility of pre-training personal and on-device SSL models for speech.
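To make the layer-wise, multi-task distillation concrete, below is a minimal PyTorch sketch of the idea: a small student encoder with separate prediction heads, each trained to match one hidden layer of a frozen HuBERT teacher using a combined L1 and cosine-similarity objective. The names (TinyStudent, distill_loss), the particular teacher layers, and the hyperparameters are illustrative assumptions, not the authors' released code.

```python
# Minimal sketch of DistilHuBERT-style layer-wise distillation (assumed
# details: layer choices, head design, and loss weighting are illustrative).
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyStudent(nn.Module):
    """Small encoder plus one prediction head per distilled teacher layer."""
    def __init__(self, dim=768, n_layers=2, n_targets=3):
        super().__init__()
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=8, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=n_layers)
        # Separate heads predict different teacher layers (e.g. 4, 8, 12);
        # the multi-task pressure forces the shared encoder to stay informative.
        self.heads = nn.ModuleList(nn.Linear(dim, dim) for _ in range(n_targets))

    def forward(self, x):
        h = self.encoder(x)                      # shared student representation
        return [head(h) for head in self.heads]  # one prediction per target layer

def distill_loss(pred, target, lam=1.0):
    """L1 distance plus a cosine-similarity term between the student's
    prediction and the teacher's hidden states (lam is an assumed weight)."""
    l1 = F.l1_loss(pred, target)
    cos = -F.logsigmoid(F.cosine_similarity(pred, target, dim=-1)).mean()
    return l1 + lam * cos

# Toy usage: random tensors stand in for frozen HuBERT hidden states.
student = TinyStudent()
feats = torch.randn(4, 100, 768)                        # (batch, time, dim)
teacher = [torch.randn(4, 100, 768) for _ in range(3)]  # e.g. layers 4/8/12
loss = sum(distill_loss(p, t) for p, t in zip(student(feats), teacher))
loss.backward()
```

In the paper's setup, the prediction heads are discarded after distillation, and only the compact shared encoder is kept as the speech representation model.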


Related Research

11/29/2022
MMSpeech: Multi-modal Multi-task Encoder-Decoder Pre-training for Speech Recognition
In this paper, we propose a novel multi-modal multi-task encoder-decoder...

08/07/2021
W2v-BERT: Combining Contrastive Learning and Masked Language Modeling for Self-Supervised Speech Pre-Training
Motivated by the success of masked language modeling (MLM) in pre-traini...

10/23/2019
Generative Pre-Training for Speech with Autoregressive Predictive Coding
Learning meaningful and general representations from unannotated speech ...

10/20/2021
SLAM: A Unified Encoder for Speech and Language Modeling via Speech-Text Joint Pre-Training
Unsupervised pre-training is now the predominant approach for both text ...

11/14/2022
MT4SSL: Boosting Self-Supervised Speech Representation Learning by Integrating Multiple Targets
In this paper, we provide a new perspective on self-supervised speech mo...

02/18/2023
RobustDistiller: Compressing Universal Speech Representations for Enhanced Environment Robustness
Self-supervised speech pre-training enables deep neural network models t...

06/15/2021
Multivariate Business Process Representation Learning utilizing Gramian Angular Fields and Convolutional Neural Networks
Learning meaningful representations of data is an important aspect of ma...
