WavFT: Acoustic model finetuning with labelled and unlabelled data

04/01/2022
by Utkarsh Chauhan, et al.

Unsupervised and self-supervised learning methods have leveraged unlabelled data to improve pretrained models. However, these methods require a significantly large amount of unlabelled data, and the computational cost of training models on so much data can be prohibitively high. We address this issue by using unlabelled data during finetuning instead of pretraining. We propose acoustic model finetuning (FT) with both labelled and unlabelled data. The model is jointly trained to learn representations for classifying senones as well as contextual acoustic representations. Our training objective combines a cross-entropy loss, suited to the classification task, with a contrastive loss, suited to learning acoustic representations. The proposed approach outperforms conventional finetuning, with an 11.2% word error rate reduction on Bengali.
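The joint objective described above can be sketched as a weighted sum of a supervised cross-entropy term over senone labels and a contrastive term over acoustic representations. The sketch below is a minimal illustration, assuming an InfoNCE-style contrastive loss with in-batch negatives and a hypothetical mixing weight `alpha`; the paper's exact formulation may differ.

```python
import numpy as np

def cross_entropy(logits, labels):
    # Standard softmax cross-entropy, e.g. over senone classes.
    logits = logits - logits.max(axis=1, keepdims=True)          # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(labels)), labels].mean()

def info_nce(anchor, positive, temperature=0.1):
    # Contrastive loss: each anchor representation should be most similar
    # to its own positive, treating other positives in the batch as negatives.
    a = anchor / np.linalg.norm(anchor, axis=1, keepdims=True)
    p = positive / np.linalg.norm(positive, axis=1, keepdims=True)
    sim = a @ p.T / temperature          # (B, B) cosine-similarity matrix
    targets = np.arange(len(a))          # diagonal entries are the positives
    return cross_entropy(sim, targets)

def joint_ft_loss(logits, labels, anchor, positive, alpha=0.5):
    # Hypothetical combined finetuning objective: alpha weights the
    # supervised CE term (labelled data) against the contrastive term
    # (unlabelled data).
    return alpha * cross_entropy(logits, labels) + (1 - alpha) * info_nce(anchor, positive)
```

In practice the two terms would be computed from the same shared encoder, with the cross-entropy branch driven by labelled frames and the contrastive branch by masked or augmented views of unlabelled audio.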


