Injecting Text and Cross-lingual Supervision in Few-shot Learning from Self-Supervised Models

10/10/2021
by Matthew Wiesner, et al.

Self-supervised model pre-training has recently garnered significant interest, but relatively few efforts have explored using additional resources when fine-tuning these models. We demonstrate how universal phoneset acoustic models can leverage cross-lingual supervision to improve transfer of pretrained self-supervised representations to new languages. We also show how target-language text can be used to enable and improve fine-tuning with the lattice-free maximum mutual information (LF-MMI) objective. On three low-resource languages, these techniques greatly improved few-shot learning performance.
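For readers unfamiliar with the objective named above, a standard way to write the LF-MMI criterion is sketched below. This is the generic lattice-free MMI formulation from the literature, not a statement of how this particular paper constructs its graphs; the symbols are the usual ones and are assumptions here, not taken from the abstract.

\[
\mathcal{F}_{\text{LF-MMI}} \;=\; \sum_{u} \Big( \log p(\mathbf{X}_u \mid \mathbb{G}^{u}_{\text{num}}) \;-\; \log p(\mathbf{X}_u \mid \mathbb{G}_{\text{den}}) \Big),
\]

where \(\mathbf{X}_u\) is the acoustic feature sequence for utterance \(u\), the numerator graph \(\mathbb{G}^{u}_{\text{num}}\) encodes the supervision for that utterance, and the denominator graph \(\mathbb{G}_{\text{den}}\) is typically built from a phone-level language model. In a setting like the one described in the abstract, target-language text would plausibly be what supplies the material for building such graphs over the universal phoneset, but the exact construction is given in the full paper.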

Related research

- MultiFiT: Efficient Multi-lingual Language Model Fine-tuning (09/10/2019)
  Pretrained language models are promising particularly for low-resource l...
- Learning Cross-lingual Visual Speech Representations (03/14/2023)
  Cross-lingual self-supervised learning has been a growing research topic...
- Lattice-Free MMI Adaptation Of Self-Supervised Pretrained Acoustic Models (12/28/2020)
  In this work, we propose lattice-free MMI (LFMMI) for supervised adaptat...
- Few-Shot Cross-Lingual TTS Using Transferable Phoneme Embedding (06/27/2022)
  This paper studies a transferable phoneme embedding framework that aims ...
- Contrastive Distillation Is a Sample-Efficient Self-Supervised Loss Policy for Transfer Learning (12/21/2022)
  Traditional approaches to RL have focused on learning decision policies ...
- GPT Self-Supervision for a Better Data Annotator (06/07/2023)
  The task of annotating data into concise summaries poses a significant c...
- Comparing CTC and LFMMI for out-of-domain adaptation of wav2vec 2.0 acoustic model (04/06/2021)
  In this work, we investigate if the wav2vec 2.0 self-supervised pretrain...
