Learning to Recognize Code-switched Speech Without Forgetting Monolingual Speech Recognition

06/01/2020
by   Sanket Shah, et al.

Recently, significant progress has been made in Automatic Speech Recognition (ASR) of code-switched speech, leading to accuracy gains on code-switched datasets in many language pairs. Code-switched speech co-occurs with monolingual speech in one or both of the languages being mixed. In this work, we show that fine-tuning ASR models on code-switched speech harms performance on monolingual speech. We point out the need to optimize models for code-switching while also ensuring that monolingual performance is not sacrificed. Monolingual models may be trained on thousands of hours of speech that may not be available for re-training a new model. We propose using the Learning Without Forgetting (LWF) framework for code-switched ASR when we have access only to a monolingual model and not to the data it was trained on. We show that models trained with this framework perform well on both code-switched and monolingual test sets. In cases where we do have access to monolingual training data, we propose regularization strategies for fine-tuning models for code-switching without sacrificing monolingual accuracy. We report improvements in Word Error Rate (WER) on monolingual and code-switched test sets compared to baselines that use pooled data and simple fine-tuning.
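The core idea of Learning Without Forgetting is to train on the new task (code-switched speech) with an added distillation term that keeps the new model's outputs close to those of the frozen original (monolingual) model, so the old behavior is preserved without the old training data. The sketch below is a minimal, simplified illustration of that combined loss for per-frame classification, not the paper's actual ASR training setup; the hyperparameters `lam` (distillation weight) and `T` (softmax temperature) are assumed for illustration.

```python
import numpy as np

def softmax(z, T=1.0):
    """Temperature-scaled softmax along the last axis."""
    z = z / T
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def lwf_loss(new_logits, labels, old_logits, lam=1.0, T=2.0):
    """Combined LWF-style loss:
    cross-entropy on the new (code-switched) labels, plus a
    distillation term pulling the new model's softened outputs
    toward the frozen monolingual model's soft targets."""
    n = new_logits.shape[0]
    probs = softmax(new_logits)
    # standard cross-entropy on the new-task labels
    ce = -np.mean(np.log(probs[np.arange(n), labels] + 1e-12))
    # soft targets from the frozen monolingual model
    old_soft = softmax(old_logits, T)
    new_log_soft = np.log(softmax(new_logits, T) + 1e-12)
    # distillation: cross-entropy against the soft targets,
    # minimized when the new model matches the old one
    kd = -np.mean(np.sum(old_soft * new_log_soft, axis=-1))
    return ce + lam * kd
```

Because the distillation term is minimized exactly when the new model reproduces the old model's output distribution, `lam` trades off plasticity on code-switched data against stability on monolingual speech.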


research
06/09/2020

Learning not to Discriminate: Task Agnostic Learning for Improving Monolingual and Code-switched Speech Recognition

Recognizing code-switched speech is challenging for Automatic Speech Rec...
research
06/15/2022

Exploring Capabilities of Monolingual Audio Transformers using Large Datasets in Automatic Speech Recognition of Czech

In this paper, we present our progress in pretraining Czech monolingual ...
research
08/11/2023

Bilingual Streaming ASR with Grapheme units and Auxiliary Monolingual Loss

We introduce a bilingual solution to support English as secondary locale...
research
05/14/2018

Parser Training with Heterogeneous Treebanks

How to make the most of multiple heterogeneous treebanks when training a...
research
04/08/2019

Constrained Output Embeddings for End-to-End Code-Switching Speech Recognition with Only Monolingual Data

The lack of code-switch training data is one of the major concerns in th...
research
05/06/2022

Hearing voices at the National Library – a speech corpus and acoustic model for the Swedish language

This paper explains our work in developing new acoustic models for autom...
research
06/14/2021

Using heterogeneity in semi-supervised transcription hypotheses to improve code-switched speech recognition

Modeling code-switched speech is an important problem in automatic speec...
