Code-Switching Detection with Data-Augmented Acoustic and Language Models

07/28/2018
by Emre Yilmaz, et al.

In this paper, we investigate the code-switching detection performance of a code-switching (CS) automatic speech recognition (ASR) system with data-augmented acoustic and language models. We focus on the recognition of Frisian-Dutch radio broadcasts in which one of the mixed languages, namely Frisian, is under-resourced. Recently, we have explored how acoustic modeling (AM) can benefit from monolingual speech data belonging to the high-resourced mixed language. For this purpose, we have trained state-of-the-art AMs on a significantly increased amount of CS speech, obtained by applying automatic transcription, and on monolingual Dutch speech. Moreover, we have improved the language model (LM) by creating CS text in various ways, including text generation using recurrent LMs trained on existing CS text. Motivated by the significantly improved CS ASR performance, in this work we examine the CS detection performance of the same ASR system by reporting CS detection accuracies together with a detailed detection error analysis.
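
To make the idea of scoring CS detection concrete, here is a minimal sketch and not the authors' evaluation procedure. The abstract does not say how detection accuracy is computed, so everything below is an assumption made for illustration: the ASR output is assumed to be reduced to one language tag per unit (for example per word or per frame), the tags `fy` (Frisian) and `nl` (Dutch) are hypothetical labels, and the reference and hypothesis sequences are assumed to be already aligned and of equal length.

```python
from typing import List, Tuple


def switch_points(tags: List[str]) -> List[Tuple[int, str, str]]:
    """Return (index, from_lang, to_lang) for each position where the tag changes."""
    return [
        (i, tags[i - 1], tags[i])
        for i in range(1, len(tags))
        if tags[i] != tags[i - 1]
    ]


def tag_accuracy(ref: List[str], hyp: List[str]) -> float:
    """Fraction of positions where the hypothesized language tag matches the reference.

    Assumes ref and hyp are already aligned unit by unit (an illustrative
    simplification; real ASR output would need time alignment first).
    """
    if len(ref) != len(hyp):
        raise ValueError("reference and hypothesis must have the same length")
    if not ref:
        raise ValueError("sequences must not be empty")
    return sum(r == h for r, h in zip(ref, hyp)) / len(ref)


# Hypothetical example: 'fy' = Frisian, 'nl' = Dutch (labels chosen for illustration).
ref = ["fy", "fy", "nl", "nl", "fy"]
hyp = ["fy", "nl", "nl", "nl", "fy"]

ref_switches = switch_points(ref)
hyp_switches = switch_points(hyp)
accuracy = tag_accuracy(ref, hyp)
```

In this toy case the hypothesis detects the Frisian-to-Dutch switch one unit too early, which lowers the tag accuracy and shifts the first switch point. Comparing `ref_switches` with `hyp_switches` by position and direction is one simple way to start a detection error analysis.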


research
07/28/2018

Acoustic and Textual Data Augmentation for Improved ASR of Code-Switching Speech

In this paper, we describe several techniques for improving the acoustic...
research
06/19/2019

Code-Switching Detection Using ASR-Generated Language Posteriors

Code-switching (CS) detection refers to the automatic detection of langu...
research
11/02/2022

Monolingual Recognizers Fusion for Code-switching Speech Recognition

The bi-encoder structure has been intensively investigated in code-switc...
research
06/16/2020

End-to-End Code Switching Language Models for Automatic Speech Recognition

In this paper, we particularly work on the code-switched text, one of th...
research
05/16/2020

Reducing Spelling Inconsistencies in Code-Switching ASR using Contextualized CTC Loss

Code-Switching (CS) remains a challenge for Automatic Speech Recognition...
research
10/04/2017

Syntactic and Semantic Features For Code-Switching Factored Language Models

This paper presents our latest investigations on different features for ...
research
06/10/2021

KARI: KAnari/QCRI's End-to-End systems for the INTERSPEECH 2021 Indian Languages Code-Switching Challenge

In this paper, we present the Kanari/QCRI (KARI) system and the modeling...
