Code-Switching Detection Using ASR-Generated Language Posteriors

06/19/2019
by Qinyi Wang, et al.

Code-switching (CS) detection refers to the automatic detection of language switches in code-mixed utterances. This task can be accomplished with a CS automatic speech recognition (ASR) system that can handle such language switches. In our previous work, we investigated the CS detection performance of a Frisian-Dutch CS ASR system using the time alignment of the most likely hypothesis, and found that this technique suffers from over-switching caused by numerous very short, spurious language switches. In this paper, we propose a novel CS detection method that aims to remedy this shortcoming by using language posteriors, which are the sums of the frame-level posteriors of the phones belonging to the same language. The language posteriors generated by the CS ASR system contain more complete language-specific information at the frame level than the time alignment of the ASR output. Hence, they are expected to yield more accurate and robust CS detection. The CS detection experiments demonstrate that the proposed language posterior-based approach provides higher detection accuracy than the baseline system in terms of equal error rate. Moreover, a detailed CS detection error analysis reveals that using language posteriors reduces false alarms and results in more robust CS detection.
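
The definition of a language posterior given above (the sum, per frame, of the posteriors of all phones belonging to one language) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `language_posteriors`, the toy four-phone inventory, and the phone-to-language mapping are assumptions made for illustration only, and every phone is assumed to belong to exactly one language (a real system would also have to handle silence or shared phones).

```python
import numpy as np

def language_posteriors(phone_post, phone_lang, n_langs):
    """Sum frame-level phone posteriors per language (hypothetical helper).

    phone_post: (T, P) array of phone posteriors, one row per frame.
    phone_lang: length-P integer array mapping each phone to a language index.
    Returns a (T, n_langs) array of language posteriors.
    """
    phone_post = np.asarray(phone_post, dtype=float)
    phone_lang = np.asarray(phone_lang)
    # One-hot membership matrix of shape (P, n_langs): entry [p, l] is 1
    # when phone p belongs to language l.
    membership = np.eye(n_langs)[phone_lang]
    # Matrix product sums the posteriors of all phones of each language.
    return phone_post @ membership

# Toy example (assumed values): phones 0-1 are Frisian (0), phones 2-3 are Dutch (1).
phone_post = np.array([
    [0.6, 0.2, 0.1, 0.1],
    [0.4, 0.2, 0.3, 0.1],
    [0.1, 0.1, 0.5, 0.3],
])
phone_lang = np.array([0, 0, 1, 1])

lang_post = language_posteriors(phone_post, phone_lang, n_langs=2)
# Most likely language per frame; a change between consecutive frames
# marks a candidate language switch.
frame_lang = lang_post.argmax(axis=1)
```

Because the language posteriors are continuous values per frame, they can be smoothed or thresholded before a switch is declared, which is one plausible way such a representation could suppress very short spurious switches; the abstract does not specify how the decision is made.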


