Multi-Graph Decoding for Code-Switching ASR

06/18/2019
by Emre Yılmaz, et al.

In the FAME! project, a code-switching (CS) automatic speech recognition (ASR) system for Frisian-Dutch speech is being developed to accurately transcribe the local broadcaster's bilingual archives, which contain CS speech. These archives contain recordings with monolingual Frisian and Dutch speech segments as well as Frisian-Dutch CS speech, so recognition performance on monolingual segments is also vital for accurate transcription. In this work, we propose a multi-graph decoding and rescoring strategy for CS ASR that uses bilingual and monolingual graphs together with a unified acoustic model. The proposed decoding scheme gives the freedom to design and employ a separate search space for each (monolingual or bilingual) recognition task, and it enables effective use of the monolingual resources of the high-resourced mixed language in low-resourced CS scenarios. In our scenario, Dutch is the high-resourced language and Frisian is the low-resourced language. We therefore use additional monolingual Dutch text resources to improve the Dutch language model (LM), and we compare the performance of single- and multi-graph CS ASR systems on Dutch segments using larger Dutch LMs. The ASR results show that the proposed approach outperforms the baseline single-graph CS ASR systems: it performs better on the monolingual Dutch segments without any loss of accuracy on monolingual Frisian and code-mixed segments.
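To make the idea of decoding with several graphs and then rescoring concrete, the following is a minimal toy sketch, not the authors' implementation. It assumes that each graph (bilingual or monolingual) yields one hypothesis scored by a shared acoustic model plus that graph's LM, that a graph-specific rescorer may replace the LM cost, and that the final output is the hypothesis with the lowest total cost. The selection rule, the class and function names, and all numbers are illustrative assumptions that the abstract does not confirm.

```python
from dataclasses import dataclass, replace
from typing import Callable, Dict, Iterable, Tuple

Words = Tuple[str, ...]
# Hypothetical rescorer type: maps a word sequence to a new LM cost.
Rescorer = Callable[[Words], float]


@dataclass(frozen=True)
class Hypothesis:
    graph: str            # name of the decoding graph that produced it (assumed label)
    words: Words          # recognized word sequence
    acoustic_cost: float  # cost from the shared (unified) acoustic model
    lm_cost: float        # cost from the LM compiled into that graph

    @property
    def total_cost(self) -> float:
        # Lower is better; a plain sum is an assumption made for illustration.
        return self.acoustic_cost + self.lm_cost


def rescore(hyp: Hypothesis, rescorers: Dict[str, Rescorer]) -> Hypothesis:
    """Replace the LM cost if a rescorer is registered for the hypothesis' graph."""
    rescorer = rescorers.get(hyp.graph)
    if rescorer is None:
        return hyp
    return replace(hyp, lm_cost=rescorer(hyp.words))


def select_best(hyps: Iterable[Hypothesis],
                rescorers: Dict[str, Rescorer]) -> Hypothesis:
    """Rescore every per-graph hypothesis and return the lowest-cost one."""
    rescored = [rescore(h, rescorers) for h in hyps]
    if not rescored:
        raise ValueError("at least one hypothesis is required")
    return min(rescored, key=lambda h: h.total_cost)


# Toy candidates for one segment; the tokens and costs are made up.
candidates = [
    Hypothesis("bilingual", ("w1", "w2"), acoustic_cost=10.0, lm_cost=5.0),
    Hypothesis("dutch", ("w1", "w3"), acoustic_cost=10.5, lm_cost=6.0),
]

# A stand-in for a larger Dutch LM that assigns the Dutch hypothesis a lower cost.
larger_dutch_lm: Dict[str, Rescorer] = {"dutch": lambda words: 3.0}

best = select_best(candidates, larger_dutch_lm)
print(best.graph, best.total_cost)
```

In this toy setup the bilingual hypothesis wins when no rescorer is supplied, while the Dutch hypothesis wins once the stand-in larger Dutch LM lowers its LM cost. That mirrors, in a very simplified way, how a stronger monolingual LM could help on Dutch segments without changing the bilingual search space.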


