Cross-Domain Adaptation of Spoken Language Identification for Related Languages: The Curious Case of Slavic Languages

08/02/2020
by   Badr M. Abdullah, et al.

State-of-the-art spoken language identification (LID) systems, which are based on end-to-end deep neural networks, have shown remarkable success not only in discriminating between distant languages but also between closely related languages or even different spoken varieties of the same language. However, it is still unclear to what extent neural LID models generalize to speech samples with different acoustic conditions due to domain shift. In this paper, we present a set of experiments to investigate the impact of domain mismatch on the performance of neural LID systems for a subset of six Slavic languages across two domains (read speech and radio broadcast) and examine two low-level signal descriptors (spectral and cepstral features) for this task. Our experiments show that (1) out-of-domain speech samples severely hinder the performance of neural LID models, and (2) while both spectral and cepstral features show comparable performance within-domain, spectral features show more robustness under domain mismatch. Moreover, we apply unsupervised domain adaptation to minimize the discrepancy between the two domains in our study. We achieve relative accuracy improvements that range from 9 depending on the diversity of acoustic conditions in the source domain.
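
To illustrate how the two kinds of low-level descriptors named in the abstract relate to each other, here is a minimal NumPy sketch. It is not the authors' feature pipeline: the frame length, hop size, FFT size, number of coefficients, and both function names are assumptions chosen for illustration, and the mel filterbank that standard MFCC extraction applies is omitted for brevity. The point it shows is that cepstral features can be obtained as a truncated discrete cosine transform of log-spectral features.

```python
import numpy as np

def log_power_spectrogram(signal, frame_len=400, hop=160, n_fft=512):
    """Spectral features: log power spectrum of each Hamming-windowed frame.

    frame_len=400 and hop=160 correspond to 25 ms / 10 ms at 16 kHz
    (assumed values, not taken from the paper).
    """
    n_frames = 1 + (len(signal) - frame_len) // hop
    window = np.hamming(frame_len)
    frames = np.stack(
        [signal[i * hop : i * hop + frame_len] * window for i in range(n_frames)]
    )
    power = np.abs(np.fft.rfft(frames, n=n_fft, axis=1)) ** 2
    return np.log(power + 1e-10)  # shape: (n_frames, n_fft // 2 + 1)

def cepstral_coefficients(log_spec, n_ceps=13):
    """Cepstral features: unnormalized DCT-II of each log-spectral frame,
    keeping only the first n_ceps coefficients."""
    n_bins = log_spec.shape[1]
    k = np.arange(n_ceps)[:, None]   # coefficient index
    n = np.arange(n_bins)[None, :]   # frequency-bin index
    basis = np.cos(np.pi * k * (2 * n + 1) / (2 * n_bins))
    return log_spec @ basis.T        # shape: (n_frames, n_ceps)

# One second of synthetic noise standing in for a 16 kHz speech sample.
rng = np.random.default_rng(0)
signal = rng.standard_normal(16000)

spec = log_power_spectrogram(signal)
ceps = cepstral_coefficients(spec)
print(spec.shape, ceps.shape)
```

Because the cepstral representation keeps only the first few DCT coefficients, it is a compressed, decorrelated view of the same log spectrum, which is one reason the two feature types can behave differently when acoustic conditions change.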

