SIGTYP 2021 Shared Task: Robust Spoken Language Identification

06/07/2021
by Elizabeth Salesky, et al.

While language identification is a fundamental speech and language processing task, it remains challenging for many languages and language families. For many low-resource and endangered languages this is in part due to resource availability: where larger datasets exist, they may be single-speaker or cover different domains than the desired application scenarios, creating a need for domain- and speaker-invariant language identification systems. This year's shared task on robust spoken language identification sought to investigate just this scenario: systems were to be trained on largely single-speaker speech from one domain, but evaluated on data in other domains recorded from speakers under different recording circumstances, mimicking realistic low-resource scenarios. We see that domain and speaker mismatch proves very challenging for current methods, which can perform above 95% accuracy in-domain; domain adaptation can address this to some degree, but these conditions merit further investigation to make spoken language identification accessible in many scenarios.


