Efficient Spoken Language Recognition via Multilabel Classification

06/02/2023
by Oriol Nieto, et al.

Spoken language recognition (SLR) is the task of automatically identifying the language present in a speech signal. Existing SLR models are either too computationally expensive or too large to run effectively on devices with limited resources. For real-world deployment, a model should also gracefully handle unseen languages outside the target language set, yet prior work has focused on closed-set classification, where all input languages are known a priori. In this paper, we address these two limitations: we explore efficient model architectures for SLR based on convolutional networks, and we propose a multilabel training strategy to handle non-target languages at inference time. Using the VoxLingua107 dataset, we show that our models obtain competitive results while being orders of magnitude smaller and faster than current state-of-the-art methods, and that our multilabel strategy is more robust to unseen non-target languages than multiclass classification.
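The contrast between multiclass and multilabel decisions can be illustrated with a minimal sketch. The abstract does not spell out the inference rule, so everything here is an assumption made for illustration: the function names, the use of one independent sigmoid score per target language, and the 0.5 rejection threshold are hypothetical and not taken from the paper.

```python
import math


def softmax(logits):
    """Normalize logits into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(x):
    """Map a single logit to an independent score in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def predict_multiclass(logits, languages):
    """Closed-set decision: softmax always selects one target language."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return languages[best]


def predict_multilabel(logits, languages, threshold=0.5):
    """Multilabel-style decision (assumed rule): score each language
    independently and return None when no score reaches the threshold,
    i.e. treat the input as a non-target language."""
    scores = [sigmoid(x) for x in logits]
    best = max(range(len(scores)), key=scores.__getitem__)
    if scores[best] < threshold:
        return None
    return languages[best]


if __name__ == "__main__":
    languages = ["en", "es", "fr"]        # hypothetical target set
    target_logits = [4.0, -1.0, -2.0]     # made-up logits for a target-language clip
    unseen_logits = [-3.0, -2.0, -4.0]    # made-up logits for a non-target clip
    print(predict_multiclass(target_logits, languages))
    print(predict_multilabel(target_logits, languages))
    print(predict_multiclass(unseen_logits, languages))
    print(predict_multilabel(unseen_logits, languages))
```

In this sketch, a softmax output must always name one of the target languages, even when every logit is low, whereas independent sigmoid scores allow all targets to be rejected at once. How the paper actually trains and thresholds its multilabel models is described in the full text, not here.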

