Spoken Language Identification using ConvNets

10/09/2019
by Sarthak, et al.

Language Identification (LI) is an important first step in several speech processing systems. With a growing number of voice-based assistants, speech LI has emerged as a widely researched field. To identify languages, we can adopt either an implicit approach, where only the speech for a language is available, or an explicit one, where the speech is accompanied by its transcript. This paper focuses on the implicit approach because transcribed data is not available. We benchmark existing models and propose a new attention-based model for language identification that takes log-Mel spectrogram images as input. We also examine the effectiveness of raw waveforms as input features to neural network models for LI tasks. For training and evaluation, we classified six languages (English, French, German, Spanish, Russian and Italian) with an accuracy of 95.4% and four languages (English, French, German, Spanish) with an accuracy of 96.3%, using speech from the VoxForge dataset. The approach can be scaled further to incorporate more languages.
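
For readers unfamiliar with the input representation named in the abstract, the following is a minimal sketch of how a log-Mel spectrogram can be computed from a raw waveform using only NumPy. It is an illustration, not the authors' pipeline: the 16 kHz sample rate, 512-point FFT, hop of 160 samples, 40 Mel bands, and all function names are assumptions chosen for the example.

```python
import numpy as np

def hz_to_mel(f):
    # Common HTK-style Mel scale formula.
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_mels, n_fft, sample_rate):
    # Triangular filters with centers evenly spaced on the Mel scale.
    n_bins = n_fft // 2 + 1
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0), n_mels + 2)
    hz_points = mel_to_hz(mel_points)
    fft_freqs = np.linspace(0.0, sample_rate / 2.0, n_bins)
    fb = np.zeros((n_mels, n_bins))
    for i in range(n_mels):
        left, center, right = hz_points[i], hz_points[i + 1], hz_points[i + 2]
        rising = (fft_freqs - left) / (center - left)
        falling = (right - fft_freqs) / (right - center)
        fb[i] = np.maximum(0.0, np.minimum(rising, falling))
    return fb

def log_mel_spectrogram(signal, sample_rate=16000, n_fft=512, hop=160, n_mels=40):
    # Assumed parameters; the paper's actual settings are not given in the abstract.
    window = np.hanning(n_fft)
    n_frames = 1 + (len(signal) - n_fft) // hop
    frames = np.stack([signal[i * hop : i * hop + n_fft] * window
                       for i in range(n_frames)])
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2      # (n_frames, n_fft // 2 + 1)
    mel_power = power @ mel_filterbank(n_mels, n_fft, sample_rate).T
    return 10.0 * np.log10(mel_power + 1e-10)             # decibels, floored to avoid log(0)

# Usage: one second of a synthetic 440 Hz tone stands in for a speech clip.
t = np.arange(16000) / 16000.0
tone = np.sin(2.0 * np.pi * 440.0 * t)
features = log_mel_spectrogram(tone)   # rows are time frames, columns are Mel bands
print(features.shape)
```

The resulting two-dimensional array (time frames by Mel bands) can be treated as a single-channel image, which is what makes convolutional models a natural fit for this kind of input.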


