Semi-supervised acoustic model training for five-lingual code-switched ASR

06/20/2019
by   Astik Biswas, et al.

This paper presents recent progress in the acoustic modelling of under-resourced code-switched (CS) speech in multiple South African languages. We consider two approaches. The first constructs separate bilingual acoustic models corresponding to language pairs (English-isiZulu, English-isiXhosa, English-Setswana and English-Sesotho). The second constructs a single unified five-lingual acoustic model representing all the languages (English, isiZulu, isiXhosa, Setswana and Sesotho). For these two approaches we consider the effectiveness of semi-supervised training to increase the size of the very sparse acoustic training sets. Using approximately 11 hours of untranscribed speech, we show that both approaches benefit from semi-supervised training. The bilingual TDNN-F acoustic models also benefit from the addition of CNN layers (CNN-TDNN-F), while the five-lingual system does not show any significant improvement. Furthermore, because English is common to all language pairs in our data, it dominates when training a unified language model, leading to improved English ASR performance at the expense of the other languages. Nevertheless, the five-lingual model offers flexibility because it can process more than two languages simultaneously, and is therefore an attractive option as an automatic transcription system in a semi-supervised training pipeline.
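The semi-supervised step described above can be pictured as a pseudo-labelling loop: an existing recogniser transcribes the untranscribed speech, and the automatic transcripts are pooled with the manually transcribed data before the acoustic model is retrained. The following is a minimal sketch under stated assumptions, not the authors' implementation. The names `Utterance`, `pseudo_label` and `build_training_set`, the `transcribe` callback, and the optional `min_confidence` filter are all hypothetical and introduced only for illustration; the abstract does not say whether any confidence-based selection is applied.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Tuple


@dataclass
class Utterance:
    audio_id: str
    transcript: str
    manual: bool  # True for human transcripts, False for automatic ones


# Hypothetical interface: maps an audio id to (hypothesis text, confidence in [0, 1]).
Transcriber = Callable[[str], Tuple[str, float]]


def pseudo_label(
    untranscribed_ids: Iterable[str],
    transcribe: Transcriber,
    min_confidence: float = 0.0,
) -> List[Utterance]:
    """Transcribe untranscribed audio with an existing (seed) recogniser."""
    labelled = []
    for audio_id in untranscribed_ids:
        text, confidence = transcribe(audio_id)
        # Assumption: drop empty hypotheses and, optionally, low-confidence ones.
        if text and confidence >= min_confidence:
            labelled.append(Utterance(audio_id, text, manual=False))
    return labelled


def build_training_set(
    manual: Iterable[Utterance],
    untranscribed_ids: Iterable[str],
    transcribe: Transcriber,
    min_confidence: float = 0.0,
) -> List[Utterance]:
    """Pool manual transcripts with automatic ones for acoustic model retraining."""
    return list(manual) + pseudo_label(untranscribed_ids, transcribe, min_confidence)
```

In this picture, `transcribe` could wrap either one of the bilingual systems or the five-lingual system. The latter needs no prior knowledge of which language pair an utterance contains, which is the flexibility the abstract points to.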


