Multilingual and Unsupervised Subword Modeling for Zero-Resource Languages

11/09/2018
by Enno Hermann, et al.

Unsupervised subword modeling aims to learn low-level representations of speech audio in "zero-resource" settings: that is, without using transcriptions or other resources from the target language (such as text corpora or pronunciation dictionaries). A good representation should capture phonetic content and abstract away from other types of variability, such as speaker differences and channel noise. Previous work in this area has primarily focused on learning from target language data only, and has been evaluated only intrinsically. Here we directly compare multiple methods, including some that use only target language speech data and some that use transcribed speech from other (non-target) languages, and we evaluate using two intrinsic measures as well as on a downstream unsupervised word segmentation and clustering task. We find that combining two existing target-language-only methods yields better features than either method alone. Nevertheless, even better results are obtained by extracting target language bottleneck features using a model trained on other languages. Cross-lingual training using just one other language is enough to provide this benefit, but multilingual training helps even more. In addition to these results, which hold across both intrinsic measures and the extrinsic task, we discuss the qualitative differences between the different types of learned features.
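The paper's best-performing approach extracts target-language bottleneck features from a supervised model trained on other languages. A minimal numpy sketch of that idea is below: a phone classifier with a narrow hidden ("bottleneck") layer is trained on transcribed non-target languages, and at extraction time the network is run only up to the bottleneck, whose activations serve as the features. All dimensions, weights, and names here are illustrative assumptions, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative dimensions (assumptions, not from the paper): 39-dim acoustic
# input, one wide hidden layer, a narrow 40-dim bottleneck, and a softmax
# over a pooled multilingual phone inventory.
D_IN, D_HID, D_BN, N_PHONES = 39, 1024, 40, 120

# Stand-in random weights; in practice these would come from supervised
# training on transcribed speech from the non-target languages.
W1 = rng.standard_normal((D_IN, D_HID)) * 0.01
W2 = rng.standard_normal((D_HID, D_BN)) * 0.01
W3 = rng.standard_normal((D_BN, N_PHONES)) * 0.01  # used only during training

def bottleneck_features(frames):
    """Map acoustic frames of shape (T, D_IN) to features of shape (T, D_BN).

    The phone-classification output layer (W3) is discarded at extraction
    time: we stop at the bottleneck and return its activations.
    """
    h = np.tanh(frames @ W1)
    return np.tanh(h @ W2)

# Extract features for 100 frames of (synthetic) target-language audio.
frames = rng.standard_normal((100, D_IN))
feats = bottleneck_features(frames)
print(feats.shape)  # (100, 40)
```

Because only the input features and the frozen network are needed at extraction time, the same model can produce features for any zero-resource target language without transcriptions.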


Related research

- 03/23/2018: Multilingual bottleneck features for subword modeling in zero-resource languages. How can we effectively develop speech technology for languages where no ...
- 08/09/2019: Exploiting Cross-Lingual Speaker and Phonetic Diversity for Unsupervised Subword Modeling. This research addresses the problem of acoustic modeling of low-resource...
- 03/30/2022: Probing phoneme, language and speaker information in unsupervised speech representations. Unsupervised models of representations based on Contrastive Predictive C...
- 06/24/2021: Multilingual transfer of acoustic word embeddings improves when training on languages related to the target zero-resource language. Acoustic word embedding models map variable duration speech segments to ...
- 02/05/2017: An Empirical Evaluation of Zero Resource Acoustic Unit Discovery. Acoustic unit discovery (AUD) is a process of automatically identifying ...
- 03/23/2022: Dynamically Refined Regularization for Improving Cross-corpora Hate Speech Detection. Hate speech classifiers exhibit substantial performance degradation when...
- 05/11/2020: Luganda Text-to-Speech Machine. In Uganda, Luganda is the most spoken native language. It is used for co...
