Cross-Lingual Machine Speech Chain for Javanese, Sundanese, Balinese, and Bataks Speech Recognition and Synthesis

11/04/2020
by Sashi Novitasari, et al.

Although more than seven hundred ethnic languages are spoken in Indonesia, the technology available to support communication within indigenous communities, and between them and people outside their villages, remains limited. As a result, indigenous communities still face isolation due to cultural barriers, and languages continue to disappear. Speech-to-speech translation (S2ST) technology is one approach to overcoming language barriers and easing communication. However, S2ST systems require machine translation (MT), automatic speech recognition (ASR), and text-to-speech synthesis (TTS), which rely heavily on supervised training and on a broad set of language resources that can be difficult to collect from ethnic communities. Recently, a machine speech chain mechanism was proposed to enable ASR and TTS to assist each other in semi-supervised learning. The framework was initially implemented only in monolingual settings. In this study, we focus on developing speech recognition and synthesis for four Indonesian ethnic languages: Javanese, Sundanese, Balinese, and Bataks. We first train ASR and TTS for standard Indonesian separately with supervised training. We then develop ASR and TTS for the ethnic languages by using the Indonesian ASR and TTS in a cross-lingual machine speech chain framework with only text or only speech data, removing the need for paired speech-text data in those ethnic languages.
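
The way unpaired data can drive such a chain may be easier to see in a small sketch. The Python below is illustrative only and is not the authors' implementation: `chain_step`, the toy models, and the loss functions are hypothetical names, and the choice of which model is updated in each branch follows the commonly described speech chain loop rather than anything stated in this abstract.

```python
from typing import Callable, List, Optional, Tuple

Speech = List[float]  # stand-in for acoustic features (assumption)
Text = str


def chain_step(
    asr: Callable[[Speech], Text],
    tts: Callable[[Text], Speech],
    text_loss: Callable[[Text, Text], float],
    speech_loss: Callable[[Speech, Speech], float],
    speech: Optional[Speech] = None,
    text: Optional[Text] = None,
) -> Tuple[str, float]:
    """Return (model to update, reconstruction loss) for one unpaired sample."""
    if (speech is None) == (text is None):
        raise ValueError("provide exactly one of speech or text")
    if text is not None:
        # Text only: TTS synthesizes speech, ASR tries to recover the text.
        synthesized = tts(text)
        return "asr", text_loss(asr(synthesized), text)
    # Speech only: ASR transcribes, TTS tries to reconstruct the speech.
    transcript = asr(speech)
    return "tts", speech_loss(tts(transcript), speech)


# Toy stand-ins so the sketch runs; real systems would be neural models.
def toy_tts(text: Text) -> Speech:
    return [float(ord(c)) for c in text]


def toy_asr(speech: Speech) -> Text:
    return "".join(chr(int(round(x))) for x in speech)


def char_error(hyp: Text, ref: Text) -> float:
    # Position-wise mismatches plus length difference (not a true edit distance).
    return float(sum(a != b for a, b in zip(hyp, ref)) + abs(len(hyp) - len(ref)))


def mse(hyp: Speech, ref: Speech) -> float:
    return sum((x - y) ** 2 for x, y in zip(hyp, ref)) / max(len(ref), 1)


text_only = chain_step(toy_asr, toy_tts, char_error, mse, text="halo")
speech_only = chain_step(toy_asr, toy_tts, char_error, mse, speech=[104.0, 97.0])
```

In the cross-lingual setting described above, `asr` and `tts` would presumably start from the models trained on standard Indonesian, and each call would consume one text-only or speech-only sample from an ethnic language.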


