Autosegmental Neural Nets: Should Phones and Tones be Synchronous or Asynchronous?

07/28/2020
by   Jialu Li, et al.
Phones, the segmental units of the International Phonetic Alphabet (IPA), are used for lexical distinctions in most human languages; tones, the suprasegmental units of the IPA, are used in perhaps 70%. Many previous studies have explored cross-lingual adaptation of automatic speech recognition (ASR) phone models, but few have explored the multilingual and cross-lingual transfer of synchronization between phones and tones. In this paper, we test four Connectionist Temporal Classification (CTC)-based acoustic models that differ in the degree of synchrony they impose between phones and tones. Models are trained and tested multilingually in three languages, then adapted and tested cross-lingually in a fourth. Both synchronous and asynchronous models are effective in both multilingual and cross-lingual settings. Synchronous models achieve a lower error rate on the joint phone+tone tier, but asynchronous training results in a lower tone error rate.

research
11/27/2017

Multilingual Training and Cross-lingual Adaptation on CTC-based Acoustic Model

Phoneme-based multilingual training and different cross-lingual adaptati...
research
02/25/2022

A Survey of Multilingual Models for Automatic Speech Recognition

Although Automatic Speech Recognition (ASR) systems have achieved human-...
research
10/07/2021

Magic dust for cross-lingual adaptation of monolingual wav2vec-2.0

We propose a simple and effective cross-lingual transfer learning method...
research
06/15/2022

Exploiting Cross-domain And Cross-Lingual Ultrasound Tongue Imaging Features For Elderly And Dysarthric Speech Recognition

Articulatory features are inherently invariant to acoustic signal distor...
research
12/13/2016

Performance Improvements of Probabilistic Transcript-adapted ASR with Recurrent Neural Network and Language-specific Constraints

Mismatched transcriptions have been proposed as a means to acquire probab...
research
06/24/2020

Unsupervised Cross-lingual Representation Learning for Speech Recognition

This paper presents XLSR which learns cross-lingual speech representatio...
research
12/17/2020

The effectiveness of unsupervised subword modeling with autoregressive and cross-lingual phone-aware networks

This study addresses unsupervised subword modeling, i.e., learning acous...
