Differentiable Allophone Graphs for Language-Universal Speech Recognition

07/24/2021
by Brian Yan, et al.

Building language-universal speech recognition systems entails producing phonological units of spoken sound that can be shared across languages. While speech annotations at the language-specific phoneme or surface levels are readily available, annotations at a universal phone level are relatively rare and difficult to produce. In this work, we present a general framework to derive phone-level supervision from only phonemic transcriptions and phone-to-phoneme mappings with learnable weights, represented using weighted finite-state transducers, which we call differentiable allophone graphs. By training multilingually, we build a universal phone-based speech recognition model with interpretable probabilistic phone-to-phoneme mappings for each language. These phone-based systems with learned allophone graphs can be used by linguists to document new languages, build phone-based lexicons that capture rich pronunciation variations, and re-evaluate the allophone mappings of seen languages. We demonstrate these benefits of our proposed framework with a system trained on 7 diverse languages.
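To make the idea of a phone-to-phoneme mapping with learnable weights concrete, the following is a minimal sketch and not the authors' implementation. It replaces the weighted finite-state transducer with a dense masked matrix, and the toy inventory, the mask, the per-phone softmax normalization, and the function names are all assumptions made for illustration.

```python
import numpy as np

# Hypothetical toy inventory (not taken from the paper).
PHONES = ["t", "tʰ", "ɾ", "d"]
PHONEMES = ["/t/", "/d/"]

# MASK[i, j] = 1 if phone i is allowed to realize phoneme j in this language.
# Here the flap [ɾ] is an allophone of both /t/ and /d/ (assumed example).
MASK = np.array([[1, 0],
                 [1, 0],
                 [1, 1],
                 [0, 1]], dtype=float)


def softmax(x, axis=-1):
    """Numerically stable softmax; entries of -inf receive probability 0."""
    shifted = x - x.max(axis=axis, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=axis, keepdims=True)


def phone_to_phoneme_probs(weights, mask):
    """Turn free weights into a row-stochastic phone-to-phoneme matrix.

    Assumption: each phone distributes its mass over the phonemes the mask
    allows. Disallowed arcs are set to -inf so they get zero probability.
    """
    logits = np.where(mask > 0, weights, -np.inf)
    return softmax(logits, axis=1)


def phoneme_posteriors(phone_post, weights, mask):
    """Map per-frame phone posteriors (T x P) to phoneme posteriors (T x Q)."""
    return phone_post @ phone_to_phoneme_probs(weights, mask)


# With all weights at zero, the ambiguous phone splits evenly between phonemes.
weights = np.zeros_like(MASK)
phone_post = np.array([[0.1, 0.2, 0.4, 0.3]])  # one frame, sums to 1
phoneme_post = phoneme_posteriors(phone_post, weights, MASK)
```

In a trainable system, `weights` would be parameters updated by backpropagating a phoneme-level loss through this mapping, so that only phonemic transcriptions are needed as supervision. The learned rows of the matrix can then be read as probabilistic allophone mappings for the language.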


research
04/02/2021

Tusom2021: A Phonetically Transcribed Speech Dataset from an Endangered Language for Universal Phone Recognition Experiments

There is growing interest in ASR systems that can recognize phones in a ...
research
01/26/2022

Discovering Phonetic Inventories with Crosslingual Automatic Speech Recognition

The high cost of data acquisition makes Automatic Speech Recognition (AS...
research
05/17/2023

Boosting Local Spectro-Temporal Features for Speech Analysis

We introduce the problem of phone classification in the context of speec...
research
07/15/2018

Syllabification by Phone Categorization

Syllables play an important role in speech synthesis, speech recognition...
research
04/17/2020

AlloVera: A Multilingual Allophone Database

We introduce a new resource, AlloVera, which provides mappings from 218 ...
research
04/04/2021

Phoneme Recognition through Fine Tuning of Phonetic Representations: a Case Study on Luhya Language Varieties

Models pre-trained on multiple languages have shown significant promise ...
research
08/23/2019

Multilingual and Multimode Phone Recognition System for Indian Languages

The aim of this paper is to develop a flexible framework capable of auto...
