Boosting End-to-End Multilingual Phoneme Recognition through Exploiting Universal Speech Attributes Constraints

09/16/2023
by Hao Yen, et al.

We propose a first step toward multilingual end-to-end automatic speech recognition (ASR) by integrating knowledge about speech articulators. The key idea is to leverage a rich set of fundamental units that can be defined "universally" across all spoken languages, referred to as speech attributes, namely manner and place of articulation. Specifically, several deterministic attribute-to-phoneme mapping matrices are constructed from a predefined universal attribute inventory; these matrices project the knowledge-rich articulatory attribute logits into output phoneme logits. The mapping imposes knowledge-based constraints that limit inconsistency with acoustic-phonetic evidence in the integrated prediction. Combined with phoneme recognition, our phone recognizer can infer from both attribute and phoneme information. The proposed joint multilingual model is evaluated on phoneme recognition. In multilingual experiments over 6 languages on the benchmark datasets LibriSpeech and CommonVoice, our proposed solution outperforms conventional multilingual approaches with a relative improvement of 6.85%, and it also performs much better than a monolingual model. Further analysis conclusively demonstrates that the proposed solution eliminates phoneme predictions that are inconsistent with attributes.
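
The attribute-to-phoneme projection described in the abstract can be illustrated with a small sketch. Everything below is an assumption made for illustration and is not taken from the paper: the six-phoneme inventory, the helper names `build_mapping` and `combine_logits`, and in particular the choice to add the projected attribute logits to the phoneme logits. The abstract only states that deterministic mapping matrices project attribute logits into phoneme logits.

```python
import numpy as np

# Toy inventory (hypothetical; the paper's universal inventory is larger).
PHONEMES = ["p", "t", "k", "m", "n", "s"]
MANNERS = ["stop", "nasal", "fricative"]
PLACES = ["bilabial", "alveolar", "velar"]

# Standard articulatory descriptions of the six consonants above.
MANNER_OF = {"p": "stop", "t": "stop", "k": "stop",
             "m": "nasal", "n": "nasal", "s": "fricative"}
PLACE_OF = {"p": "bilabial", "t": "alveolar", "k": "velar",
            "m": "bilabial", "n": "alveolar", "s": "alveolar"}


def build_mapping(attributes, phoneme_to_attr):
    """Binary (num_attributes x num_phonemes) matrix: entry [i, j] is 1
    when phoneme j carries attribute i, so each column has a single 1."""
    mapping = np.zeros((len(attributes), len(PHONEMES)))
    for j, phoneme in enumerate(PHONEMES):
        mapping[attributes.index(phoneme_to_attr[phoneme]), j] = 1.0
    return mapping


MANNER_MAP = build_mapping(MANNERS, MANNER_OF)
PLACE_MAP = build_mapping(PLACES, PLACE_OF)


def combine_logits(phoneme_logits, manner_logits, place_logits):
    """Project attribute logits into phoneme space and add them to the
    phoneme logits (additive fusion is an assumption of this sketch)."""
    return (phoneme_logits
            + manner_logits @ MANNER_MAP
            + place_logits @ PLACE_MAP)


# One frame: the phoneme head slightly prefers "k", while the attribute
# heads indicate an alveolar stop.
phoneme_logits = np.array([0.0, 1.0, 1.2, 0.0, 0.0, 0.0])
manner_logits = np.array([2.0, 0.0, 0.0])   # stop
place_logits = np.array([0.0, 2.0, 0.0])    # alveolar

combined = combine_logits(phoneme_logits, manner_logits, place_logits)
phoneme_only = PHONEMES[int(np.argmax(phoneme_logits))]
with_attributes = PHONEMES[int(np.argmax(combined))]
print(phoneme_only, with_attributes)
```

In this toy frame the phoneme head alone picks "k", which conflicts with the alveolar place evidence; after the projected attribute logits are added, the prediction becomes "t", the only phoneme consistent with both attributes.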


