Exploiting Cross Domain Acoustic-to-articulatory Inverted Features For Disordered Speech Recognition

03/19/2022
by   Shujie Hu, et al.

Articulatory features are inherently invariant to acoustic signal distortion and have been successfully incorporated into automatic speech recognition (ASR) systems for normal speech. Their practical application to disordered speech recognition is often limited by the difficulty of collecting such specialist data from impaired speakers. This paper presents a cross-domain acoustic-to-articulatory (A2A) inversion approach that uses the parallel acoustic-articulatory data of the 15-hour TORGO corpus for model training; the resulting model is then cross-domain adapted to the 102.7-hour UASpeech corpus to produce articulatory features. Mixture density network based neural A2A inversion models were used. A cross-domain feature adaptation network was also used to reduce the acoustic mismatch between the TORGO and UASpeech data. On both tasks, incorporating the A2A generated articulatory features consistently outperformed the baseline hybrid DNN/TDNN, CTC and Conformer based end-to-end systems constructed using acoustic features only. The best multi-modal system, incorporating the video modality and the cross-domain articulatory features as well as data augmentation and learning hidden unit contributions (LHUC) speaker adaptation, produced the lowest published word error rate (WER) of 24.82% on the 16 dysarthric speakers of the benchmark UASpeech task.
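
For readers unfamiliar with the metric quoted above, WER is conventionally the word-level edit distance (substitutions, deletions and insertions) between a reference transcript and the recognizer output, divided by the number of reference words. The following is a minimal sketch under that standard definition, assuming simple whitespace tokenization; the function name `word_error_rate` is hypothetical and is not taken from the paper, whose scoring tool and text normalization may differ.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()   # assumption: whitespace tokenization
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr

    return prev[-1] / len(ref)
```

Under this definition a WER of 24.82% corresponds to roughly one word error for every four reference words, and the value can exceed 100% when the hypothesis contains many insertions.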


research
06/15/2022

Exploiting Cross-domain And Cross-Lingual Ultrasound Tongue Imaging Features For Elderly And Dysarthric Speech Recognition

Articulatory features are inherently invariant to acoustic signal distor...
research
02/28/2023

Exploring Self-supervised Pre-trained ASR Models For Dysarthric and Elderly Speech Recognition

Automatic recognition of disordered and elderly speech remains a highly ...
research
01/24/2022

Variational Auto-Encoder Based Variability Encoding for Dysarthric Speech Recognition

Dysarthric speech recognition is a challenging task due to acoustic vari...
research
06/22/2020

Articulatory-WaveNet: Autoregressive Model For Acoustic-to-Articulatory Inversion

This paper presents Articulatory-WaveNet, a new approach for acoustic-to...
research
06/23/2022

Conformer Based Elderly Speech Recognition System for Alzheimer's Disease Detection

Early diagnosis of Alzheimer's disease (AD) is crucial in facilitating p...
research
11/24/2020

Synth2Aug: Cross-domain speaker recognition with TTS synthesized speech

In recent years, Text-To-Speech (TTS) has been used as a data augmentati...
research
03/19/2018

Acoustic feature learning cross-domain articulatory measurements

Previous work has shown that it is possible to improve speech recognitio...
