Articulatory Features for ASR of Pathological Speech

07/28/2018
by Emre Yilmaz, et al.

In this work, we investigate the joint use of articulatory and acoustic features for automatic speech recognition (ASR) of pathological speech. Despite long-standing efforts to build speaker- and text-independent ASR systems for people with dysarthria, the performance of state-of-the-art systems is still considerably lower on this type of speech than on normal speech. The most prominent reason for the inferior performance is the high variability of pathological speech, which is characterized by spectrotemporal deviations caused by articulatory impairments of various etiologies. To cope with this high variation, we propose to use speech representations that combine articulatory information with acoustic properties. A designated acoustic model, namely a fused-feature-map convolutional neural network (fCNN), which performs frequency convolution on acoustic features and time convolution on articulatory features, is trained and tested on a Dutch and a Flemish pathological speech corpus. The ASR performance of the fCNN-based system using joint features is compared to that of other neural network architectures, such as conventional CNNs and time-frequency convolutional networks (TFCNNs), in several training scenarios.
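The two-stream idea described in the abstract can be illustrated with a minimal sketch: one convolution slides along the frequency axis of the acoustic features, another slides along the time axis of the articulatory features, and the resulting feature maps are fused per frame. Everything concrete here is an assumption made for illustration and is not taken from the paper: the function names, the toy input shapes, the kernel values, the ReLU nonlinearity, and fusion by simple concatenation. A real fCNN would also learn its kernels and typically add pooling and fully connected layers.

```python
# Illustrative sketch only. Kernel values, input shapes, and the
# concatenation-based fusion are assumptions, not the paper's configuration.

def conv_along_frequency(features, kernel):
    """Valid 1-D cross-correlation across the feature axis of each frame."""
    k = len(kernel)
    return [
        [sum(frame[i + j] * kernel[j] for j in range(k))
         for i in range(len(frame) - k + 1)]
        for frame in features
    ]


def conv_along_time(features, kernel):
    """Valid 1-D cross-correlation across frames, per feature dimension."""
    k = len(kernel)
    n_frames, n_dims = len(features), len(features[0])
    return [
        [sum(features[t + j][d] * kernel[j] for j in range(k))
         for d in range(n_dims)]
        for t in range(n_frames - k + 1)
    ]


def relu(feature_map):
    """Element-wise rectifier applied to a 2-D feature map."""
    return [[max(0.0, v) for v in row] for row in feature_map]


def fuse(acoustic_map, articulatory_map):
    """Concatenate per-frame feature maps, trimming to the shorter time span.

    Trimming is a simplification; padding the time convolution would keep
    the two streams frame-aligned.
    """
    n = min(len(acoustic_map), len(articulatory_map))
    return [acoustic_map[t] + articulatory_map[t] for t in range(n)]


# Toy inputs: 4 frames x 5 filterbank bands, and 4 frames x 3 articulatory dims.
acoustic = [[float(t + f) for f in range(5)] for t in range(4)]
articulatory = [[float(t * d) for d in range(3)] for t in range(4)]

freq_map = relu(conv_along_frequency(acoustic, [-1.0, 0.0, 1.0]))
time_map = relu(conv_along_time(articulatory, [0.5, 0.5]))
fused = fuse(freq_map, time_map)

print(len(fused), len(fused[0]))  # frames x fused feature dimension
```

Keeping the two convolutions separate reflects the motivation stated in the abstract: acoustic features are convolved across frequency, while articulatory features are convolved across time, and only the resulting maps are combined.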


research
05/16/2019

Articulatory and bottleneck features for speaker-independent ASR of dysarthric speech

The rapid population aging has stimulated the development of assistive d...
research
02/08/2022

Enhancing ASR for Stuttered Speech with Limited Data Using Detect and Pass

It is estimated that around 70 million people worldwide are affected by ...
research
08/09/2021

Time-Frequency Localization Using Deep Convolutional Maxout Neural Network in Persian Speech Recognition

In this paper, a CNN-based structure for the time-frequency localization...
research
02/11/2020

CGCNN: Complex Gabor Convolutional Neural Network on raw speech

Convolutional Neural Networks (CNN) have been used in Automatic Speech R...
research
04/23/2018

ASR Performance Prediction on Unseen Broadcast Programs using Convolutional Neural Networks

In this paper, we address a relatively new task: prediction of ASR perfo...
research
01/14/2023

Acoustic correlates of the syllabic rhythm of speech: Modulation spectrum or local features of the temporal envelope

The syllable is a perceptually salient unit in speech. Since both the sy...
research
11/27/2016

Invariant Representations for Noisy Speech Recognition

Modern automatic speech recognition (ASR) systems need to be robust unde...
