Recent Progress in the CUHK Dysarthric Speech Recognition System

01/15/2022
by   Shansong Liu, et al.

Despite the rapid progress of automatic speech recognition (ASR) technologies in the past few decades, recognition of disordered speech remains a highly challenging task. Disordered speech presents a wide spectrum of challenges to current data-intensive deep neural network (DNN) based ASR technologies that predominantly target normal speech. This paper presents recent research efforts at the Chinese University of Hong Kong (CUHK) to improve the performance of disordered speech recognition systems on UASpeech, the largest publicly available dysarthric speech corpus. A set of novel modelling techniques, including neural architecture search, data augmentation using spectro-temporal perturbation, model-based speaker adaptation, and cross-domain generation of visual features within an audio-visual speech recognition (AVSR) system framework, was employed to address the above challenges. The combination of these techniques produced the lowest published word error rate (WER) of 25.21%, a reduction of 5.4% absolute over an earlier speech recognition system featuring a 6-way DNN system combination and cross adaptation of systems trained on out-of-domain normal speech data. Bayesian model adaptation further allows rapid adaptation to individual dysarthric speakers using as little as 3.06 seconds of speech. The efficacy of these techniques was further demonstrated on the CUDYS Cantonese dysarthric speech recognition task.
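
For readers unfamiliar with the metric quoted in the abstract, the following is a minimal sketch of how a word error rate is commonly computed: the word-level edit distance (substitutions, deletions, and insertions) between a reference transcript and a recognizer hypothesis, divided by the number of reference words. The function name `word_error_rate` and the example sentences are hypothetical and chosen only for illustration. This is not the scoring tool used by the authors, and it assumes simple whitespace tokenization with no text normalization.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return the WER in percent between two whitespace-tokenized transcripts.

    Illustrative sketch only: real scoring pipelines usually normalize
    case and punctuation before alignment.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion of a reference word
                curr[j - 1] + 1,                  # insertion of a hypothesis word
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr

    return 100.0 * prev[-1] / len(ref)


if __name__ == "__main__":
    # Hypothetical transcripts: one substitution and one deletion
    # against a six-word reference.
    wer = word_error_rate("the cat sat on the mat", "the cat sit on mat")
    print(f"WER: {wer:.2f}%")
```

Because the denominator is the reference length, a hypothesis with many insertions can yield a WER above 100%.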

