DDKtor: Automatic Diadochokinetic Speech Analysis

06/29/2022
by Yael Segal, et al.

Diadochokinetic speech tasks (DDK), in which participants repeatedly produce syllables, are commonly used as part of the assessment of speech motor impairments. These assessments rely on manual analyses that are time-intensive and subjective and that provide only a coarse-grained picture of speech. This paper presents two deep neural network models that automatically segment consonants and vowels from unannotated, untranscribed speech. Both models work on the raw waveform and use convolutional layers for feature extraction. The first model is based on an LSTM classifier followed by fully connected layers, while the second model adds more convolutional layers followed by fully connected layers. The segmentations predicted by the models are used to obtain measures of speech rate and sound duration. Results on a dataset of young healthy individuals show that our LSTM model outperforms the current state-of-the-art systems and performs comparably to trained human annotators. Moreover, the LSTM model also performs comparably to trained human annotators when evaluated on an unseen dataset of older individuals with Parkinson's disease.
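
As a rough illustration of how speech rate and sound duration could be derived from a consonant/vowel segmentation, here is a minimal Python sketch. It is not the authors' code. The `Segment` class, the `"C"`/`"V"` labels, the `ddk_measures` function, and the rule that each vowel counts as one syllable are all assumptions made for this example; the abstract does not give the exact definitions used in the paper.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Segment:
    """One predicted segment (hypothetical format, not from the paper)."""
    label: str    # "C" for consonant, "V" for vowel (assumed labels)
    start: float  # onset time in seconds
    end: float    # offset time in seconds

    @property
    def duration(self) -> float:
        return self.end - self.start


def ddk_measures(segments):
    """Compute a speech rate and mean sound durations from segments.

    Assumption: every vowel segment marks exactly one syllable, so the
    rate is the vowel count divided by the time spanned by all segments.
    """
    vowels = [s for s in segments if s.label == "V"]
    consonants = [s for s in segments if s.label == "C"]
    if not vowels:
        raise ValueError("need at least one vowel segment")

    # Time from the first onset to the last offset, pauses included.
    total_time = max(s.end for s in segments) - min(s.start for s in segments)

    return {
        "rate_syll_per_s": len(vowels) / total_time,
        "mean_vowel_s": mean(s.duration for s in vowels),
        "mean_consonant_s": (
            mean(s.duration for s in consonants) if consonants else None
        ),
    }


# Two made-up syllables, each a consonant followed by a vowel.
example = [
    Segment("C", 0.00, 0.05),
    Segment("V", 0.05, 0.25),
    Segment("C", 0.25, 0.30),
    Segment("V", 0.30, 0.50),
]
measures = ddk_measures(example)
print(measures)
```

The dictionary keys are illustrative only. In practice the segment list would come from a segmentation model's output, and the rate definition should follow whatever clinical convention is being used.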

research
03/07/2017

Raw Waveform-based Speech Enhancement by Fully Convolutional Networks

This study proposes a fully convolutional network (FCN) model for raw wa...
research
12/10/2021

An Ensemble 1D-CNN-LSTM-GRU Model with Data Augmentation for Speech Emotion Recognition

In this paper, we propose an ensemble of deep neural networks along with...
research
11/11/2018

A Multi-modal Deep Neural Network approach to Bird-song identification

We present a multi-modal Deep Neural Network (DNN) approach for bird son...
research
12/17/2018

Fully Convolutional Speech Recognition

Current state-of-the-art speech recognition systems build on recurrent n...
research
01/03/2020

Question Type Classification Methods Comparison

The paper presents a comparative study of state-of-the-art approaches fo...
research
10/26/2017

Lip2AudSpec: Speech reconstruction from silent lip movements video

In this study, we propose a deep neural network for reconstructing intel...
research
12/13/2019

Seizure Prediction Using Bidirectional LSTM

Approximately, 50 million people in the world are affected by epilepsy. ...
