Psychophysiology-aided Perceptually Fluent Speech Analysis of Children Who Stutter

11/16/2022
by Yi Xiao, et al.

This first-of-its-kind paper presents PASAD, a novel approach that detects changes in the perceptually fluent speech acoustics of young children. In particular, analyzing perceptually fluent speech makes it possible to identify the speech-motor-control (SMC) factors that are considered to be the underlying cause of stuttering disfluencies. Recent studies indicate that the speech production of young children, especially those who stutter, may be adversely affected by situational physiological arousal. A major contribution of this paper is leveraging the speaker's situational physiological responses in real time to analyze the speech signal effectively. PASAD adapts a Hyper-Network structure that uses physiological parameters to extract temporal speech importance information. In addition, a novel non-local acoustic spectrogram feature extraction network identifies meaningful acoustic attributes. Finally, a sequential network combines the acoustic attributes with the extracted temporal speech importance for classification. We collected speech and physiological sensing data from 73 preschool-age children, including children who stutter (CWS) and children who do not stutter (CWNS), under different conditions. PASAD's architecture enables visualizing the speech attributes that are distinct to a CWS's fluent speech and mapping them to the speaker's respective SMC factors (i.e., speech articulators). The extracted knowledge can enhance understanding of children's fluent speech, SMC, and stuttering development. Our comprehensive evaluation shows that PASAD outperforms state-of-the-art multi-modal baseline approaches in different conditions; is expressive and adaptive to the speaker's speech and physiology; is generalizable and robust; and is real-time executable on mobile and scalable devices.
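The abstract does not give PASAD's layers, dimensions, or equations, so the following is only an illustrative sketch of the general idea of a hypernetwork that turns physiological parameters into temporal importance over speech frames. Every name here (`make_hypernetwork`, `generate_scoring_weights`, `temporal_importance`, `weighted_pool`), the linear form of the hypernetwork, the softmax normalization, and all sizes are assumptions made for illustration, not the authors' implementation.

```python
import math
import random


def make_hypernetwork(physio_dim, frame_dim, seed=0):
    """Create fixed random parameters (W, b) of a linear hypernetwork.

    Assumption: a single linear layer stands in for whatever network
    the paper actually uses.
    """
    rng = random.Random(seed)
    W = [[rng.uniform(-0.5, 0.5) for _ in range(physio_dim)]
         for _ in range(frame_dim)]
    b = [rng.uniform(-0.5, 0.5) for _ in range(frame_dim)]
    return W, b


def generate_scoring_weights(hyper, physio):
    """Hypernetwork step: map a physiological vector to the weights
    of a per-frame scoring layer."""
    W, b = hyper
    return [sum(w * p for w, p in zip(row, physio)) + bias
            for row, bias in zip(W, b)]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def temporal_importance(frames, scoring_weights):
    """Score each acoustic frame with the generated weights and
    normalize the scores over time."""
    scores = [sum(w * x for w, x in zip(scoring_weights, frame))
              for frame in frames]
    return softmax(scores)


def weighted_pool(frames, importance):
    """Importance-weighted average of the frame features."""
    dim = len(frames[0])
    return [sum(a * frame[d] for a, frame in zip(importance, frames))
            for d in range(dim)]


# Toy inputs: 3 acoustic frames with 4 features each, and a
# 2-dimensional physiological vector (all values are made up).
frames = [[0.1, 0.4, 0.2, 0.9],
          [0.8, 0.1, 0.5, 0.3],
          [0.3, 0.7, 0.6, 0.2]]
physio = [0.6, 0.2]

hyper = make_hypernetwork(physio_dim=2, frame_dim=4)
weights = generate_scoring_weights(hyper, physio)
importance = temporal_importance(frames, weights)
pooled = weighted_pool(frames, importance)
```

In this sketch the physiological vector never touches the acoustic features directly. It only determines how frames are scored, which is the defining property of a hypernetwork: one network generates the weights of another. A downstream sequential classifier, as described in the abstract, would consume the frame features together with `importance`.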


