Weak-Supervised Dysarthria-invariant Features for Spoken Language Understanding using an FHVAE and Adversarial Training

10/24/2022
by   Jinzi Qi, et al.

The scarcity of training data and the large speaker variation in dysarthric speech lead to poor accuracy and poor speaker generalization in spoken language understanding systems for dysarthric speech. We work at the level of the speech features to improve model generalization when only limited dysarthric data are available. Factorized Hierarchical Variational Auto-Encoders (FHVAE) trained without supervision have proven effective at disentangling content and speaker representations. Earlier work showed, however, that dysarthria is reflected in both of these representations. Here, we add adversarial training to bridge the gap between the control and dysarthric speech data domains, and we extract dysarthria-invariant and speaker-invariant features using weak supervision. The extracted features are evaluated on a Spoken Language Understanding task and yield higher accuracy on unseen speakers with more severe dysarthria than features from the basic FHVAE model or plain filterbanks.


