A Few-Shot Approach to Dysarthric Speech Intelligibility Level Classification Using Transformers

09/17/2023
by Paleti Nikhil Chowdary et al.

Dysarthria is a speech disorder that hinders communication because of difficulty articulating words. Detecting dysarthria matters because it can inform a treatment plan and help improve a person's quality of life and ability to communicate effectively. Much of the literature has focused on improving ASR systems for dysarthric speech. The objective of the current work is to develop models that accurately classify the presence of dysarthria and also report the intelligibility level from limited data, using a few-shot approach with a transformer model. This work also aims to address the data leakage present in previous studies. Our whisper-large-v2 transformer model, trained on a subset of the UASpeech dataset containing patients with a medium intelligibility level, achieved an accuracy of 85% and a specificity of 0.91. Experimental results also show that the model trained on the 'words' dataset outperformed the models trained on the 'letters' and 'digits' datasets. Moreover, the multiclass model achieved an accuracy of 67%.
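The abstract reports accuracy and specificity for the binary (dysarthric vs. non-dysarthric) classifier. As a reminder of what those two numbers measure, here is a minimal sketch that computes them from label lists. The function name `binary_metrics` and the toy labels are hypothetical illustrations, not taken from the paper, and the label convention (1 = dysarthric, 0 = control) is an assumption.

```python
import math


def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy and specificity for a binary classifier.

    Assumed convention (not from the paper): `positive` marks the
    dysarthric class; every other label is treated as negative.
    """
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive and pred == positive:
            tp += 1
        elif truth == positive:
            fn += 1
        elif pred == positive:
            fp += 1
        else:
            tn += 1

    total = tp + fp + tn + fn
    # Accuracy: fraction of all samples classified correctly.
    accuracy = (tp + tn) / total if total else math.nan
    # Specificity: fraction of true negatives among all actual negatives.
    specificity = tn / (tn + fp) if (tn + fp) else math.nan
    return {"accuracy": accuracy, "specificity": specificity}


# Toy labels for illustration only; these are not results from the paper.
toy_true = [1, 1, 0, 0, 0, 1]
toy_pred = [1, 0, 0, 0, 1, 1]
metrics = binary_metrics(toy_true, toy_pred)
```

Specificity is worth reporting alongside accuracy in a screening setting because it shows how often non-dysarthric speakers are correctly left unflagged.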


