AudVowelConsNet: A Phoneme-Level Based Deep CNN Architecture for Clinical Depression Diagnosis

10/30/2020
by Muhammad Muzammel, et al.

Depression is a common and serious mood disorder that impairs a patient's capacity to function normally in daily tasks. Speech has proven to be a powerful tool in depression diagnosis. Research in psychiatry has concentrated on fine-grained analysis of the word-level speech components that contribute to the manifestation of depression in speech, and has revealed significant variations at the phoneme level in depressed speech. Research in Machine Learning-based automatic recognition of depression from speech, on the other hand, has focused on exploring various acoustic features for detecting depression and its severity level. Few studies have incorporated phoneme-level speech components into automatic assessment systems. In this paper, we propose an Artificial Intelligence (AI) based application for clinical depression recognition and assessment from speech. We investigate the acoustic characteristics of phoneme units, specifically vowels and consonants, for depression recognition via Deep Learning. We present and compare three spectrogram-based Deep Neural Network architectures, trained on consonant units, vowel units, and their fusion, respectively. Our experiments show that the deep-learned consonant-based acoustic characteristics lead to better recognition results than vowel-based ones. The fusion of vowel and consonant speech characteristics through a deep network significantly outperforms the single-space networks as well as state-of-the-art deep learning approaches on the DAIC-WOZ database.
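
As a rough illustration of what working with vowel and consonant phoneme units can involve, the sketch below partitions a phoneme-aligned utterance into vowel and consonant segments. It is not the authors' pipeline: the abstract does not describe the segmentation step, so the ARPAbet label convention, the `(label, start, end)` alignment format, and the function name `split_vowels_consonants` are all assumptions made for this example.

```python
# Hypothetical sketch: the abstract does not specify how phoneme units are
# extracted. We assume alignments are already available as
# (label, start_seconds, end_seconds) tuples using ARPAbet labels,
# for instance produced by a forced aligner.

ARPABET_VOWELS = {
    "AA", "AE", "AH", "AO", "AW", "AY", "EH", "ER",
    "EY", "IH", "IY", "OW", "OY", "UH", "UW",
}


def split_vowels_consonants(alignment):
    """Partition aligned phonemes into vowel and consonant segments.

    Any label that is not an ARPAbet vowel is treated as a consonant, so
    non-speech labels (silence, noise) should be filtered out beforehand.
    """
    vowels, consonants = [], []
    for label, start, end in alignment:
        # Strip ARPAbet stress digits (e.g. "AH0" -> "AH") before lookup.
        base = label.rstrip("012").upper()
        if base in ARPABET_VOWELS:
            vowels.append((label, start, end))
        else:
            consonants.append((label, start, end))
    return vowels, consonants


# Toy alignment for the word "sad" (S AE1 D); the timings are made up.
example = [("S", 0.00, 0.08), ("AE1", 0.08, 0.21), ("D", 0.21, 0.27)]
vowel_segments, consonant_segments = split_vowels_consonants(example)
print(vowel_segments)
print(consonant_segments)
```

Under these assumptions, each group of segments could then be cut from the audio and converted to spectrograms to feed a vowel network and a consonant network separately, in the spirit of the architectures the abstract compares.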

research
04/30/2018

Investigations on End-to-End Audiovisual Fusion

Audiovisual speech recognition (AVSR) is a method to alleviate the adver...
research
07/13/2020

Stutter Diagnosis and Therapy System Based on Deep Learning

Stuttering, also called stammering, is a communication disorder that bre...
research
08/24/2023

Attention-Based Acoustic Feature Fusion Network for Depression Detection

Depression, a common mental disorder, significantly influences individua...
research
08/02/2018

Sequence Discriminative Training for Deep Learning based Acoustic Keyword Spotting

Speech recognition is a sequence prediction problem. Besides employing v...
research
03/09/2019

The Virtual Doctor: An Interactive Artificial Intelligence based on Deep Learning for Non-Invasive Prediction of Diabetes

Artificial intelligence (AI) will pave the way to a new era in medicine....
research
04/02/2019

The Verbal and Non Verbal Signals of Depression -- Combining Acoustics, Text and Visuals for Estimating Depression Level

Depression is a serious medical condition that is suffered by a large nu...
