Multimodal Deep Learning for Mental Disorders Prediction from Audio Speech Samples

09/03/2019
by Habibeh Naderi, et al.

Key features of mental illnesses are reflected in speech. Our research focuses on designing a multimodal deep learning architecture that automatically extracts salient features from recorded speech samples to predict mental disorders such as depression, bipolar disorder, and schizophrenia. We adopt a variety of pre-trained models to extract embeddings from both audio and text segments: BERT, FastText, and Doc2VecC for text representation learning, and WaveNet and VGGish for audio encoding. We also leverage large auxiliary emotion-labeled text and audio corpora to train emotion-specific embeddings, using transfer learning to address the scarcity of annotated multimodal data. These embeddings are then combined into a joint representation in a multimodal fusion layer, and a recurrent neural network finally predicts the mental disorder. Our results show that mental disorders can be predicted with acceptable accuracy through multimodal analysis of clinical interviews.
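As a rough illustration of the fusion and recurrent classification stages described above, the sketch below concatenates per-segment text and audio embeddings, projects them into a joint representation, and runs a GRU over the interview sequence. The embedding dimensions, the concatenation-based fusion, the GRU, and the class set are illustrative assumptions, not details taken from the paper.

```python
# Minimal sketch of a multimodal fusion layer followed by a recurrent
# classifier. Assumptions (not from the paper): concatenation-based fusion,
# a single-layer GRU, and illustrative embedding sizes / class counts.
import torch
import torch.nn as nn

class MultimodalFusionRNN(nn.Module):
    def __init__(self, text_dim=768, audio_dim=128, fused_dim=256,
                 hidden_dim=128, num_classes=4):
        super().__init__()
        # Project concatenated per-segment text + audio embeddings
        # into a joint multimodal representation.
        self.fusion = nn.Sequential(
            nn.Linear(text_dim + audio_dim, fused_dim),
            nn.ReLU(),
            nn.Dropout(0.3),
        )
        # Recurrent network over the sequence of fused segment embeddings.
        self.rnn = nn.GRU(fused_dim, hidden_dim, batch_first=True)
        # Classify the whole interview from the final hidden state.
        self.classifier = nn.Linear(hidden_dim, num_classes)

    def forward(self, text_emb, audio_emb):
        # text_emb:  (batch, segments, text_dim), e.g. BERT sentence vectors
        # audio_emb: (batch, segments, audio_dim), e.g. VGGish segment vectors
        fused = self.fusion(torch.cat([text_emb, audio_emb], dim=-1))
        _, h_n = self.rnn(fused)                # h_n: (1, batch, hidden_dim)
        return self.classifier(h_n.squeeze(0))  # logits over disorder classes

# Example usage with random tensors standing in for pre-trained embeddings.
model = MultimodalFusionRNN()
text = torch.randn(2, 20, 768)   # 2 interviews, 20 segments each
audio = torch.randn(2, 20, 128)
logits = model(text, audio)      # shape: (2, num_classes)
```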


