A Multimodal Approach for Dementia Detection from Spontaneous Speech with Tensor Fusion Layer

11/08/2022
by   Loukas Ilias, et al.
0

Alzheimer's disease (AD) is a progressive neurological disorder, meaning that the symptoms develop gradually throughout the years. It is also the main cause of dementia, which affects memory, thinking skills, and mental abilities. Nowadays, researchers have moved their interest towards AD detection from spontaneous speech, since it constitutes a time-effective procedure. However, existing state-of-the-art works proposing multimodal approaches do not take into consideration the inter- and intra-modal interactions and propose early and late fusion approaches. To tackle these limitations, we propose deep neural networks, which can be trained in an end-to-end trainable way and capture the inter- and intra-modal interactions. Firstly, each audio file is converted to an image consisting of three channels, i.e., log-Mel spectrogram, delta, and delta-delta. Next, each transcript is passed through a BERT model followed by a gated self-attention layer. Similarly, each image is passed through a Swin Transformer followed by an independent gated self-attention layer. Acoustic features are extracted also from each audio file. Finally, the representation vectors from the different modalities are fed to a tensor fusion layer for capturing the inter-modal interactions. Extensive experiments conducted on the ADReSS Challenge dataset indicate that our introduced approaches obtain valuable advantages over existing research initiatives reaching Accuracy and F1-score up to 86.25

READ FULL TEXT

page 1

page 3

research
05/25/2023

Context-Aware Attention Layers coupled with Optimal Transport Domain Adaptation methods for recognizing dementia from spontaneous speech

Alzheimer's disease (AD) constitutes a complex neurocognitive disease an...
research
11/03/2021

A cross-modal fusion network based on self-attention and residual structure for multimodal emotion recognition

The audio-video based multimodal emotion recognition has attracted a lot...
research
10/27/2021

Detecting Dementia from Speech and Transcripts using Transformers

Alzheimer's disease (AD) constitutes a neurodegenerative disease with se...
research
08/28/2023

Multimodal Detection of Social Spambots in Twitter using Transformers

Although not all bots are malicious, the vast majority of them are respo...
research
10/21/2021

Multimodal Learning using Optimal Transport for Sarcasm and Humor Detection

Multimodal learning is an emerging yet challenging research area. In thi...
research
06/07/2018

Multimodal Relational Tensor Network for Sentiment and Emotion Classification

Understanding Affect from video segments has brought researchers from th...
research
04/14/2023

A Byte Sequence is Worth an Image: CNN for File Fragment Classification Using Bit Shift and n-Gram Embeddings

File fragment classification (FFC) on small chunks of memory is essentia...

Please sign up or login with your details

Forgot password? Click here to reset