Context-Aware Attention Layers coupled with Optimal Transport Domain Adaptation methods for recognizing dementia from spontaneous speech

05/25/2023
by Loukas Ilias, et al.

Alzheimer's disease (AD) is a complex neurocognitive disease and the main cause of dementia. Although many studies have targeted the diagnosis of dementia through spontaneous speech, limitations remain. Existing state-of-the-art multimodal approaches train language and acoustic models separately, employ majority-vote approaches, and concatenate the representations of the different modalities either at the input level, i.e., early fusion, or during training. Some of them also employ self-attention layers, which calculate the dependencies between representations without considering contextual information. In addition, no prior work has taken model calibration into consideration. To address these limitations, we propose new methods for detecting AD patients that capture the intra- and cross-modal interactions. First, we convert the audio files into log-Mel spectrograms together with their delta and delta-delta features, creating in this way a three-channel image per audio file. Next, we pass each transcript and image through BERT and DeiT models, respectively. After that, context-based self-attention layers, self-attention layers with a gate model, and optimal transport domain adaptation methods are employed to capture the intra- and inter-modal interactions. Finally, we exploit two methods for fusing the self- and cross-attended features. To account for model calibration, we apply label smoothing, and we evaluate with both performance and calibration metrics. Experiments conducted on the ADReSS Challenge dataset indicate the efficacy of our approaches over existing research initiatives, with our best-performing model reaching Accuracy and F1-score of up to 91.25.
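The abstract names label smoothing as the calibration step but gives no details, so the following is only a minimal sketch of the general technique, not the authors' implementation. The smoothing factor `epsilon=0.1`, the two-class setup, the example logits, and all function names are assumptions made for illustration.

```python
import math

def smooth_labels(label, num_classes, epsilon=0.1):
    """Build a smoothed target distribution for one integer class label.

    Standard label smoothing: every class receives epsilon / num_classes,
    and the true class additionally receives 1 - epsilon.
    epsilon=0.1 is an assumed value, not one taken from the paper.
    """
    target = [epsilon / num_classes] * num_classes
    target[label] += 1.0 - epsilon
    return target

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(logits, target):
    """Cross-entropy between a target distribution and softmax(logits)."""
    probs = softmax(logits)
    return -sum(t * math.log(p) for t, p in zip(target, probs))

# Hypothetical model scores for the two classes (AD, non-AD).
logits = [2.0, -1.0]

hard_target = smooth_labels(0, 2, epsilon=0.0)  # plain one-hot target
soft_target = smooth_labels(0, 2, epsilon=0.1)  # smoothed target

loss_hard = cross_entropy(logits, hard_target)
loss_soft = cross_entropy(logits, soft_target)
```

For a prediction that is already confident and correct, the smoothed loss is larger than the one-hot loss, which is how smoothing discourages over-confident outputs and can improve calibration.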

