Improving Speech Emotion Recognition Through Focus and Calibration Attention Mechanisms

08/21/2022
by Junghun Kim, et al.

Attention has become one of the most commonly used mechanisms in deep learning approaches. The attention mechanism helps a system focus on the critical regions of the feature space; in Speech Emotion Recognition (SER), for example, high-amplitude regions can play an important role. In this paper, we identify misalignments between the attention and the signal amplitude in the existing multi-head self-attention. To improve the attended area, we propose a Focus-Attention (FA) mechanism and a novel Calibration-Attention (CA) mechanism used in combination with multi-head self-attention. Through the FA mechanism, the network can detect the largest-amplitude part of each segment. Through the CA mechanism, the network can modulate the information flow by assigning different weights to each attention head, improving its use of surrounding context. To evaluate the proposed method, we perform experiments on the IEMOCAP and RAVDESS datasets. Experimental results show that the proposed framework significantly outperforms state-of-the-art approaches on both datasets.
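The abstract does not include an implementation, but the described combination of an amplitude-guided Focus-Attention bias on the attention scores and a Calibration-Attention gate that re-weights each head can be sketched in PyTorch. The module below is a minimal illustration of one plausible reading of the abstract: the class name, the log-softmax amplitude bias, and the sigmoid per-head gate are our assumptions, not the authors' released design.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class FocusCalibrationAttention(nn.Module):
    """Multi-head self-attention with an amplitude-based Focus-Attention (FA)
    bias and a Calibration-Attention (CA) per-head gate. This is a sketch of
    one plausible reading of the abstract, not the paper's implementation."""

    def __init__(self, dim: int, num_heads: int = 8):
        super().__init__()
        assert dim % num_heads == 0
        self.h = num_heads
        self.d = dim // num_heads
        self.qkv = nn.Linear(dim, 3 * dim)
        self.out = nn.Linear(dim, dim)
        # CA (assumed form): predict one weight per head from a mean-pooled
        # summary of the input sequence.
        self.calibrate = nn.Sequential(nn.Linear(dim, num_heads), nn.Sigmoid())

    def forward(self, x: torch.Tensor, amplitude: torch.Tensor) -> torch.Tensor:
        # x: (B, T, dim) frame features; amplitude: (B, T) per-frame signal energy.
        B, T, _ = x.shape
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        # Reshape each projection to (B, h, T, d) for per-head attention.
        q, k, v = (t.view(B, T, self.h, self.d).transpose(1, 2) for t in (q, k, v))

        scores = q @ k.transpose(-2, -1) / self.d ** 0.5      # (B, h, T, T)
        # FA (assumed form): add a log-space bias so keys at high-amplitude
        # frames receive more attention mass after the softmax.
        focus = F.log_softmax(amplitude, dim=-1)              # (B, T)
        scores = scores + focus[:, None, None, :]
        attn = scores.softmax(dim=-1)

        heads = attn @ v                                      # (B, h, T, d)
        # CA: scale each head's output by a learned gate in [0, 1],
        # modulating how much each head contributes.
        gate = self.calibrate(x.mean(dim=1))                  # (B, h)
        heads = heads * gate[:, :, None, None]
        return self.out(heads.transpose(1, 2).reshape(B, T, -1))
```

Applied to a batch of frame features, e.g. `FocusCalibrationAttention(dim=256)(x, amplitude)` with `x` of shape (B, T, 256), the FA bias shifts attention toward high-energy frames while the CA gate suppresses less useful heads; both choices here are illustrative stand-ins for the mechanisms named in the abstract.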

Related research

04/24/2019 · A Self-Attentive Emotion Recognition Network
Modern deep learning approaches have achieved groundbreaking performance...

05/13/2020 · Memory Controlled Sequential Self Attention for Sound Recognition
In this paper we investigate the importance of the extent of memory in s...

02/13/2023 · Learning to Scale Temperature in Masked Self-Attention for Image Inpainting
Recent advances in deep generative adversarial networks (GAN) and self-a...

03/03/2023 · DWFormer: Dynamic Window transFormer for Speech Emotion Recognition
Speech emotion recognition is crucial to human-computer interaction. The...

02/27/2023 · DST: Deformable Speech Transformer for Emotion Recognition
Enabled by multi-head self-attention, Transformer has exhibited remarkab...

11/26/2019 · Low Rank Factorization for Compact Multi-Head Self-Attention
Effective representation learning from text has been an active area of r...

05/02/2023 · ARBEx: Attentive Feature Extraction with Reliability Balancing for Robust Facial Expression Learning
In this paper, we introduce a framework ARBEx, a novel attentive feature...
