A Graph Isomorphism Network with Weighted Multiple Aggregators for Speech Emotion Recognition

07/03/2022
by Ying Hu, et al.

Speech emotion recognition (SER) is an essential part of human-computer interaction. In this paper, we propose an SER network based on a Graph Isomorphism Network with Weighted Multiple Aggregators (WMA-GIN), which effectively handles the information confusion that arises when the features of neighbouring nodes are aggregated together in the GIN structure. Moreover, a Full-Adjacent (FA) layer is adopted to alleviate the over-squashing problem, which exists in all Graph Neural Network (GNN) structures, including GIN. Furthermore, a multi-phase attention mechanism and a multi-loss training strategy are employed to avoid losing useful emotional information in the stacked WMA-GIN layers. We evaluate the proposed WMA-GIN on the popular IEMOCAP dataset. The experimental results show that WMA-GIN outperforms other GNN-based methods and is comparable to some advanced non-graph-based methods, achieving an accuracy (UA) of 72.48.
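
The abstract does not spell out the layer equations, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes a standard GIN update, h_v' = MLP((1 + eps) * h_v + AGG(neighbours)), and replaces the single sum aggregator with a softmax-weighted mix of three aggregators. The choice of sum, mean and max, the softmax weighting, the two-layer MLP, and the names `wma_gin_layer`, `agg_logits`, `w1` and `w2` are all assumptions made for illustration.

```python
import numpy as np


def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(x - np.max(x))
    return e / e.sum()


def wma_gin_layer(h, adj, agg_logits, w1, w2, eps=0.0):
    """One GIN-style update with a weighted mix of aggregators (hypothetical).

    h          : (n, d) node features
    adj        : (n, n) binary adjacency matrix without self-loops
    agg_logits : (3,) scores for the sum, mean and max aggregators (assumed learnable)
    w1, w2     : weight matrices of a two-layer MLP (biases omitted for brevity)
    """
    deg = adj.sum(axis=1, keepdims=True)           # (n, 1) neighbour counts
    agg_sum = adj @ h                              # sum over neighbours
    agg_mean = agg_sum / np.maximum(deg, 1.0)      # mean, safe for isolated nodes
    # Max over neighbours: mask non-neighbours with -inf, then reduce.
    masked = np.where(adj[:, :, None] > 0, h[None, :, :], -np.inf)
    agg_max = masked.max(axis=1)
    agg_max = np.where(deg > 0, agg_max, 0.0)      # isolated nodes get zeros

    alpha = softmax(np.asarray(agg_logits, dtype=float))
    mixed = alpha[0] * agg_sum + alpha[1] * agg_mean + alpha[2] * agg_max

    z = (1.0 + eps) * h + mixed                    # GIN-style combination
    return np.maximum(z @ w1, 0.0) @ w2            # two-layer MLP with ReLU


# Tiny usage example: a 3-node path graph 0 - 1 - 2.
h = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
adj = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float)
out = wma_gin_layer(h, adj, agg_logits=np.zeros(3), w1=np.eye(2), w2=np.eye(2))

# A fully connected adjacency (every pair of nodes linked) is one way to read
# a "Full-Adjacent" layer; this interpretation is also an assumption.
fa_adj = np.ones((3, 3)) - np.eye(3)
out_fa = wma_gin_layer(out, fa_adj, agg_logits=np.zeros(3), w1=np.eye(2), w2=np.eye(2))
```

Mixing several aggregators gives the layer more than one view of a neighbourhood: a plain sum cannot tell a few large features from many small ones, whereas mean and max respond differently to those two cases. In a trained model the mixing weights would be learned rather than fixed as they are here.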

research
08/21/2022

Representation Learning with Graph Neural Networks for Speech Emotion Recognition

Learning expressive representation is crucial in deep learning. In speec...
research
10/31/2022

Multilingual Speech Emotion Recognition With Multi-Gating Mechanism and Neural Architecture Search

Speech emotion recognition (SER) classifies audio into emotion categorie...
research
08/30/2019

DialogueGCN: A Graph Convolutional Neural Network for Emotion Recognition in Conversation

Emotion recognition in conversation (ERC) has received much attention, l...
research
06/05/2018

Attention Based Fully Convolutional Network for Speech Emotion Recognition

Speech emotion recognition is a challenging task for three main reasons:...
research
10/23/2019

Speech Emotion Recognition via Contrastive Loss under Siamese Networks

Speech emotion recognition is an important aspect of human-computer inte...
research
07/06/2022

GraphCFC: A Directed Graph based Cross-modal Feature Complementation Approach for Multimodal Conversational Emotion Recognition

Emotion Recognition in Conversation (ERC) plays a significant part in Hu...
