GM-TCNet: Gated Multi-scale Temporal Convolutional Network using Emotion Causality for Speech Emotion Recognition

10/28/2022
by Jia-Xin Ye, et al.

In human-computer interaction, Speech Emotion Recognition (SER) plays an essential role in understanding the user's intent and improving the interactive experience. Utterances that express similar emotions vary widely in speaker characteristics yet share common antecedents and consequences, so an essential challenge for SER is how to produce robust and discriminative representations by exploiting the causality between speech emotions. In this paper, we propose a Gated Multi-scale Temporal Convolutional Network (GM-TCNet), built around a novel emotional causality representation learning component with a multi-scale receptive field. This component, constructed from dilated causal convolution layers and a gating mechanism, captures the dynamics of emotion across the time domain. In addition, skip connections fuse high-level features from different gated convolution blocks to capture abundant and subtle emotion changes in human speech. GM-TCNet takes a single type of feature, mel-frequency cepstral coefficients (MFCCs), as input and passes it through the gated temporal convolutional module to generate high-level features. Finally, these features are fed to the emotion classifier to accomplish the SER task. Experimental results show that our model achieves the highest performance in most cases compared with state-of-the-art techniques.
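
The pipeline described in the abstract (MFCC frames, gated dilated causal convolution blocks, skip-connection fusion, emotion classifier) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The tanh/sigmoid gate, the summation of skip outputs, the mean pooling over time, the kernel size, the dilation rates, and every function and parameter name below are assumptions chosen for illustration; the abstract does not specify them.

```python
import numpy as np

def causal_dilated_conv(x, w, dilation):
    """Dilated causal 1-D convolution.
    x: (T, C_in) frame sequence, w: (K, C_in, C_out) kernel.
    Output frame t depends only on x[t], x[t - dilation], x[t - 2*dilation], ...
    """
    T, c_in = x.shape
    K, _, c_out = w.shape
    pad = (K - 1) * dilation
    # Left-pad with zeros so that no output frame can see future inputs.
    xp = np.concatenate([np.zeros((pad, c_in)), x], axis=0)
    y = np.zeros((T, c_out))
    for k in range(K):
        # Tap k looks back (K - 1 - k) * dilation frames.
        y += xp[k * dilation : k * dilation + T] @ w[k]
    return y

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gated_block(x, w_filter, w_gate, dilation):
    # Assumed gate: tanh "filter" branch modulated by a sigmoid "gate" branch.
    f = np.tanh(causal_dilated_conv(x, w_filter, dilation))
    g = sigmoid(causal_dilated_conv(x, w_gate, dilation))
    return f * g

def init_params(n_mfcc, channels, n_classes, dilations, kernel=2, seed=0):
    # Hypothetical random initialisation, one (filter, gate) pair per block.
    rng = np.random.default_rng(seed)
    params, c_in = [], n_mfcc
    for _ in dilations:
        w_f = rng.normal(scale=0.1, size=(kernel, c_in, channels))
        w_g = rng.normal(scale=0.1, size=(kernel, c_in, channels))
        params.append((w_f, w_g))
        c_in = channels
    w_out = rng.normal(scale=0.1, size=(channels, n_classes))
    return params, w_out

def forward(mfcc, params, w_out, dilations):
    """mfcc: (T, n_mfcc). Returns class probabilities of shape (n_classes,)."""
    h, skip = mfcc, 0.0
    for (w_f, w_g), d in zip(params, dilations):
        h = gated_block(h, w_f, w_g, d)
        skip = skip + h              # fuse features from every block (assumed: sum)
    pooled = skip.mean(axis=0)       # assumed: average over time
    logits = pooled @ w_out
    e = np.exp(logits - logits.max())  # numerically stable softmax
    return e / e.sum()

# Usage with random stand-in data (not real MFCCs).
dilations = (1, 2, 4, 8)
params, w_out = init_params(n_mfcc=39, channels=16, n_classes=7, dilations=dilations)
mfcc = np.random.default_rng(1).normal(size=(100, 39))
probs = forward(mfcc, params, w_out, dilations)
```

Doubling the dilation in each block widens the receptive field exponentially with depth, and fusing the outputs of all blocks lets the classifier see short-range and long-range context at once, which is one way to read "multi-scale receptive field" here.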


research
11/14/2022

Temporal Modeling Matters: A Novel Temporal Emotional Modeling Approach for Speech Emotion Recognition

Speech emotion recognition (SER) plays a vital role in improving the int...
research
12/23/2019

Learning Transferable Features for Speech Emotion Recognition

Emotion recognition from speech is one of the key steps towards emotiona...
research
07/12/2017

A breakthrough in Speech emotion recognition using Deep Retinal Convolution Neural Networks

Speech emotion recognition (SER) is to study the formation and change of...
research
03/11/2021

Affect2MM: Affective Analysis of Multimedia Content Using Emotion Causality

We present Affect2MM, a learning method for time-series emotion predicti...
research
04/07/2021

TSception: Capturing Temporal Dynamics and Spatial Asymmetry from EEG for Emotion Recognition

In this paper, we propose TSception, a multi-scale convolutional neural ...
research
07/18/2022

CTL-MTNet: A Novel CapsNet and Transfer Learning-Based Mixed Task Net for the Single-Corpus and Cross-Corpus Speech Emotion Recognition

Speech Emotion Recognition (SER) has become a growing focus of research ...
research
03/14/2023

A Hierarchical Regression Chain Framework for Affective Vocal Burst Recognition

As a common way of emotion signaling via non-linguistic vocalizations, v...
