Few-shot Learning in Emotion Recognition of Spontaneous Speech Using a Siamese Neural Network with Adaptive Sample Pair Formation

09/07/2021
by Kexin Feng, et al.

Speech-based machine learning (ML) has been heralded as a promising solution for tracking prosodic and spectrotemporal patterns in real life that are indicative of emotional changes, providing a valuable window into one's cognitive and mental state. Yet the scarcity of labelled data in ambulatory studies prevents the reliable training of ML models, which usually rely on "data-hungry" distribution-based learning. Leveraging the abundance of labelled speech data from acted emotions, this paper proposes a few-shot learning approach for automatically recognizing emotion in spontaneous speech from a small number of labelled samples. Few-shot learning is implemented as metric learning with a siamese neural network, which models the relative distance between samples rather than learning the absolute patterns of each emotion's distribution. Results indicate the feasibility of the proposed metric learning in recognizing emotions from spontaneous speech in four datasets, even with a small number of labelled samples. They further demonstrate superior performance of the proposed metric learning compared to commonly used adaptation methods, including network fine-tuning and adversarial learning. Findings from this work provide a foundation for the ambulatory tracking of human emotion in spontaneous speech, contributing to the real-life assessment of mental health degradation.
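
To make the idea of modelling relative distance between samples concrete, the following is a minimal sketch of generic siamese metric learning. It is not the authors' implementation: the abstract does not specify the network architecture, the loss, or the adaptive sample pair formation, so the single-layer embedding, the standard contrastive loss, the nearest-neighbour prediction rule, and all names and dimensions below are assumptions chosen for illustration.

```python
import numpy as np

def embed(x, W):
    """Shared (siamese) branch: the same weights W are applied to every input."""
    return np.tanh(x @ W)

def pair_distance(x1, x2, W):
    """Euclidean distance between the embeddings of two samples (or batches)."""
    return np.linalg.norm(embed(x1, W) - embed(x2, W), axis=-1)

def contrastive_loss(d, same, margin=1.0):
    """Generic contrastive loss (an assumption, not the paper's loss):
    same-emotion pairs are pulled together, different-emotion pairs are
    pushed at least `margin` apart."""
    same = np.asarray(same, dtype=float)
    pull = same * d ** 2
    push = (1.0 - same) * np.maximum(margin - d, 0.0) ** 2
    return float(np.mean(pull + push))

def predict(query, support_x, support_y, W):
    """Label a query with the emotion of its nearest labelled support sample."""
    d = pair_distance(query[None, :], support_x, W)
    return support_y[int(np.argmin(d))]

# Hypothetical toy data: 8-dim acoustic features, 4-dim embedding,
# and a support set of 6 labelled samples covering 3 emotion classes.
rng = np.random.default_rng(0)
W = rng.normal(size=(8, 4))
support_x = rng.normal(size=(6, 8))
support_y = np.array([0, 0, 1, 1, 2, 2])
```

In a few-shot setting of this kind, the weights `W` would be trained on pairs drawn from the large acted-speech corpus, and the handful of labelled spontaneous samples would serve as the support set at prediction time. Because the model only compares samples, it does not need enough target data to estimate a full per-emotion distribution.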


research
10/27/2022

A few-shot learning approach with domain adaptation for personalized real-life stress detection in close relationships

We design a metric learning approach that aims to address computational ...
research
06/04/2020

A Siamese Neural Network with Modified Distance Loss For Transfer Learning in Speech Emotion Recognition

Automatic emotion recognition plays a significant role in the process of...
research
10/28/2021

End-to-End Speech Emotion Recognition: Challenges of Real-Life Emergency Call Centers Data Recordings

Recognizing a speaker's emotion from their speech can be a key element i...
research
10/28/2020

Speech-Based Emotion Recognition using Neural Networks and Information Visualization

Emotions recognition is commonly employed for health assessment. However...
research
01/05/2021

Fixed-MAML for Few Shot Classification in Multilingual Speech Emotion Recognition

In this paper, we analyze the feasibility of applying few-shot learning ...
research
11/30/2021

Affect-DML: Context-Aware One-Shot Recognition of Human Affect using Deep Metric Learning

Human affect recognition is a well-established research area with numero...
research
03/28/2022

Continuous Metric Learning For Transferable Speech Emotion Recognition and Embedding Across Low-resource Languages

Speech emotion recognition (SER) refers to the technique of inferring th...
