Fixed-MAML for Few Shot Classification in Multilingual Speech Emotion Recognition

by Anugunj Naman, et al.

In this paper, we analyze the feasibility of applying few-shot learning to the speech emotion recognition (SER) task. Current speech emotion recognition models work exceptionally well but fail when the input is multilingual. Moreover, such models perform adequately only when the training corpus is vast. The need for a large training corpus is a significant problem for languages that are less widely spoken or obscure. We attempt to address both multilingualism and the lack of available data by recasting SER as a few-shot learning problem. We suggest relaxing the assumption that all N classes in an N-way K-shot problem be new, and define an N+F way problem, where N and F are the number of emotion classes and predefined fixed classes, respectively. We apply this modification to the Model-Agnostic Meta-Learning (MAML) algorithm and call the resulting model F-MAML. This modification performs better than the original MAML on the EmoFilm dataset.
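To make the N+F way setup concrete, here is a minimal sketch of how one episode might be sampled. It rests on an assumption the abstract does not confirm: that the F fixed classes appear in every episode while the N remaining classes are drawn fresh each time. The function name `sample_episode`, the toy class labels, and the clip identifiers are all hypothetical and chosen only for illustration. This is not the authors' implementation.

```python
import random
from typing import Dict, List, Tuple

Sample = Tuple[str, str]  # (utterance_id, class_label)


def sample_episode(
    pool: Dict[str, List[str]],
    fixed_classes: List[str],
    n_new: int,
    k_shot: int,
    n_query: int,
    rng: random.Random,
) -> Tuple[List[Sample], List[Sample]]:
    """Sample one (N+F)-way K-shot episode.

    Assumption (not from the paper): the F fixed classes are present in
    every episode, and N further classes are drawn at random per episode.
    """
    # Classes eligible to be "new" in this episode.
    candidates = sorted(c for c in pool if c not in fixed_classes)
    new_classes = rng.sample(candidates, n_new)
    episode_classes = list(fixed_classes) + new_classes

    support: List[Sample] = []
    query: List[Sample] = []
    for label in episode_classes:
        # Draw disjoint support and query items for this class.
        items = rng.sample(pool[label], k_shot + n_query)
        support += [(x, label) for x in items[:k_shot]]
        query += [(x, label) for x in items[k_shot:]]
    return support, query


# Toy data: six hypothetical classes with ten utterance ids each.
labels = ["neutral", "anger", "fear", "happiness", "sadness", "contempt"]
pool = {c: [f"{c}_{i:02d}" for i in range(10)] for c in labels}

support, query = sample_episode(
    pool, fixed_classes=["neutral"], n_new=3, k_shot=2, n_query=3,
    rng=random.Random(0),
)
episode_labels = {label for _, label in support}
```

With N = 3, F = 1, and K = 2, the support set holds (N + F) x K = 8 utterances, and the fixed class is guaranteed to be among the episode's labels. A MAML-style inner loop would then adapt on `support` and be evaluated on `query`.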




