Automatic Analysis of the Emotional Content of Speech in Daylong Child-Centered Recordings from a Neonatal Intensive Care Unit

06/14/2021
by   Einari Vaaras, et al.

Researchers have recently started to study how the emotional speech heard by young infants can affect their developmental outcomes. As a part of this research, hundreds of hours of daylong recordings from preterm infants' audio environments were collected from two hospitals in Finland and Estonia in the context of the so-called APPLE study. In order to analyze the emotional content of speech in such a massive dataset, an automatic speech emotion recognition (SER) system is required. However, there are no emotion labels or existing in-domain SER systems to be used for this purpose. In this paper, we introduce this initially unannotated large-scale real-world audio dataset and describe the development of a functional SER system for the Finnish subset of the data. We explore the effectiveness of alternative state-of-the-art techniques to deploy a SER system to a new domain, comparing cross-corpus generalization, WGAN-based domain adaptation, and active learning in the task. As a result, we show that the best-performing models are able to achieve a classification performance of 73.4 for valence and arousal, respectively. The results also show that active learning achieves the most consistent performance compared to the two alternatives.
