Multitask vocal burst modeling with ResNets and pre-trained paralinguistic Conformers

06/24/2022
by Josh Belanich, et al.
This technical report presents the modeling approaches used in our submission to the ICML Expressive Vocalizations Workshop & Competition multitask track (ExVo-MultiTask). We first applied image classification models of various sizes to mel-spectrogram representations of the vocal bursts, as is standard in the sound event detection literature. Results from these models show an increase of 21.24% over the baseline system with respect to the harmonic mean of the task metrics, and comprise our team's main submission to the MultiTask track. We then sought to characterize the headroom in the MultiTask track by applying a large pre-trained Conformer model that previously achieved state-of-the-art results on paralinguistic tasks such as speech emotion recognition and mask detection. We additionally investigated the relationship between the sub-tasks of emotional expression, country of origin, and age prediction, and found that the best-performing models are trained as single-task models, which calls into question whether the problem truly benefits from a multitask setting.

research
06/25/2022

Self-supervision and Learnable STRFs for Age, Emotion, and Country Prediction

This work presents a multitask approach to the simultaneous estimation o...
research
02/11/2021

Disentanglement for audio-visual emotion recognition using multitask setup

Deep learning models trained on audio-visual data have been successfully...
research
05/06/2020

Multitask Models for Supervised Protests Detection in Texts

The CLEF 2019 ProtestNews Lab tasks participants to identify text relati...
research
01/14/2022

Polarity and Subjectivity Detection with Multitask Learning and BERT Embedding

Multitask learning often helps improve the performance of related tasks ...
research
06/09/2020

Learning Functions to Study the Benefit of Multitask Learning

We study and quantify the generalization patterns of multitask learning ...
research
01/24/2019

Extracting PICO elements from RCT abstracts using 1-2gram analysis and multitask classification

The core of evidence-based medicine is to read and analyze numerous pape...
research
04/01/2021

StyleML: Stylometry with Structure and Multitask Learning for Darkweb Markets

Darknet market forums are frequently used to exchange illegal goods and ...
