TrustSER: On the Trustworthiness of Fine-tuning Pre-trained Speech Embeddings For Speech Emotion Recognition

05/18/2023
by   Tiantian Feng, et al.

Recent studies have explored the use of pre-trained embeddings for speech emotion recognition (SER), achieving performance comparable to conventional methods that rely on low-level, knowledge-inspired acoustic features. These embeddings are often generated by models trained on large-scale speech datasets with self-supervised or weakly-supervised learning objectives. Despite the significant advances in SER enabled by pre-trained embeddings, the trustworthiness of these methods remains poorly understood: privacy breaches, unfair performance, vulnerability to adversarial attacks, and computational cost may all hinder the real-world deployment of these systems. In response, we introduce TrustSER, a general framework designed to evaluate the trustworthiness of SER systems using deep learning methods, with a focus on privacy, safety, fairness, and sustainability, offering insights for future research in SER. Our code is publicly available at https://github.com/usc-sail/trust-ser.

research
06/08/2023

PEFT-SER: On the Use of Parameter Efficient Transfer Learning Approaches For Speech Emotion Recognition Using Pre-trained Speech Models

Many recent studies have focused on fine-tuning pre-trained models for s...
research
01/30/2021

LSSED: a large-scale dataset and benchmark for speech emotion recognition

Speech emotion recognition is a vital contributor to the next generation...
research
02/07/2022

Speech Emotion Recognition using Self-Supervised Features

Self-supervised pre-trained features have consistently delivered state-o...
research
02/26/2023

Mingling or Misalignment? Temporal Shift for Speech Emotion Recognition with Pre-trained Representations

Fueled by recent advances of self-supervised models, pre-trained speech ...
research
09/07/2023

LanSER: Language-Model Supported Speech Emotion Recognition

Speech emotion recognition (SER) models typically rely on costly human-l...
research
03/31/2022

CTA-RNN: Channel and Temporal-wise Attention RNN Leveraging Pre-trained ASR Embeddings for Speech Emotion Recognition

Previous research has looked into ways to improve speech emotion recogni...
research
04/05/2022

Learning Speech Emotion Representations in the Quaternion Domain

The modeling of human emotion expression in speech signals is an importa...
