Transfer Learning based Speech Affect Recognition in Urdu

03/05/2021
by Sara Durrani, et al.

Speech Affect Recognition for low-resource languages is known to be a difficult task. We present a transfer-learning-based Speech Affect Recognition approach in which we pre-train a model on an affect recognition task in a high-resource language and fine-tune its parameters for a low-resource language using a Deep Residual Network. We use four standard data sets to demonstrate that transfer learning can address the problem of data scarcity in affect recognition. Our approach achieves 74.7 percent Unweighted Average Recall (UAR) with RAVDESS as the source and the Urdu data set as the target. Through an ablation study, we find that the pre-trained model contributes most of the feature information, improves the results, and mitigates the limited-data problem. Using this knowledge, we also experiment with the SAVEE and EMO-DB data sets, with Urdu as the target language, for which only 400 utterances are available. The approach achieves a high UAR compared with existing algorithms.
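Because the results are reported as UAR, it may help to see how that metric differs from plain accuracy. The following is a minimal sketch, not the authors' evaluation code: the function name, the emotion labels, and the toy predictions are all illustrative assumptions. UAR is the mean of the per-class recalls, so a small class weighs as much as a large one.

```python
from collections import defaultdict


def unweighted_average_recall(y_true, y_pred):
    """Return the mean of per-class recalls (each class counts equally)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one sample is required")

    total = defaultdict(int)    # number of true samples per class
    correct = defaultdict(int)  # number of correctly predicted samples per class
    for truth, pred in zip(y_true, y_pred):
        total[truth] += 1
        if truth == pred:
            correct[truth] += 1

    recalls = [correct[label] / total[label] for label in total]
    return sum(recalls) / len(recalls)


# Hypothetical, imbalanced toy data (not taken from any of the data sets above).
y_true = ["angry"] * 6 + ["happy"] * 2 + ["sad"] * 2
y_pred = ["angry"] * 6 + ["happy", "angry"] + ["angry", "angry"]

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
uar = unweighted_average_recall(y_true, y_pred)
print(f"accuracy={accuracy:.2f}  UAR={uar:.2f}")
```

In this toy case the classifier favors the majority class, so accuracy (7 of 10 correct) looks better than UAR (per-class recalls of 1.0, 0.5, and 0.0 averaged), which is why UAR is a common choice for small, imbalanced emotion corpora.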


