Towards Code-switched Classification Exploiting Constituent Language Resources

11/03/2020
by Tanvi Dadu, et al.

Code-switching is a commonly observed communicative phenomenon denoting a shift from one language to another within the same speech exchange. Analyzing code-switched data is often an arduous task owing to the limited availability of data. In this work, we propose converting code-switched data into its constituent high-resource languages so that both monolingual and cross-lingual settings can be exploited. This conversion allows us to utilize the greater resource availability of the constituent languages for multiple downstream tasks. We perform experiments on two downstream tasks, sarcasm detection and hate speech detection, in the English-Hindi code-switched setting. These experiments show an increase in 22 [...] speech detection, respectively, compared to the state-of-the-art.

research
05/21/2020

Cross-lingual Multispeaker Text-to-Speech under Limited-Data Scenario

Modeling voices for multiple speakers and multiple languages in one text...
research
09/06/2023

On the Challenges of Building Datasets for Hate Speech Detection

Detection of hate speech has been formulated as a standalone application...
research
08/06/2021

Cross-lingual Capsule Network for Hate Speech Detection in Social Media

Most hate speech detection research focuses on a single language, genera...
research
04/29/2020

Meta-Transfer Learning for Code-Switched Speech Recognition

An increasing number of people in the world today speak a mixed-language...
research
10/26/2022

Bloom Library: Multimodal Datasets in 300+ Languages for a Variety of Downstream Tasks

We present Bloom Library, a linguistically diverse set of multimodal and...
research
06/16/2022

XLCoST: A Benchmark Dataset for Cross-lingual Code Intelligence

Recent advances in machine learning have significantly improved the unde...
research
06/13/2019

Improved Sentiment Detection via Label Transfer from Monolingual to Synthetic Code-Switched Text

Multilingual writers and speakers often alternate between two languages ...
