WADER at SemEval-2023 Task 9: A Weak-labelling framework for Data augmentation in tExt Regression Tasks

03/05/2023
by Manan Suri, et al.

Intimacy is an essential element of human relationships, and language is a crucial means of conveying it. Textual intimacy analysis can reveal social norms in different contexts and serve as a benchmark for testing how well computational models understand social information. In this paper, we propose WADER, a novel weak-labeling strategy for data augmentation in text regression tasks. WADER uses data augmentation to address data imbalance and data scarcity, and it provides a method for data augmentation in cross-lingual, zero-shot tasks. We benchmark state-of-the-art pre-trained multilingual language models using WADER, and we analyze how sampling techniques can mitigate bias in the data and select augmentation candidates optimally. Our results show that WADER outperforms the baseline model and provides a direction for mitigating data imbalance and scarcity in text regression tasks.
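The abstract does not spell out the mechanics, so the following is only a minimal sketch of the general idea of weak labelling for augmentation in a regression setting, not the authors' implementation. It assumes scores lie in a 1 to 5 range, uses equal-width bins to find under-represented score regions, and labels each augmented text with a model prediction. The names `bin_index`, `select_candidates`, and `weak_label`, and the `augment` and `predict` callables, are all hypothetical.

```python
import random


def bin_index(score, lo, hi, n_bins):
    """Map a continuous score to one of n_bins equal-width bins."""
    if score >= hi:
        return n_bins - 1
    return max(0, int((score - lo) / (hi - lo) * n_bins))


def select_candidates(scores, lo=1.0, hi=5.0, n_bins=4, seed=0):
    """Pick indices to augment so each non-empty bin reaches the largest bin's size.

    Assumption: sampling with replacement inside a bin is acceptable, because
    each pick is later turned into a different augmented text.
    """
    rng = random.Random(seed)
    bins = {}
    for i, s in enumerate(scores):
        bins.setdefault(bin_index(s, lo, hi, n_bins), []).append(i)
    target = max(len(idxs) for idxs in bins.values())
    chosen = []
    for _, idxs in sorted(bins.items()):
        deficit = target - len(idxs)
        chosen.extend(rng.choice(idxs) for _ in range(deficit))
    return chosen


def weak_label(augment, predict, texts, candidates):
    """Augment each chosen text and attach a model prediction as its weak label.

    `augment` (e.g. back-translation) and `predict` (a regressor trained on
    the gold data) are placeholders supplied by the caller.
    """
    out = []
    for i in candidates:
        new_text = augment(texts[i])
        out.append((new_text, predict(new_text)))
    return out
```

In use, `select_candidates` would be called on the gold scores, and the resulting weakly labelled pairs from `weak_label` would be appended to the training set before fine-tuning again. Equal-width binning is just one simple choice here; other sampling schemes could be swapped in.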

research
08/31/2021

Cross-Lingual Text Classification of Transliterated Hindi and Malayalam

Transliteration is very common on social media, but transliterated text ...
research
04/28/2020

MultiMix: A Robust Data Augmentation Strategy for Cross-Lingual NLP

Transfer learning has yielded state-of-the-art results in many supervise...
research
10/01/2022

MALM: Mixing Augmented Language Modeling for Zero-Shot Machine Translation

Large pre-trained language models have brought remarkable progress in NL...
research
06/11/2020

CoSDA-ML: Multi-Lingual Code-Switching Data Augmentation for Zero-Shot Cross-Lingual NLP

Multi-lingual contextualized embeddings, such as multilingual-BERT (mBER...
research
08/04/2020

NLPDove at SemEval-2020 Task 12: Improving Offensive Language Detection with Cross-lingual Transfer

This paper describes our approach to the task of identifying offensive l...
research
01/20/2023

Data Augmentation for Modeling Human Personality: The Dexter Machine

Modeling human personality is important for several AI challenges, from ...
research
06/21/2022

KnowDA: All-in-One Knowledge Mixture Model for Data Augmentation in Few-Shot NLP

This paper focuses on text data augmentation for few-shot NLP tasks. The...
