BEIKE NLP at SemEval-2022 Task 4: Prompt-Based Paragraph Classification for Patronizing and Condescending Language Detection

08/02/2022
by   Yong Deng, et al.

The PCL detection task aims to identify and categorize language that is patronizing or condescending towards vulnerable communities in the general media. Compared to other paragraph-classification tasks in NLP, the negative language in PCL detection is usually more implicit and subtle, and therefore harder to recognize, so common text-classification approaches perform disappointingly. In this paper, we introduce our team's solution to the PCL detection problem in SemEval-2022 Task 4, which exploits the power of prompt-based learning for paragraph classification. We reformulate the task as an appropriate cloze prompt and use pre-trained masked language models to fill the cloze slot. For the two subtasks, binary classification and multi-label classification, we adopt the DeBERTa model and fine-tune it to predict the masked label words of task-specific prompts. On the evaluation dataset, our approach achieves an F1-score of 0.6406 for binary classification; for multi-label classification, it achieves a macro-F1-score of 0.4689 and ranks first on the leaderboard.
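The cloze reformulation described in the abstract can be illustrated with a minimal sketch. The template wording, the label words "yes"/"no", and the names `build_prompt`, `classify`, and `toy_scorer` are all assumptions made for illustration; the abstract does not state the actual prompts or label words. The masked language model is represented by a hypothetical callable that returns a score per candidate word for the masked slot, so that the sketch runs without downloading a model.

```python
from typing import Callable, Dict

# Hypothetical template and label words (not taken from the paper).
TEMPLATE = "{paragraph} Is this paragraph patronizing? [MASK]."
VERBALIZER: Dict[str, int] = {"yes": 1, "no": 0}  # label word -> class id


def build_prompt(paragraph: str, template: str = TEMPLATE) -> str:
    """Wrap the input paragraph in a cloze template with one masked slot."""
    return template.format(paragraph=paragraph.strip())


def classify(paragraph: str,
             mask_scorer: Callable[[str], Dict[str, float]]) -> int:
    """Pick the class whose label word scores highest in the masked slot.

    `mask_scorer` stands in for a fine-tuned masked language model: it
    takes the prompt and returns a score for each candidate word.
    """
    scores = mask_scorer(build_prompt(paragraph))
    best_word = max(VERBALIZER, key=lambda w: scores.get(w, float("-inf")))
    return VERBALIZER[best_word]


def toy_scorer(prompt: str) -> Dict[str, float]:
    """Toy stand-in for a masked language model, for demonstration only."""
    if "poor souls" in prompt.lower():
        return {"yes": 0.8, "no": 0.2}
    return {"yes": 0.1, "no": 0.9}


if __name__ == "__main__":
    print(build_prompt("These poor souls need our help."))
    print(classify("These poor souls need our help.", toy_scorer))
```

In a real system the toy scorer would be replaced by a masked language model such as DeBERTa, restricted to the label words at the mask position; for the multi-label subtask, one plausible extension is one prompt per category, though the abstract does not specify this.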


