DoDo Learning: DOmain-DemOgraphic Transfer in Language Models for Detecting Abuse Targeted at Public Figures

07/31/2023
by Hannah Rose Kirk, et al.

Public figures receive a disproportionate amount of abuse on social media, which impacts their active participation in public life. Automated systems can identify abuse at scale, but labelling training data is expensive, complex and potentially harmful. It is therefore desirable that systems are efficient and generalisable, handling both shared and specific aspects of online abuse. We explore the dynamics of cross-group text classification to understand how well classifiers trained on one domain or demographic transfer to others, with a view to building more generalisable abuse classifiers. We fine-tune language models to classify tweets targeted at public figures across DOmains (sport and politics) and DemOgraphics (women and men) using our novel DODO dataset, which contains 28,000 labelled entries split equally across four domain-demographic pairs. We find that (i) small amounts of diverse data are hugely beneficial to generalisation and model adaptation; (ii) models transfer more easily across demographics, but models trained on cross-domain data are more generalisable; (iii) some groups contribute more to generalisability than others; and (iv) dataset similarity is a signal of transferability.
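
The cross-group setup described in the abstract (train on one domain-demographic pair, evaluate on every pair) can be illustrated with a minimal sketch. This is not the authors' code: the `transfer_matrix` helper, the data layout, and the majority-class stand-in for a fine-tuned language model are all hypothetical, chosen only to keep the example self-contained. Only the four group names come from the abstract.

```python
from typing import Callable, Dict, List, Tuple

Example = Tuple[str, int]          # (text, label); 1 = abusive, 0 = not abusive
Group = Tuple[str, str]            # (domain, demographic)
Model = Callable[[str], int]       # maps a text to a predicted label

# The four domain-demographic pairs named in the abstract.
GROUPS: List[Group] = [
    ("sport", "women"),
    ("sport", "men"),
    ("politics", "women"),
    ("politics", "men"),
]


def train_majority(train: List[Example]) -> Model:
    """Hypothetical stand-in for fine-tuning: always predict the majority label."""
    positives = sum(label for _, label in train)
    majority = 1 if positives * 2 >= len(train) else 0
    return lambda text: majority


def accuracy(model: Model, test: List[Example]) -> float:
    """Fraction of test examples whose predicted label matches the gold label."""
    return sum(model(text) == label for text, label in test) / len(test)


def transfer_matrix(
    data: Dict[Group, Dict[str, List[Example]]],
    train_fn: Callable[[List[Example]], Model],
    score_fn: Callable[[Model, List[Example]], float],
) -> Dict[Tuple[Group, Group], float]:
    """Train once per source group, then score that model on every target group."""
    results: Dict[Tuple[Group, Group], float] = {}
    for source in GROUPS:
        model = train_fn(data[source]["train"])
        for target in GROUPS:
            results[(source, target)] = score_fn(model, data[target]["test"])
    return results
```

Entries where source and target match give in-group performance, and the remaining entries give transfer performance. In practice `train_fn` would wrap a real fine-tuning routine and `score_fn` a metric suited to imbalanced labels; both are left as parameters here because the abstract does not specify them.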

research
06/04/2023

Exposing Bias in Online Communities through Large-Scale Language Models

Progress in natural language generation research has been shaped by the ...
research
07/04/2023

Robust Hate Speech Detection in Social Media: A Cross-Dataset Empirical Evaluation

The automatic detection of hate speech online is an active research area...
research
12/15/2021

Cross-Domain Generalization and Knowledge Transfer in Transformers Trained on Legal Data

We analyze the ability of pre-trained language models to transfer knowle...
research
03/02/2022

Large-Scale Hate Speech Detection with Cross-Domain Transfer

The performance of hate speech detection models relies on the datasets o...
research
10/07/2018

Geocoding Without Geotags: A Text-based Approach for reddit

In this paper, we introduce the first geolocation inference approach for...
research
12/31/2020

The Pile: An 800GB Dataset of Diverse Text for Language Modeling

Recent work has demonstrated that increased training dataset diversity i...
research
02/13/2023

Towards Agile Text Classifiers for Everyone

Text-based safety classifiers are widely used for content moderation and...
