Beyond Toxic: Toxicity Detection Datasets are Not Enough for Brand Safety

03/27/2023
by Elizaveta Korotkova, et al.

The rapid growth of user-generated content on social media has significantly increased the demand for automated content moderation. Various methods and frameworks have been proposed for hate speech detection and toxic comment classification. In this work, we combine common datasets to extend these tasks to brand safety. Brand safety aims to protect commercial branding by identifying contexts where advertisements should not appear; it covers not only toxicity but also other potentially harmful content. Because these datasets use different label sets, we approach the overall problem as a binary classification task. By applying common toxicity detection datasets to a subset of brand safety, we demonstrate the need for brand-safety-specific datasets, and we empirically analyze the effects of weighted sampling strategies in text classification.
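To make the setup in the abstract concrete, here is a minimal Python sketch of the two ideas it names: collapsing heterogeneous label sets into one binary label, and drawing training examples with class-based weights. Everything specific in it is an assumption made for illustration: the toy datasets, the `UNSAFE_LABELS` vocabulary, the helper names, and the inverse-class-frequency weighting, which is only one possible weighted sampling strategy and is not necessarily the one the authors evaluate.

```python
import random
from collections import Counter

# Hypothetical set of source labels treated as "unsafe"; the actual
# datasets and label vocabularies are not listed in the abstract.
UNSAFE_LABELS = {"toxic", "hate", "offensive", "spam"}


def to_binary(label: str) -> int:
    """Collapse a dataset-specific label to 1 (unsafe) or 0 (safe)."""
    return 1 if label.lower() in UNSAFE_LABELS else 0


def merge_datasets(datasets):
    """Concatenate (text, label) pairs from several datasets under one binary label."""
    return [(text, to_binary(label)) for data in datasets for text, label in data]


def inverse_frequency_weights(examples):
    """Weight each example by 1 / (size of its class), so each class has total weight 1."""
    counts = Counter(y for _, y in examples)
    return [1.0 / counts[y] for _, y in examples]


def weighted_sample(examples, k, seed=0):
    """Draw k examples with replacement, using the class-balancing weights."""
    rng = random.Random(seed)
    weights = inverse_frequency_weights(examples)
    return rng.choices(examples, weights=weights, k=k)


# Toy data standing in for two datasets with different label sets.
dataset_a = [("you are awful", "toxic"), ("nice photo", "none"), ("great game", "none")]
dataset_b = [("buy pills now", "spam"), ("see you soon", "ham"),
             ("lovely day", "ham"), ("thanks a lot", "ham")]

merged = merge_datasets([dataset_a, dataset_b])
weights = inverse_frequency_weights(merged)
batch = weighted_sample(merged, k=1000)
```

With these weights, the minority (unsafe) class is drawn about as often as the majority class in expectation, even though it makes up only two of the seven merged examples.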

