Reducing Unintended Identity Bias in Russian Hate Speech Detection

10/22/2020
by   Nadezhda Zueva, et al.

Toxicity has become a grave problem for many online communities and has been growing across many languages, including Russian. Hate speech creates an environment of intimidation and discrimination, and can even incite real-world violence. Both researchers and social platforms have been developing models to detect toxicity in online communication for some time. A common problem with these models is bias towards certain words (e.g. woman, black, jew) that are not toxic in themselves but act as triggers for the classifier due to model shortcomings. In this paper, we describe our efforts towards classifying hate speech in Russian, and propose simple techniques for reducing unintended bias, such as generating training data with language models using terms and words related to protected identities as context, and applying word dropout to such words.
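The word-dropout idea from the abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation: the identity-term list and dropout probability are hypothetical placeholders (the paper works with Russian terms), and real pipelines would apply this during training-batch construction.

```python
import random

# Hypothetical list of protected-identity terms (illustrative only;
# the paper uses Russian-language terms related to protected identities).
IDENTITY_TERMS = {"woman", "black", "jew"}

def identity_word_dropout(tokens, p=0.5, seed=None):
    """Randomly drop protected-identity tokens with probability p,
    so the classifier cannot rely on them as toxicity triggers."""
    rng = random.Random(seed)
    return [
        t for t in tokens
        if t.lower() not in IDENTITY_TERMS or rng.random() >= p
    ]

tokens = "the woman said hello".split()
# With p=1.0 every identity term is dropped; with p=0.0 none are.
print(identity_word_dropout(tokens, p=1.0))  # ['the', 'said', 'hello']
```

Dropping identity terms only some of the time (e.g. p=0.5) lets the model still see them in genuinely toxic and non-toxic contexts while weakening the spurious correlation between the term itself and the toxic label.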


research
09/21/2019

Empirical Analysis of Multi-Task Learning for Reducing Model Bias in Toxic Comment Detection

With the recent rise of toxicity in online conversations on social media...
research
01/15/2020

Stereotypical Bias Removal for Hate Speech Detection Task using Knowledge-based Generalizations

With the ever-increasing cases of hate spread on social media platforms,...
research
06/27/2023

Identity Construction in a Misogynist Incels Forum

Online communities of involuntary celibates (incels) are a prominent sou...
research
06/29/2020

Reading Between the Demographic Lines: Resolving Sources of Bias in Toxicity Classifiers

The censorship of toxic comments is often left to the judgment of imperf...
research
10/31/2018

On The Inductive Bias of Words in Acoustics-to-Word Models

Acoustics-to-word models are end-to-end speech recognizers that use word...
research
09/14/2018

Visual Speech Language Models

Language models (LM) are very powerful in lipreading systems. Language m...
research
11/01/2022

Why Is It Hate Speech? Masked Rationale Prediction for Explainable Hate Speech Detection

In a hate speech detection model, we should consider two critical aspect...
