Fighting Offensive Language on Social Media with Unsupervised Text Style Transfer

We introduce a new approach to tackle the problem of offensive language in online social media. Our approach uses unsupervised text style transfer to translate offensive sentences into non-offensive ones. We propose a new method for training encoder-decoders using non-parallel data that combines a collaborative classifier, attention and the cycle consistency loss. Experimental results on data from Twitter and Reddit show that our method outperforms a state-of-the-art text style transfer system in two out of three quantitative metrics and produces reliable non-offensive transferred sentences.
