Text Detoxification using Large Pre-trained Neural Models

09/18/2021
by David Dale, et al.

We present two novel unsupervised methods for eliminating toxicity in text. Our first method combines two recent ideas: (1) guiding the generation process with small style-conditional language models and (2) using paraphrasing models to perform style transfer. We use a well-performing paraphraser guided by style-trained language models to preserve the content of the text while removing its toxicity. Our second method uses BERT to replace toxic words with their non-offensive synonyms; we make this method more flexible by enabling BERT to replace mask tokens with a variable number of words. Finally, we present the first large-scale comparative study of style transfer models on the task of toxicity removal, comparing our models with a number of existing style transfer methods. The models are evaluated in a reference-free way using a combination of unsupervised style transfer metrics. Both proposed methods yield new state-of-the-art (SOTA) results.
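As a rough sketch of the second method (masking toxic words and letting BERT propose non-offensive substitutes), the snippet below shows the core idea with Hugging Face transformers. It is an illustration, not the authors' implementation: the fixed TOXIC_WORDS lexicon, the simple candidate filter, and the detoxify helper are assumptions made for this example, and unlike the method described in the abstract it fills each mask with exactly one word rather than a variable number.

import torch
from transformers import BertForMaskedLM, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForMaskedLM.from_pretrained("bert-base-uncased")
model.eval()

# Toy lexicon of words to rewrite; a real system would score toxicity with a
# classifier rather than a fixed word list.
TOXIC_WORDS = {"stupid", "idiotic", "dumb"}

def detoxify(sentence: str, top_k: int = 10) -> str:
    """Mask words from the toxic lexicon and let BERT fill them in."""
    tokens = sentence.split()
    masked = [tokenizer.mask_token if t.lower().strip(".,!?") in TOXIC_WORDS else t
              for t in tokens]
    if tokenizer.mask_token not in masked:
        return sentence  # nothing flagged, return the input unchanged

    inputs = tokenizer(" ".join(masked), return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits[0]

    input_ids = inputs["input_ids"][0]
    mask_positions = (input_ids == tokenizer.mask_token_id).nonzero(as_tuple=True)[0]

    for pos in mask_positions:
        # Keep the first of the top-k candidates that is not itself in the
        # toxic lexicon; the paper's method additionally lets BERT replace a
        # single mask with a variable number of words.
        for cand in logits[pos].topk(top_k).indices:
            if tokenizer.decode([int(cand)]).strip().lower() not in TOXIC_WORDS:
                input_ids[pos] = cand
                break

    return tokenizer.decode(input_ids, skip_special_tokens=True)

print(detoxify("that was a stupid idea"))

The first method works differently: a paraphrasing model generates the output, and small style-conditional language models reweight its token probabilities at each decoding step, steering the paraphrase away from toxic wording while preserving the content.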

Related research

05/23/2022

Prompt-and-Rerank: A Method for Zero-Shot and Few-Shot Arbitrary Textual Style Transfer with Small Language Models

We propose a method for arbitrary textual style transfer (TST), the task ...
09/13/2022

Exploring Code Style Transfer with Neural Networks

Style is a significant component of natural language text, reflecting a ...
06/01/2021

Improving Formality Style Transfer with Context-Aware Rule Injection

Models pre-trained on large-scale regular text corpora often do not work...
05/18/2021

LEWIS: Levenshtein Editing for Unsupervised Text Style Transfer

Many types of text style transfer can be achieved with only small, preci...
10/02/2020

Unsupervised Text Style Transfer with Padded Masked Language Models

We propose Masker, an unsupervised text-editing method for style transfe...
08/16/2019

How Sequence-to-Sequence Models Perceive Language Styles?

Style is ubiquitous in our daily language uses, while what is language s...
10/22/2020

Multi-dimensional Style Transfer for Partially Annotated Data using Language Models as Discriminators

Style transfer has been widely explored in natural language generation w...