Text Detoxification using Large Pre-trained Neural Models

by David Dale, et al.

We present two novel unsupervised methods for eliminating toxicity in text. Our first method combines two recent ideas: (1) guidance of the generation process with small style-conditional language models and (2) use of paraphrasing models to perform style transfer. We use a well-performing paraphraser guided by style-trained language models to keep the text content and remove toxicity. Our second method uses BERT to replace toxic words with their non-offensive synonyms. We make the method more flexible by enabling BERT to replace mask tokens with a variable number of words. Finally, we present the first large-scale comparative study of style transfer models on the task of toxicity removal. We compare our models with a number of methods for style transfer. The models are evaluated in a reference-free way using a combination of unsupervised style transfer metrics. Both of our methods yield new state-of-the-art (SOTA) results.
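The second method in the abstract (replacing toxic words, with each mask allowed to expand into a variable number of words) can be illustrated with a toy sketch. This is not the authors' implementation: the lexicon `TOXIC_WORDS`, the lookup table `CANDIDATES`, and the functions `propose`, `prefer_short`, and `replace_toxic_words` are all hypothetical stand-ins. In a real system a masked language model such as BERT would both propose and score the replacements; here a hand-written table and a trivial scoring function take its place so that the control flow is runnable on its own.

```python
from typing import Callable, Dict, List, Set, Tuple

# A candidate replacement is a tuple of one OR MORE words, which is what
# lets a single toxic word be rewritten as a multi-word phrase.
Candidate = Tuple[str, ...]

# Hypothetical toy lexicon of words to replace (assumption, not from the paper).
TOXIC_WORDS: Set[str] = {"stupid", "idiotic"}

# Hypothetical candidate table standing in for a masked LM's proposals.
CANDIDATES: Dict[str, List[Candidate]] = {
    "stupid": [("silly",), ("poorly", "thought", "out")],
    "idiotic": [("unwise",), ("not", "very", "sensible")],
}


def propose(word: str) -> List[Candidate]:
    """Return replacement candidates for a word (empty list if none known)."""
    return CANDIDATES.get(word, [])


def prefer_short(candidate: Candidate) -> float:
    """Toy scorer that favours shorter replacements (stand-in for a model score)."""
    return -float(len(candidate))


def replace_toxic_words(
    sentence: str,
    toxic_words: Set[str],
    propose: Callable[[str], List[Candidate]],
    score: Callable[[Candidate], float],
) -> str:
    """Replace each toxic token with its highest-scoring candidate phrase."""
    rewritten: List[str] = []
    for token in sentence.split():
        core = token.rstrip(".,!?")   # word without trailing punctuation
        tail = token[len(core):]      # the punctuation that was stripped
        key = core.lower()
        options = propose(key) if key in toxic_words else []
        if options:
            best = max(options, key=score)
            rewritten.append(" ".join(best) + tail)
        else:
            rewritten.append(token)   # non-toxic tokens pass through unchanged
    return " ".join(rewritten)
```

Calling `replace_toxic_words("That was a stupid idea.", TOXIC_WORDS, propose, prefer_short)` picks the one-word candidate, while passing `len` as the scorer picks the three-word phrase instead. The point of the sketch is only that the number of output words per replaced token is decided by the scorer rather than fixed at one.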




Prompt-and-Rerank: A Method for Zero-Shot and Few-Shot Arbitrary Textual Style Transfer with Small Language Models

We propose a method for arbitrary textual style transfer (TST)–the task ...

Exploring Code Style Transfer with Neural Networks

Style is a significant component of natural language text, reflecting a ...

Improving Formality Style Transfer with Context-Aware Rule Injection

Models pre-trained on large-scale regular text corpora often do not work...

LEWIS: Levenshtein Editing for Unsupervised Text Style Transfer

Many types of text style transfer can be achieved with only small, preci...

Unsupervised Text Style Transfer with Padded Masked Language Models

We propose Masker, an unsupervised text-editing method for style transfe...

How Sequence-to-Sequence Models Perceive Language Styles?

Style is ubiquitous in our daily language uses, while what is language s...

Multi-dimensional Style Transfer for Partially Annotated Data using Language Models as Discriminators

Style transfer has been widely explored in natural language generation w...