DiffuDetox: A Mixed Diffusion Model for Text Detoxification

06/14/2023
by Griffin Floto, et al.

Text detoxification is a conditional text generation task that aims to remove offensive content from toxic text. It is highly useful for online forums and social media, where offensive content is frequently encountered. Intuitively, there are diverse ways to detoxify a sentence while preserving its meaning, and we can select among the detoxified candidates before displaying text to users. Conditional diffusion models are particularly suitable for this task because they have demonstrated higher generative diversity than existing conditional text generation models based on language models. However, the fluency of their output declines when they are trained on insufficient data, as is the case for this task. In this work, we propose DiffuDetox, a mixed conditional and unconditional diffusion model for text detoxification. The conditional model takes toxic text as the condition and reduces its toxicity, yielding a diverse set of detoxified sentences. The unconditional model is trained to recover the input text, which allows additional fluent text to be introduced for training and thus helps preserve text fluency. Extensive experimental results and in-depth analysis demonstrate the effectiveness of the proposed DiffuDetox.
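
One way to picture the mixed setup described in the abstract is as a single pool of training examples, some carrying a toxic sentence as the condition and some carrying no condition at all. The sketch below is only an illustration under stated assumptions: the `TrainingExample` type, the `build_mixed_examples` function, and the use of `None` to mark a missing condition are hypothetical names and choices, not taken from the paper, and the diffusion model itself is omitted.

```python
import random
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class TrainingExample:
    # Hypothetical container: condition is the toxic source text,
    # or None when the example belongs to the unconditional objective.
    condition: Optional[str]
    target: str  # text the model should learn to produce


def build_mixed_examples(
    parallel_pairs: List[Tuple[str, str]],
    fluent_texts: List[str],
    seed: int = 0,
) -> List[TrainingExample]:
    """Combine conditional and unconditional examples into one shuffled list."""
    # Conditional objective: toxic text is the condition, detoxified text the target.
    conditional = [
        TrainingExample(condition=toxic, target=clean)
        for toxic, clean in parallel_pairs
    ]
    # Unconditional objective: no condition, the model only recovers the text.
    # This is where extra fluent text without a toxic counterpart can be used.
    unconditional = [
        TrainingExample(condition=None, target=text) for text in fluent_texts
    ]
    examples = conditional + unconditional
    random.Random(seed).shuffle(examples)  # seeded for reproducibility
    return examples


# Placeholder strings stand in for real data.
pairs = [("<toxic sentence>", "<detoxified sentence>")]
extra = ["<fluent sentence A>", "<fluent sentence B>"]
mixed = build_mixed_examples(pairs, extra)
```

The point of the sketch is the asymmetry in data requirements: conditional examples need scarce toxic/detoxified pairs, while unconditional examples can be drawn from any fluent text.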

research
04/25/2023

RenderDiffusion: Text Generation as Image Generation

Diffusion models have become a new generative paradigm for text generati...
research
03/12/2023

Diffusion Models for Non-autoregressive Text Generation: A Survey

Non-autoregressive (NAR) text generation has attracted much attention in...
research
02/11/2023

A Reparameterized Discrete Diffusion Model for Text Generation

This work studies discrete diffusion probabilistic models with applicati...
research
09/11/2019

CTRL: A Conditional Transformer Language Model for Controllable Generation

Large-scale language models show promising text generation capabilities,...
research
01/02/2021

The Truth is Out There: Investigating Conspiracy Theories in Text Generation

With the growing adoption of text generation models in today's society, ...
research
11/10/2019

Stylized Text Generation Using Wasserstein Autoencoders with a Mixture of Gaussian Prior

Wasserstein autoencoders are effective for text generation. They do not ...
research
10/01/2019

TMLab: Generative Enhanced Model (GEM) for adversarial attacks

We present our Generative Enhanced Model (GEM) that we used to create sa...
