Just Say No: Analyzing the Stance of Neural Dialogue Generation in Offensive Contexts

08/26/2021
by Ashutosh Baheti, et al.

Dialogue models trained on human conversations inadvertently learn to generate offensive responses. Moreover, models can insult anyone by agreeing with an offensive context. To understand the dynamics of contextually offensive language, we study the stance of dialogue model responses in offensive Reddit conversations. Specifically, we crowd-annotate ToxiChat, a new dataset of 2,000 Reddit threads and model responses labeled with offensive language and stance. Our analysis reveals that 42% of user responses agree with toxic comments; 3x their agreement with safe comments (13%). Pre-trained transformer-based classifiers fine-tuned on our dataset achieve 0.71 F1 for offensive labels and 0.53 Macro-F1 for stance labels. Finally, we analyze some existing controllable text generation (CTG) methods to mitigate the contextual offensive behavior of dialogue models. Compared to the baseline, our best CTG model obtains a 19% reduction in agreement with offensive context and 29% fewer offensive responses. This highlights the need for future work to characterize and analyze more forms of inappropriate behavior in dialogue models to help make them safer. Our code and corpus are available at https://github.com/abaheti95/ToxiChat .
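
For readers unfamiliar with the Macro-F1 metric reported for stance labels, the following is a minimal sketch of how it can be computed: per-class F1 scores are averaged with equal weight, so rare classes count as much as frequent ones. The label names (`agree`, `disagree`, `neutral`), the toy data, and the helper names `f1_per_class` and `macro_f1` are illustrative assumptions, not taken from the ToxiChat code or corpus.

```python
from collections import Counter


def f1_per_class(gold, pred):
    """Return a dict mapping each label to its F1 score."""
    labels = sorted(set(gold) | set(pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1          # correct prediction for label g
        else:
            fp[p] += 1          # p was predicted but is wrong
            fn[g] += 1          # g was the true label but was missed
    scores = {}
    for label in labels:
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0.0 when the denominator is 0
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores[label] = 2 * tp[label] / denom if denom else 0.0
    return scores


def macro_f1(gold, pred):
    """Unweighted mean of the per-class F1 scores."""
    scores = f1_per_class(gold, pred)
    return sum(scores.values()) / len(scores)


# Hypothetical stance labels for six responses (illustrative data only).
gold = ["agree", "disagree", "neutral", "neutral", "agree", "neutral"]
pred = ["agree", "neutral", "neutral", "neutral", "disagree", "neutral"]

print(f1_per_class(gold, pred))
print(macro_f1(gold, pred))
```

In this toy data the classifier never gets `disagree` right, so that class contributes an F1 of zero and pulls the macro average down sharply even though most predictions are correct. This is one reason a Macro-F1 can be low when minority stance classes are hard to predict.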


research
07/31/2023

A Benchmark for Understanding Dialogue Safety in Mental Health Support

Dialogue safety remains a pervasive challenge in open-domain human-machi...
research
12/04/2022

Constructing Highly Inductive Contexts for Dialogue Safety through Controllable Reverse Generation

Large pretrained language models can easily produce toxic or biased cont...
research
04/22/2022

FaithDial: A Faithful Benchmark for Information-Seeking Dialogue

The goal of information-seeking dialogue is to respond to seeker queries...
research
08/27/2018

An Auto-Encoder Matching Model for Learning Utterance-Level Semantic Dependency in Dialogue Generation

Generating semantically coherent responses is still a major challenge in...
research
09/27/2018

NEXUS Network: Connecting the Preceding and the Following in Dialogue Generation

Sequence-to-Sequence (seq2seq) models have become overwhelmingly popular...
research
08/25/2021

Viola: A Topic Agnostic Generate-and-Rank Dialogue System

We present Viola, an open-domain dialogue system for spoken conversation...
research
12/30/2020

Linguistic calibration through metacognition: aligning dialogue agent responses with expected correctness

Open-domain dialogue agents have vastly improved, but still confidently ...
