Facilitating NSFW Text Detection in Open-Domain Dialogue Systems via Knowledge Distillation

09/18/2023
by Huachuan Qiu, et al.

NSFW (Not Safe for Work) content, in the context of a dialogue, can have severe side effects on users in open-domain dialogue systems. However, research on detecting NSFW language, especially sexually explicit content, within a dialogue context has lagged significantly behind. To address this issue, we introduce CensorChat, a dialogue monitoring dataset aimed at NSFW dialogue detection. Built with knowledge distillation techniques involving GPT-4 and ChatGPT, this dataset offers a cost-effective means of constructing NSFW content detectors. The process entails collecting real-life human-machine interaction data and breaking it down into single utterances and single-turn dialogues, with the chatbot delivering the final utterance. ChatGPT is employed to annotate the unlabeled data, which then serves as the training set. Rationale validation and test sets are constructed using ChatGPT and GPT-4 as annotators, with a self-criticism strategy for resolving discrepancies in labeling. A BERT model is fine-tuned as a text classifier on the pseudo-labeled data, and its performance is assessed. The study emphasizes the importance of AI systems prioritizing user safety and well-being in digital conversations while respecting freedom of expression. The proposed approach not only advances NSFW content detection but also aligns with evolving user protection needs in AI-driven dialogues.
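
The decomposition step described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the data format, so everything here is an assumption made for illustration: a dialogue is represented as a list of (speaker, text) pairs, the speaker labels "user" and "bot" are hypothetical, and the function name `decompose` is not taken from the paper. The sketch splits a multi-turn dialogue into single utterances and into single-turn dialogues in which the chatbot delivers the final utterance.

```python
from typing import List, Tuple

# Assumed representation: (speaker, text), where speaker is "user" or "bot".
# These labels are hypothetical; the abstract does not define a data format.
Utterance = Tuple[str, str]


def decompose(dialogue: List[Utterance]) -> Tuple[List[str], List[Tuple[str, str]]]:
    """Split a multi-turn dialogue into single utterances and single-turn dialogues."""
    # Every utterance on its own, regardless of speaker.
    single_utterances = [text for _, text in dialogue]

    # Adjacent (user, bot) pairs, so the chatbot always delivers the final utterance.
    single_turns = [
        (prev_text, text)
        for (prev_speaker, prev_text), (speaker, text) in zip(dialogue, dialogue[1:])
        if prev_speaker == "user" and speaker == "bot"
    ]
    return single_utterances, single_turns


dialogue = [
    ("user", "Hi there."),
    ("bot", "Hello! How can I help?"),
    ("user", "Tell me a story."),
    ("bot", "Once upon a time..."),
]
utterances, turns = decompose(dialogue)
```

Under these assumptions, each item in `utterances` and each pair in `turns` would be one unit to be labeled by the annotating model.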
