SocialDial: A Benchmark for Socially-Aware Dialogue Systems

04/24/2023
by Haolan Zhan, et al.

Dialogue systems have been widely applied in many scenarios and are now more powerful and ubiquitous than ever before. With large neural models and massive amounts of available data, current dialogue systems have access to more knowledge than any person could acquire in a lifetime. However, current dialogue systems still do not perform at a human level. One major gap between conversational agents and humans lies in their ability to be aware of social norms. The development of socially-aware dialogue systems is impeded by a lack of resources. In this paper, we present the first socially-aware dialogue corpus, SocialDial, based on Chinese social culture. SocialDial consists of two parts: 1,563 multi-turn dialogues between two human speakers with fine-grained labels, and 4,870 synthetic conversations generated by ChatGPT. The human corpus covers five categories of social norms, which have 14 sub-categories in total. Specifically, it contains social factor annotations including social relation, context, social distance, and social norms. However, collecting sufficient socially-aware dialogues is costly. Thus, we harness the power of ChatGPT and devise an ontology-based synthetic data generation framework, which is able to generate synthetic data at scale. To ensure the quality of the synthetic dialogues, we design several mechanisms for quality control during data collection. Finally, we evaluate our dataset using several pre-trained models, such as BERT and RoBERTa. Comprehensive empirical results based on state-of-the-art neural models demonstrate that modeling social norms for dialogue systems is a promising research direction. To the best of our knowledge, SocialDial is the first socially-aware dialogue dataset that covers multiple social factors and has fine-grained labels.

research
11/17/2020

KddRES: A Multi-level Knowledge-driven Dialogue Dataset for Restaurant Towards Customized Dialogue System

Compared with CrossWOZ (Chinese) and MultiWOZ (English) dataset which ha...
research
11/01/2020

Social Chemistry 101: Learning to Reason about Social and Moral Norms

Social norms—the unspoken commonsense rules about acceptable social beha...
research
06/14/2023

LiveChat: A Large-Scale Personalized Dialogue Dataset Automatically Constructed from Live Streaming

Open-domain dialogue systems have made promising progress in recent year...
research
03/02/2021

Conversational Norms for Human-Robot Dialogues

This paper describes a recently initiated research project aiming at sup...
research
09/01/2021

M^2-MedDialog: A Dataset and Benchmarks for Multi-domain Multi-service Medical Dialogues

Medical dialogue systems (MDSs) aim to assist doctors and patients with ...
research
09/20/2021

PLATO-XL: Exploring the Large-scale Pre-training of Dialogue Generation

To explore the limit of dialogue generation pre-training, we present the...
research
05/26/2023

NormBank: A Knowledge Bank of Situational Social Norms

We present NormBank, a knowledge bank of 155k situational norms. This re...
