HERALD: An Annotation Efficient Method to Detect User Disengagement in Social Conversations

06/01/2021
by   Weixin Liang, et al.

Open-domain dialog systems have a user-centric goal: to provide humans with an engaging conversation experience. User engagement is one of the most important metrics for evaluating open-domain dialog systems, and it can also serve as real-time feedback for dialog policy learning. Existing work on detecting user disengagement typically requires hand-labeling many dialog samples. We propose HERALD, an annotation-efficient framework that reframes the annotation of training data as a denoising problem. Specifically, instead of manually labeling training samples, we first use a set of labeling heuristics to label them automatically. We then denoise the weakly labeled data using the Shapley algorithm. Finally, we use the denoised data to train a user engagement detector. Our experiments show that HERALD improves annotation efficiency significantly and achieves 86% detection accuracy on two dialog corpora.
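
The first two steps of this pipeline (heuristic labeling, then Shapley-based denoising) can be illustrated with a minimal Python sketch. The abstract names only "the Shapley algorithm", so several details here are assumptions rather than the authors' implementation: the keyword and length rules in `weak_label` are hypothetical, the data values are computed with the closed-form K-nearest-neighbor Shapley recursion over a small clean validation set, and `denoise` flips every label whose value is negative. All function and variable names are invented for illustration.

```python
import math

# Hypothetical labeling heuristics (not taken from the paper):
# label 1 = disengaged, label 0 = engaged.
NEGATIVE_CUES = ("boring", "stop", "whatever")


def weak_label(utterance):
    """Assign a noisy label from simple keyword and length rules."""
    text = utterance.lower()
    if any(cue in text for cue in NEGATIVE_CUES):
        return 1
    # Assumption: a reply of at most one word signals disengagement.
    return 1 if len(text.split()) <= 1 else 0


def knn_shapley(train_x, train_y, val_x, val_y, k=3):
    """Average KNN-Shapley value of each training point over a validation set.

    Feature vectors are equal-length tuples of floats.
    """
    n = len(train_x)
    values = [0.0] * n
    for x, y in zip(val_x, val_y):
        # Rank training points from nearest to farthest from this validation point.
        order = sorted(range(n), key=lambda i: math.dist(train_x[i], x))
        match = [1.0 if train_y[i] == y else 0.0 for i in order]
        s = [0.0] * n
        s[n - 1] = match[n - 1] / n  # base case: the farthest point
        for j in range(n - 2, -1, -1):
            rank = j + 1  # 1-indexed rank of the point at position j
            s[j] = s[j + 1] + (match[j] - match[j + 1]) / k * min(k, rank) / rank
        for pos, i in enumerate(order):
            values[i] += s[pos] / len(val_x)
    return values


def denoise(train_y, values):
    """Flip the binary label of every training point with a negative value."""
    return [1 - y if v < 0 else y for y, v in zip(train_y, values)]
```

In this sketch, a weakly labeled point that disagrees with the validation labels of its nearest neighbors tends to receive a negative value and has its label flipped. The relabeled data would then be used to train any standard classifier as the engagement detector, which is not shown here.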

