Predictive Engagement: An Efficient Metric For Automatic Evaluation of Open-Domain Dialogue Systems

11/04/2019
by Sarik Ghazarian, et al.

User engagement is a critical metric for evaluating the quality of open-domain dialogue systems. Prior work has focused on conversation-level engagement, using heuristically constructed features such as the number of turns and the total duration of the conversation. In this paper, we investigate the possibility and efficacy of estimating utterance-level engagement and define a novel metric, predictive engagement, for automatic evaluation of open-domain dialogue systems. Our experiments demonstrate that (1) human annotators show high agreement when assessing utterance-level engagement scores, and (2) conversation-level engagement scores can be predicted from properly aggregated utterance-level engagement scores. Furthermore, we show that utterance-level engagement scores can be learned from data, and that incorporating them improves automatic evaluation metrics for open-domain dialogue systems, as shown by their correlation with human judgments. This suggests that predictive engagement can serve as real-time feedback for training better dialogue models.
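The aggregation idea from the abstract can be sketched in a few lines. This is a hypothetical illustration, not the paper's implementation: the per-utterance scores would come from a learned engagement model, and `mean` and `max` are just two plausible aggregation strategies for turning them into a conversation-level score.

```python
def conversation_engagement(utterance_scores, aggregate="mean"):
    """Combine per-utterance engagement scores (each in [0, 1])
    into a single conversation-level engagement score."""
    if not utterance_scores:
        raise ValueError("need at least one utterance score")
    if aggregate == "mean":
        return sum(utterance_scores) / len(utterance_scores)
    if aggregate == "max":
        return max(utterance_scores)
    raise ValueError(f"unknown aggregation: {aggregate}")

# Hypothetical scores an utterance-level engagement model might output
scores = [0.8, 0.6, 0.9, 0.7]
print(conversation_engagement(scores))         # mean of the four scores
print(conversation_engagement(scores, "max"))  # most engaging utterance
```

Because such a score is computed per utterance, it can in principle be produced while a conversation is still in progress, which is what makes the real-time feedback use case plausible.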


Related research

Modeling Performance in Open-Domain Dialogue with PARADISE (10/21/2021)
There has recently been an explosion of work on spoken dialogue systems,...

EnDex: Evaluation of Dialogue Engagingness at Scale (10/22/2022)
We propose EnDex, the first human-reaction based model to evaluate dialo...

What Went Wrong? Explaining Overall Dialogue Quality through Utterance-Level Impacts (10/31/2021)
Improving user experience of a dialogue system often requires intensive ...

Towards a Human-like Open-Domain Chatbot (01/27/2020)
We present Meena, a multi-turn open-domain chatbot trained end-to-end on...

Deploying Lifelong Open-Domain Dialogue Learning (08/18/2020)
Much of NLP research has focused on crowdsourced static datasets and the...

DCTM: Dilated Convolutional Transformer Model for Multimodal Engagement Estimation in Conversation (07/31/2023)
Conversational engagement estimation is posed as a regression problem, e...

DEAM: Dialogue Coherence Evaluation using AMR-based Semantic Manipulations (03/18/2022)
Automatic evaluation metrics are essential for the rapid development of ...
