Modeling Performance in Open-Domain Dialogue with PARADISE

10/21/2021
by Marilyn Walker, et al.

There has recently been an explosion of work on spoken dialogue systems, along with increased interest in open-domain systems that engage in casual conversations on popular topics such as movies, books, and music. These systems aim to socially engage, entertain, and even empathize with their users. Since the achievement of such social goals is hard to measure, recent research has used dialogue length or human ratings as evaluation metrics, and has developed methods for automatically calculating novel metrics such as coherence, consistency, relevance, and engagement. Here we develop a PARADISE model for predicting the performance of Athena, a dialogue system that has participated in thousands of conversations with real users while competing as a finalist in the Alexa Prize. We use both user ratings and dialogue length as metrics for dialogue quality, and experiment with predicting these metrics using automatic features that are both system-dependent and system-independent. Our goal is to learn a general objective function that can be used to optimize the dialogue choices of any Alexa Prize system in real time and to evaluate its performance. Our best model for predicting user ratings achieves an R^2 of 0.136 with a DistilBert model, and the best model for predicting length with system-independent features achieves an R^2 of 0.865, suggesting that conversation length may be a more reliable measure for automatic training of dialogue systems.
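
As a rough illustration of what predicting a quality metric from automatic features and reporting R^2 involves, the sketch below fits an ordinary least-squares regression and scores it on held-out data. It is not the authors' pipeline: the abstract's best rating model uses DistilBert, whereas this swaps in a plain linear model for brevity. The three feature columns, the synthetic data, and the function names `fit_linear`, `predict`, and `r_squared` are all assumptions made for the example.

```python
import numpy as np


def fit_linear(X, y):
    """Fit ordinary least squares with an intercept; return [bias, w1, ..., wk]."""
    X1 = np.column_stack([np.ones(len(X)), X])
    w, *_ = np.linalg.lstsq(X1, y, rcond=None)
    return w


def predict(w, X):
    """Apply the fitted weights (intercept first) to a feature matrix."""
    X1 = np.column_stack([np.ones(len(X)), X])
    return X1 @ w


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return 1.0 - ss_res / ss_tot


# Synthetic stand-in data (NOT from the paper): 500 hypothetical conversations,
# each described by three made-up automatic features.
rng = np.random.default_rng(0)
n = 500
X = rng.normal(size=(n, 3))
y = 3.0 + X @ np.array([0.5, -0.2, 0.1]) + rng.normal(scale=1.0, size=n)

# Fit on the first 400 conversations, evaluate on the remaining 100.
train, test = slice(0, 400), slice(400, n)
w = fit_linear(X[train], y[train])
r2 = r_squared(y[test], predict(w, X[test]))
print(f"held-out R^2: {r2:.3f}")
```

Under this definition, an R^2 of 0 means the model does no better than always predicting the mean of the target, and 1 means a perfect fit, which is the scale on which the reported 0.136 and 0.865 can be read.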


research
05/06/2021

Assessing Dialogue Systems with Distribution Distances

An important aspect of developing dialogue systems is how to evaluate an...
research
11/04/2019

Predictive Engagement: An Efficient Metric For Automatic Evaluation of Open-Domain Dialogue Systems

User engagement is a critical metric for evaluating the quality of open-...
research
11/02/2018

Neural Response Ranking for Social Conversation: A Data-Efficient Approach

The overall objective of 'social' dialogue systems is to support engagin...
research
06/17/2020

Is this Dialogue Coherent? Learning from Dialogue Acts and Entities

In this work, we investigate the human perception of coherence in open-d...
research
09/23/2019

Towards Best Experiment Design for Evaluating Dialogue System Output

To overcome the limitations of automated metrics (e.g. BLEU, METEOR) for...
research
03/10/2023

Rewarding Chatbots for Real-World Engagement with Millions of Users

The emergence of pretrained large language models has led to the deploym...
research
08/07/2022

When can I Speak? Predicting initiation points for spoken dialogue agents

Current spoken dialogue systems initiate their turns after a long period...
