A Review of Evaluation Techniques for Social Dialogue Systems

09/13/2017
by   Amanda Cercas Curry, et al.
0

In contrast with goal-oriented dialogue, social dialogue has no clear measure of task success. Consequently, evaluation of these systems is notoriously hard. In this paper, we review current evaluation methods, focusing on automatic metrics. We conclude that turn-based metrics often ignore the context and do not account for the fact that several replies are valid, while end-of-dialogue rewards are mainly hand-crafted. Both lack grounding in human perceptions.

READ FULL TEXT

page 1

page 2

research
05/06/2021

Assessing Dialogue Systems with Distribution Distances

An important aspect of developing dialogue systems is how to evaluate an...
research
09/06/2019

ACUTE-EVAL: Improved Dialogue Evaluation with Optimized Questions and Multi-turn Comparisons

While dialogue remains an important end-goal of natural language researc...
research
08/03/2021

How to Evaluate Your Dialogue Models: A Review of Approaches

Evaluating the quality of a dialogue system is an understudied problem. ...
research
09/29/2017

The First Evaluation of Chinese Human-Computer Dialogue Technology

In this paper, we introduce the first evaluation of Chinese human-comput...
research
09/14/2023

Exploring the Impact of Human Evaluator Group on Chat-Oriented Dialogue Evaluation

Human evaluation has been widely accepted as the standard for evaluating...
research
08/24/2020

How To Evaluate Your Dialogue System: Probe Tasks as an Alternative for Token-level Evaluation Metrics

Though generative dialogue modeling is widely seen as a language modelin...

Please sign up or login with your details

Forgot password? Click here to reset