Domain-Independent turn-level Dialogue Quality Evaluation via User Satisfaction Estimation

08/19/2019
by Praveen Kumar Bodigutla, et al.

An automated metric to evaluate dialogue quality is vital for optimizing data-driven dialogue management. The common approach of relying on explicit user feedback during a conversation is intrusive, and the feedback it yields is sparse. Current models to estimate user satisfaction use limited feature sets and rely on annotation schemes with low inter-rater reliability, limiting generalizability to conversations spanning multiple domains. To address these gaps, we created a new Response Quality annotation scheme, based on which we developed a turn-level User Satisfaction metric. We introduced five new domain-independent feature sets and experimented with six machine learning models to estimate the new satisfaction metric. Using the Response Quality annotation scheme, across randomly sampled single and multi-turn conversations from 26 domains, we achieved high inter-annotator agreement (Spearman's rho 0.94). The Response Quality labels were highly correlated (0.76) with explicit turn-level user ratings. Gradient boosting regression achieved the best correlation of 0.79 between predicted and annotated user satisfaction labels. Multi-Layer Perceptron and Gradient Boosting regression models generalized to an unseen domain better (linear correlation 0.67) than other models. Finally, our ablation study verified that our novel features significantly improved model performance.
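
The abstract reports two kinds of agreement figures: Spearman's rho for inter-annotator agreement and a linear correlation between predicted and annotated satisfaction labels. As a rough illustration of how such figures can be computed, the following is a minimal sketch in plain Python. The function names and the toy ratings are hypothetical, the rating scale is an assumption (the abstract does not state one), and this is not the authors' evaluation code.

```python
from math import sqrt


def pearson(x, y):
    """Linear (Pearson) correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (std_x * std_y)


def average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical toy data, not taken from the paper: annotated turn-level
# labels and model predictions on an assumed 1-5 scale.
annotated = [1, 2, 2, 4, 5, 3]
predicted = [1.4, 2.1, 2.6, 3.8, 4.7, 3.0]

if __name__ == "__main__":
    print("linear correlation:", round(pearson(annotated, predicted), 3))
    print("Spearman's rho:", round(spearman(annotated, predicted), 3))
```

Spearman's rho depends only on the ordering of the ratings, which suits ordinal annotation labels, while the linear correlation also reflects how closely the predicted values track the annotated ones in magnitude.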

research
11/18/2019

Multi-domain Conversation Quality Evaluation via User Satisfaction Estimation

An automated metric to evaluate dialogue quality is vital for optimizing...
research
10/06/2020

Joint Turn and Dialogue level User Satisfaction Estimation on Multi-Domain Conversations

Dialogue level quality estimation is vital for optimizing data driven di...
research
03/01/2021

A Data-driven Approach to Estimate User Satisfaction in Multi-turn Dialogues

The evaluation of multi-turn dialogues remains challenging. The common a...
research
05/08/2021

Simulating User Satisfaction for the Evaluation of Task-oriented Dialogue Systems

Evaluation is crucial in the development process of task-oriented dialog...
research
12/05/2022

A Transformer-Based User Satisfaction Prediction for Proactive Interaction Mechanism in DuerOS

Recently, spoken dialogue systems have been widely deployed in a variety...
research
01/13/2021

Is the User Enjoying the Conversation? A Case Study on the Impact on the Reward Function

The impact of user satisfaction in policy learning task-oriented dialogu...
research
09/17/2021

A Role-Selected Sharing Network for Joint Machine-Human Chatting Handoff and Service Satisfaction Analysis

Chatbot is increasingly thriving in different domains, however, because ...