Multi-domain Conversation Quality Evaluation via User Satisfaction Estimation

11/18/2019
by Praveen Kumar Bodigutla, et al.

An automated metric to evaluate dialogue quality is vital for optimizing data-driven dialogue management. The common approach of relying on explicit user feedback during a conversation is intrusive, and the feedback it yields is sparse. Current models to estimate user satisfaction use limited feature sets and employ annotation schemes that generalize poorly to conversations spanning multiple domains. To address these gaps, we created a new Response Quality annotation scheme, introduced five new domain-independent feature sets, and experimented with six machine learning models to estimate User Satisfaction at both the turn and dialogue levels. Response Quality ratings achieved a significantly high correlation (0.76) with explicit turn-level user ratings. Using the new feature sets we introduced, the Gradient Boosting Regression model achieved the best rating (on a 1-5 scale) prediction performance on 26 seen domains (linear correlation 0.79) and one new multi-turn domain (linear correlation 0.67). We observed a 16% relative improvement in binary ("satisfactory/dissatisfactory") class prediction accuracy of a domain-independent dialogue-level satisfaction estimation model after including predicted turn-level satisfaction ratings as features.
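
The two evaluation measures named in the abstract, linear correlation between predicted and actual 1-5 ratings and binary "satisfactory/dissatisfactory" accuracy, can be illustrated with a minimal sketch. This is not the authors' code: the function names, the toy ratings, and the cutoff of 3.0 for "satisfactory" are all assumptions made for illustration, since the abstract does not state how ratings are mapped to the two classes.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson (linear) correlation between two equal-length rating lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 items")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant ratings")
    return cov / sqrt(var_x * var_y)


def is_satisfactory(rating, threshold=3.0):
    """Map a 1-5 rating to a binary class.

    The threshold is an assumption; the abstract does not give the cutoff.
    """
    return rating >= threshold


def binary_accuracy(predicted, actual, threshold=3.0):
    """Fraction of items whose predicted and actual binary classes agree."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("need two non-empty lists of equal length")
    hits = sum(
        is_satisfactory(p, threshold) == is_satisfactory(a, threshold)
        for p, a in zip(predicted, actual)
    )
    return hits / len(predicted)


if __name__ == "__main__":
    # Toy ratings, invented purely for illustration.
    predicted = [1.2, 4.5, 2.9, 3.1]
    actual = [1, 5, 4, 2]
    print("linear correlation:", round(pearson(predicted, actual), 3))
    print("binary accuracy:", binary_accuracy(predicted, actual))
```

In this framing, a regression model's output is scored with `pearson`, while a dialogue-level classifier is scored with `binary_accuracy`; both helpers are hypothetical and use only the standard library.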

research
08/19/2019

Domain-Independent turn-level Dialogue Quality Evaluation via User Satisfaction Estimation

An automated metric to evaluate dialogue quality is vital for optimizing...
research
10/06/2020

Joint Turn and Dialogue level User Satisfaction Estimation on Multi-Domain Conversations

Dialogue level quality estimation is vital for optimizing data driven di...
research
03/01/2021

A Data-driven Approach to Estimate User Satisfaction in Multi-turn Dialogues

The evaluation of multi-turn dialogues remains challenging. The common a...
research
10/31/2021

What Went Wrong? Explaining Overall Dialogue Quality through Utterance-Level Impacts

Improving user experience of a dialogue system often requires intensive ...
research
04/26/2022

Understanding User Satisfaction with Task-oriented Dialogue Systems

Dialogue systems are evaluated depending on their type and purpose. Two ...
research
12/05/2022

A Transformer-Based User Satisfaction Prediction for Proactive Interaction Mechanism in DuerOS

Recently, spoken dialogue systems have been widely deployed in a variety...
research
06/10/2021

Curiously Effective Features for Image Quality Prediction

The performance of visual quality prediction models is commonly assumed ...
