online_dialog_eval
Online Dialog Evaluation Metric - ACL Submission
Evaluating the quality of a dialogue interaction between two agents is a difficult task, especially in open-domain chit-chat style dialogue. There have been recent efforts to develop automatic dialogue evaluation metrics, but most of them do not generalize to unseen datasets and/or need a human-generated reference response during inference, making it infeasible for online evaluation. Here, we propose an unreferenced automated evaluation metric that uses large pre-trained language models to extract latent representations of utterances, and leverages the temporal transitions that exist between them. We show that our model achieves higher correlation with human annotations in an online setting, while not requiring true responses for comparison during inference.
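Below is a minimal sketch of the idea described in the abstract: a pre-trained language model supplies latent representations of utterances, and a simple transition score measures how plausibly a candidate response follows the dialogue context, with no reference response needed. The model name, mean pooling, and cosine-based transition score are illustrative assumptions, not the paper's actual architecture or training objective.

```python
# Sketch of an unreferenced dialogue scorer: encode context and response with a
# pre-trained encoder, then score the context-to-response transition.
# Assumptions: bert-base-uncased as the encoder, mean pooling, cosine similarity.
import torch
from transformers import AutoModel, AutoTokenizer

MODEL_NAME = "bert-base-uncased"  # assumption: any pre-trained encoder could be used
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
encoder = AutoModel.from_pretrained(MODEL_NAME)
encoder.eval()


def encode(utterance: str) -> torch.Tensor:
    """Mean-pool the encoder's last hidden states into a single utterance vector."""
    inputs = tokenizer(utterance, return_tensors="pt", truncation=True)
    with torch.no_grad():
        hidden = encoder(**inputs).last_hidden_state  # (1, seq_len, dim)
    return hidden.mean(dim=1).squeeze(0)              # (dim,)


def transition_score(context: str, response: str) -> float:
    """Score how plausibly `response` follows `context` (no reference response used)."""
    c, r = encode(context), encode(response)
    return torch.nn.functional.cosine_similarity(c, r, dim=0).item()


if __name__ == "__main__":
    ctx = "Hi! How was your weekend?"
    print(transition_score(ctx, "It was great, I went hiking with friends."))
    print(transition_score(ctx, "The capital of France is Paris."))
```

In practice, a learned transition model trained on observed utterance pairs would replace the raw cosine similarity, but the sketch illustrates the key property claimed in the abstract: the score is computed online from the context and candidate response alone.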