FinChat: Corpus and evaluation setup for Finnish chat conversations on everyday topics

08/19/2020
by Katri Leino, et al.

Creating open-domain chatbots requires large amounts of conversational data and related benchmark tasks to evaluate them. Standardized evaluation tasks are crucial for creating automatic evaluation metrics for model development; otherwise, comparing the models would require resource-expensive human evaluation. While chatbot challenges have recently managed to provide a plethora of such resources for English, resources in other languages are not yet available. In this work, we provide a starting point for Finnish open-domain chatbot research. We describe our collection efforts to create the Finnish chat conversation corpus FinChat, which is made available publicly. FinChat includes unscripted conversations on seven topics from people of different ages. Using this corpus, we also construct a retrieval-based evaluation task for Finnish chatbot development. We observe that off-the-shelf chatbot models trained on conversational corpora do not perform better than chance at choosing the right answer based on automatic metrics, while humans can do the same task almost perfectly. Similarly, in a human evaluation, responses to questions from the evaluation set generated by the chatbots are predominantly marked as incoherent. Thus, FinChat provides a challenging evaluation set, meant to encourage chatbot development in Finnish.
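To make the retrieval-based evaluation concrete, the sketch below shows how such a task is typically scored: a model assigns a score to each candidate response given the conversation context, and accuracy (recall@1) is the fraction of examples where the true response is ranked first. The data format, field names, and random-baseline scorer here are illustrative assumptions, not the paper's actual FinChat format or models; the random baseline simply makes the chance level of 1/N for N candidates explicit.

```python
# Minimal sketch of retrieval-based chatbot evaluation (recall@1).
# The example format below is a hypothetical stand-in, not the
# actual FinChat data schema.
import random
from typing import Callable, Dict, List


def recall_at_1(
    examples: List[Dict],
    score: Callable[[str, str], float],
) -> float:
    """Fraction of examples where the true response is ranked first."""
    hits = 0
    for ex in examples:
        context = ex["context"]        # conversation history as text
        candidates = ex["candidates"]  # N candidate responses
        gold_index = ex["gold_index"]  # index of the true response
        scores = [score(context, c) for c in candidates]
        if scores.index(max(scores)) == gold_index:
            hits += 1
    return hits / len(examples)


def random_scorer(context: str, candidate: str) -> float:
    # Chance-level baseline: with N candidates, expected recall@1
    # is 1/N, the bar the paper's off-the-shelf models fail to beat.
    return random.random()


if __name__ == "__main__":
    toy = [
        {
            "context": "Mitä kuuluu?",
            "candidates": ["Hyvää, kiitos!", "Seitsemän.", "Kissa."],
            "gold_index": 0,
        }
    ]
    print(recall_at_1(toy, random_scorer))
```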



Related research

06/12/2020 | Speaker Sensitive Response Evaluation Model
Automatic evaluation of open-domain dialogue response generation is very...

08/23/2023 | Topical-Chat: Towards Knowledge-Grounded Open-Domain Conversations
Building socialbots that can have deep, engaging open-domain conversatio...

08/13/2021 | Low-Resource Adaptation of Open-Domain Generative Chatbots
Recent work building open-domain chatbots has demonstrated that increasi...

01/12/2022 | Human Evaluation of Conversations is an Open Problem: comparing the sensitivity of various methods for evaluating dialogue agents
At the heart of improving conversational AI is the open problem of how t...

09/18/2018 | Talking to myself: self-dialogues as data for conversational agents
Conversational agents are gaining popularity with the increasing ubiquit...

11/24/2022 | How "open" are the conversations with open-domain chatbots? A proposal for Speech Event based evaluation
Open-domain chatbots are supposed to converse freely with humans without...

05/05/2022 | Balancing Multi-Domain Corpora Learning for Open-Domain Response Generation
Open-domain conversational systems are assumed to generate equally good ...
