Doc2Bot: Accessing Heterogeneous Documents via Conversational Bots

10/20/2022
by Haomin Fu, et al.

This paper introduces Doc2Bot, a novel dataset for building machines that help users seek information via conversations. This is of particular interest for companies and organizations that own a large number of manuals or instruction books. Despite its potential, the nature of our task poses several challenges: (1) documents contain various structures that hinder machine comprehension, and (2) user information needs are often underspecified. Compared to prior datasets that either focus on a single structural type or overlook the role of questioning to uncover user needs, the Doc2Bot dataset is developed to target such challenges systematically. Our dataset contains over 100,000 turns based on Chinese documents from five domains, larger than any prior document-grounded dialog dataset for information seeking. We propose three tasks in Doc2Bot: (1) dialog state tracking to track user intentions, (2) dialog policy learning to plan system actions and contents, and (3) response generation, which generates responses based on the outputs of the dialog policy. We present baseline methods based on recent deep learning models; their results indicate that the proposed tasks are challenging and worthy of further research.

research
06/17/2022

CookDial: A dataset for task-oriented dialogs grounded in procedural documents

This work presents a new dialog dataset, CookDial, that facilitates rese...
research
09/19/2018

A Dataset for Document Grounded Conversations

This paper introduces a document grounded dataset for text conversations...
research
07/28/2022

Interactive Evaluation of Dialog Track at DSTC9

The ultimate goal of dialog research is to develop systems that can be e...
research
09/03/2019

CMU GetGoing: An Understandable and Memorable Dialog System for Seniors

Voice-based technologies are typically developed for the average user, a...
research
10/05/2022

CorefDiffs: Co-referential and Differential Knowledge Flow in Document Grounded Conversations

Knowledge-grounded dialog systems need to incorporate smooth transitions...
research
05/01/2020

Information Seeking in the Spirit of Learning: a Dataset for Conversational Curiosity

Open-ended human learning and information-seeking are increasingly media...
research
05/06/2020

Building A User-Centric and Content-Driven Socialbot

To build Sounding Board, we develop a system architecture that is capabl...
