Interview: A Large-Scale Open-Source Corpus of Media Dialog

Existing conversational datasets consist either of written proxies for dialog or small-scale transcriptions of natural speech. We introduce 'Interview': a large-scale (105K conversations) media dialog dataset collected from news interview transcripts. Compared to existing large-scale proxies for conversational data, language models trained on our dataset exhibit better zero-shot out-of-domain performance on existing spoken dialog datasets, demonstrating its usefulness in modeling real-world conversations. 'Interview' contains speaker role annotations for each turn, facilitating the development of engaging, responsive dialog systems. In fact, experiments on two dialog tasks show that leveraging such labels improves performance over strong speaker-agnostic baselines and enables models to generate more specific and inquisitive responses in interview-style conversations.

research
05/14/2021

Generating Empathetic Responses with a Large Scale Dialog Dataset

The task of empathetic response generation aims at generating syntactica...
research
05/22/2022

AFEC: A Knowledge Graph Capturing Social Intelligence in Casual Conversations

This paper introduces AFEC, an automatically curated knowledge graph bas...
research
04/27/2023

q2d: Turning Questions into Dialogs to Teach Models How to Search

One of the exciting capabilities of recent language models for dialog is...
research
11/21/2022

CGoDial: A Large-Scale Benchmark for Chinese Goal-oriented Dialog Evaluation

Practical dialog systems need to deal with various knowledge sources, no...
research
05/19/2020

Prototypical Q Networks for Automatic Conversational Diagnosis and Few-Shot New Disease Adaption

Spoken dialog systems have seen applications in many domains, including ...
research
12/15/2021

ErAConD : Error Annotated Conversational Dialog Dataset for Grammatical Error Correction

Currently available grammatical error correction (GEC) datasets are comp...
research
08/31/2023

Conversational Swarm Intelligence, a Pilot Study

Conversational Swarm Intelligence (CSI) is a new method for enabling lar...
