HarperValleyBank: A Domain-Specific Spoken Dialog Corpus

10/26/2020
by Mike Wu, et al.

We introduce HarperValleyBank, a free, public domain spoken dialog corpus. The data simulate simple consumer banking interactions and contain about 23 hours of audio from 1,446 human-human conversations among 59 unique speakers. We selected intents and utterance templates to allow realistic variation while controlling overall task complexity and limiting vocabulary size to about 700 unique words. We provide audio data along with transcripts and annotations for speaker ID, caller intent, dialog actions, and emotional valence. The size and domain specificity of these data make for quick experiments with modern end-to-end neural approaches. We also provide baselines for representation learning and transfer tasks. These experiments adapt recent work to embed utterances and use the resulting representations in prediction tasks. Our experiments show that tasks using our annotations are sensitive to both model choice and corpus size for representation learning approaches.

