HeySQuAD: A Spoken Question Answering Dataset

04/26/2023
by   Yijing Wu, et al.

Human-spoken questions are critical to evaluating the performance of spoken question answering (SQA) systems that serve several real-world use cases, including digital assistants. We present a new large-scale community-shared SQA dataset, HeySQuAD, that consists of 76k human-spoken questions and 97k machine-generated questions, with corresponding textual answers derived from the SQuAD QA dataset. The goal of HeySQuAD is to measure the ability of machines to understand noisy spoken questions and answer them accurately. To this end, we run extensive benchmarks on the human-spoken and machine-generated questions to quantify the differences in noise from both sources and their subsequent impact on the model and answering accuracy. Importantly, for the task of SQA, where we want to answer human-spoken questions, we observe that training using the transcribed human-spoken and original SQuAD questions leads to significant improvements (12.51%) over training using only the original SQuAD textual questions.

research
10/18/2020

Towards Data Distillation for End-to-end Spoken Conversational Question Answering

In spoken question answering, QA systems are designed to answer question...
research
09/26/2019

Spoken Conversational Search for General Knowledge

We present a spoken conversational question answering proof of concept t...
research
09/28/2020

Joint Spatio-Textual Reasoning for Answering Tourism Questions

Our goal is to answer real-world tourism questions that seek Points-of-I...
research
06/09/2020

ConfNet2Seq: Full Length Answer Generation from Spoken Questions

Conversational and task-oriented dialogue systems aim to interact with t...
research
05/27/2023

Answering Unanswered Questions through Semantic Reformulations in Spoken QA

Spoken Question Answering (QA) is a key feature of voice assistants, usu...
research
08/31/2016

Measuring Machine Intelligence Through Visual Question Answering

As machines have become more intelligent, there has been a renewed inter...
research
06/15/2019

Technical Report: Optimizing Human Involvement for Entity Matching and Consolidation

An end-to-end data integration system requires human feedback in several...
