On the Impact of Speech Recognition Errors in Passage Retrieval for Spoken Question Answering

09/26/2022
by Georgios Sidiropoulos, et al.

Interacting with a speech interface to query a Question Answering (QA) system is becoming increasingly popular. Typically, QA systems rely on passage retrieval to select candidate contexts and on reading comprehension to extract the final answer. While there has been some attention to making the reading comprehension part of QA systems robust to the errors that automatic speech recognition (ASR) models introduce, the passage retrieval part remains unexplored. However, such errors can affect the performance of passage retrieval, leading to inferior end-to-end performance. To address this gap, we augment two existing large-scale passage ranking and open-domain QA datasets with synthetic ASR noise and study the robustness of lexical and dense retrievers against questions with ASR noise. Furthermore, we study the generalizability of data augmentation techniques across different domains, where each domain is a different language dialect or accent. Finally, we create a new dataset with questions voiced by human users and use their transcriptions to show that retrieval performance can degrade further when dealing with natural ASR noise instead of synthetic ASR noise.

research
08/07/2018

ODSQA: Open-domain Spoken Question Answering Dataset

Reading comprehension by machine has been widely studied, but machine co...
research
09/24/2021

SD-QA: Spoken Dialectal Question Answering for the Real World

Question answering (QA) systems are now available through numerous comme...
research
09/04/2023

AVATAR: Robust Voice Search Engine Leveraging Autoregressive Document Retrieval and Contrastive Learning

Voice, as input, has progressively become popular on mobiles and seems t...
research
08/08/2019

Mitigating Noisy Inputs for Question Answering

Natural language processing systems are often downstream of unreliable i...
research
09/23/2021

Can Question Generation Debias Question Answering Models? A Case Study on Question-Context Lexical Overlap

Question answering (QA) models for reading comprehension have been demon...
research
09/03/2023

Generative Data Augmentation using LLMs improves Distributional Robustness in Question Answering

Robustness in Natural Language Processing continues to be a pertinent is...
research
10/21/2020

Knowledge Distillation for Improved Accuracy in Spoken Question Answering

Spoken question answering (SQA) is a challenging task that requires the ...
