Improving Question Answering with Generation of NQ-like Questions

Question Answering (QA) systems require large amounts of annotated data, which are costly and time-consuming to gather. Converting datasets from existing QA benchmarks is challenging because their formats and complexity differ. To address these issues, we propose an algorithm that automatically generates shorter questions, resembling the day-to-day human communication found in the Natural Questions (NQ) dataset, from longer trivia questions in the Quizbowl (QB) dataset by leveraging style conversion between the two datasets. This provides an automated way to generate more data for our QA systems. To ensure quality as well as quantity of data, we detect and remove ill-formed questions using a neural classifier. We demonstrate that, in a low-resource setting, using the generated data improves QA performance over the baseline system on both NQ and QB data. Our algorithm improves the scalability of training data while maintaining data quality for QA systems.


