Generating Diverse and Consistent QA pairs from Contexts with Information-Maximizing Hierarchical Conditional VAEs

by   Dong Bok Lee, et al.

One of the most crucial challenges in questionanswering (QA) is the scarcity of labeled data,since it is costly to obtain question-answer(QA) pairs for a target text domain with human annotation. An alternative approach totackle the problem is to use automatically generated QA pairs from either the problem context or from large amount of unstructured texts(e.g. Wikipedia). In this work, we propose a hierarchical conditional variational autoencoder(HCVAE) for generating QA pairs given unstructured texts as contexts, while maximizingthe mutual information between generated QApairs to ensure their consistency. We validateourInformation MaximizingHierarchicalConditionalVariationalAutoEncoder (Info-HCVAE) on several benchmark datasets byevaluating the performance of the QA model(BERT-base) using only the generated QApairs (QA-based evaluation) or by using boththe generated and human-labeled pairs (semi-supervised learning) for training, against state-of-the-art baseline models. The results showthat our model obtains impressive performance gains over all baselines on both tasks,using only a fraction of data for training



There are no comments yet.


page 1

page 2

page 3

page 4


Addressing Semantic Drift in Question Generation for Semi-Supervised Question Answering

Text-based Question Generation (QG) aims at generating natural and relev...

CliniQG4QA: Generating Diverse Questions for Domain Adaptation of Clinical Question Answering

Clinical question answering (QA) aims to automatically answer questions ...

Learning to Rank Question Answer Pairs with Bilateral Contrastive Data Augmentation

In this work, we propose a novel and easy-to-apply data augmentation str...

How to Build Robust FAQ Chatbot with Controllable Question Generator?

Many unanswerable adversarial questions fool the question-answer (QA) sy...

QA-Align: Representing Cross-Text Content Overlap by Aligning Question-Answer Propositions

Multi-text applications, such as multi-document summarization, are typic...

On the Generation of Medical Question-Answer Pairs

Question answering (QA) has achieved promising progress recently. Howeve...

Learning to Perturb Word Embeddings for Out-of-distribution QA

QA models based on pretrained language mod-els have achieved remarkable ...

Code Repositories


[ACL 2020] Generating Diverse and Consistent QA pairs from Contexts with Information-Maximizing Hierarchical Conditional VAEs

view repo
This week in AI

Get the week's most popular data science and artificial intelligence research sent straight to your inbox every Saturday.