emrQA: A Large Corpus for Question Answering on Electronic Medical Records

09/03/2018
by Anusri Pampari, et al.
IBM
University of Illinois at Urbana-Champaign

We propose a novel methodology for generating large-scale, domain-specific question answering (QA) datasets by re-purposing existing annotations created for other NLP tasks. We demonstrate one instance of this methodology by generating a large-scale QA dataset for electronic medical records, leveraging existing expert annotations on clinical notes for various NLP tasks from the community-shared i2b2 datasets. The resulting corpus (emrQA) has 1 million question-logical form pairs and 400,000+ question-answer evidence pairs. We characterize the dataset and explore its learning potential by training baseline models for question-to-logical-form and question-to-answer mapping.
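
As a rough illustration of the idea of re-purposing annotations, the sketch below turns one annotation into question, logical form, and answer-evidence entries by filling templates. It is not the authors' pipeline: the `Annotation` fields, the `TEMPLATES` table, the placeholder syntax, and the logical-form notation are all hypothetical and chosen only to make the example self-contained.

```python
from dataclasses import dataclass


@dataclass
class Annotation:
    """Hypothetical stand-in for one expert annotation on a clinical note."""
    relation: str   # relation type, e.g. "treats" (assumed label)
    problem: str    # annotated medical problem
    treatment: str  # annotated treatment linked to the problem
    evidence: str   # note sentence the annotation was drawn from


# Hypothetical templates: each relation maps to (question, logical form) pairs.
# The placeholder syntax and logical-form notation are invented for this sketch.
TEMPLATES = {
    "treats": [
        (
            "What is the patient taking for {problem}?",
            "Treatment(x) treats Problem({problem})",
        ),
        (
            "How is the patient's {problem} being treated?",
            "Treatment(x) treats Problem({problem})",
        ),
    ],
}


def generate_qa(annotation: Annotation) -> list[dict]:
    """Fill every template registered for the annotation's relation type."""
    entries = []
    for question_tmpl, logical_form_tmpl in TEMPLATES.get(annotation.relation, []):
        entries.append(
            {
                "question": question_tmpl.format(problem=annotation.problem),
                "logical_form": logical_form_tmpl.format(problem=annotation.problem),
                "answer": annotation.treatment,
                "evidence": annotation.evidence,
            }
        )
    return entries


if __name__ == "__main__":
    example = Annotation(
        relation="treats",
        problem="hypertension",
        treatment="lisinopril",
        evidence="Patient was started on lisinopril for hypertension.",
    )
    for entry in generate_qa(example):
        print(entry["question"], "|", entry["logical_form"], "|", entry["answer"])
```

Because one annotation can fill several templates, a modest number of annotations can yield many question variants that share a single logical form, which is one plausible reading of how the question-logical form count can exceed the question-answer evidence count.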

06/30/2022

Modern Question Answering Datasets and Benchmarks: A Survey

Question Answering (QA) is one of the most important natural language pr...
05/17/2018

Annotating Electronic Medical Records for Question Answering

Our research is in the relatively unexplored area of question answering ...
04/26/2020

MATINF: A Jointly Labeled Large-Scale Dataset for Classification, Question Answering and Summarization

Recently, large-scale datasets have vastly facilitated the development i...
05/13/2020

Entity-Enriched Neural Models for Clinical Question Answering

We explore state-of-the-art neural models for question answering on elec...
06/06/2022

Learning to Ask Like a Physician

Existing question answering (QA) datasets derived from electronic health...
11/08/2022

Toward a Neural Semantic Parsing System for EHR Question Answering

Clinical semantic parsing (SP) is an important step toward identifying t...
10/30/2020

CliniQG4QA: Generating Diverse Questions for Domain Adaptation of Clinical Question Answering

Clinical question answering (QA) aims to automatically answer questions ...