emrQA: A Large Corpus for Question Answering on Electronic Medical Records

09/03/2018
by Anusri Pampari, et al.

We propose a novel methodology for generating large-scale, domain-specific question answering (QA) datasets by re-purposing existing annotations created for other NLP tasks. We demonstrate this methodology by generating a large-scale QA dataset for electronic medical records, leveraging existing expert annotations on clinical notes for various NLP tasks from the community-shared i2b2 datasets. The resulting corpus (emrQA) has 1 million question-logical form pairs and 400,000+ question-answer evidence pairs. We characterize the dataset and explore its learning potential by training baseline models for question-to-logical-form and question-to-answer mapping.

