Learning to Ask Like a Physician

06/06/2022
by Eric Lehman, et al.

Existing question answering (QA) datasets derived from electronic health records (EHR) are artificially generated and consequently fail to capture realistic physician information needs. We present Discharge Summary Clinical Questions (DiSCQ), a newly curated question dataset composed of 2,000+ questions paired with the snippets of text (triggers) that prompted each question. The questions are generated by medical experts from 100+ MIMIC-III discharge summaries. We analyze this dataset to characterize the types of information sought by medical experts. We also train baseline models for trigger detection and question generation (QG), paired with unsupervised answer retrieval over EHRs. Our baseline model is able to generate high-quality questions in over 62% of cases when prompted with human-selected triggers. We release this dataset (and all code to reproduce baseline model results) to facilitate further research into realistic clinical QA and QG: https://github.com/elehman16/discq.

research
09/03/2018

emrQA: A Large Corpus for Question Answering on Electronic Medical Records

We propose a novel methodology to generate domain-specific large-scale q...
research
08/31/2019

Let's Ask Again: Refine Network for Automatic Question Generation

In this work, we focus on the task of Automatic Question Generation (AQG...
research
05/02/2023

Huatuo-26M, a Large-scale Chinese Medical QA Dataset

In this paper, we release a largest ever medical Question Answering (QA)...
research
09/13/2019

PubMedQA: A Dataset for Biomedical Research Question Answering

We introduce PubMedQA, a novel biomedical question answering (QA) datase...
research
03/01/2023

DIFFQG: Generating Questions to Summarize Factual Changes

Identifying the difference between two versions of the same article is u...
research
09/19/2020

Can questions summarize a corpus? Using question generation for characterizing COVID-19 research

What are the latent questions on some textual data? In this work, we inv...
research
05/21/2023

TheoremQA: A Theorem-driven Question Answering dataset

The recent LLMs like GPT-4 and PaLM-2 have made tremendous progress in s...
