RadGraph: Extracting Clinical Entities and Relations from Radiology Reports

06/28/2021
by Saahil Jain, et al.

Extracting structured clinical information from free-text radiology reports can make that information usable in a variety of critical healthcare applications. We present RadGraph, a dataset of entities and relations in full-text chest X-ray radiology reports, based on a novel information extraction schema we designed to structure radiology reports. We release a development dataset, which contains board-certified radiologist annotations for 500 radiology reports from the MIMIC-CXR dataset (14,579 entities and 10,889 relations), and a test dataset, which contains two independent sets of board-certified radiologist annotations for 100 radiology reports split equally across the MIMIC-CXR and CheXpert datasets. Using these datasets, we train and test a deep learning model, RadGraph Benchmark, that achieves micro F1 scores of 0.82 and 0.73 on relation extraction on the MIMIC-CXR and CheXpert test sets, respectively. Additionally, we release an inference dataset, which contains annotations automatically generated by RadGraph Benchmark across 220,763 MIMIC-CXR reports (around 6 million entities and 4 million relations) and 500 CheXpert reports (13,783 entities and 9,908 relations), with mappings to the associated chest radiographs. Our freely available dataset can facilitate a wide range of research in medical natural language processing, as well as in computer vision and multi-modal learning when linked to chest radiographs.
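For readers unfamiliar with the metric quoted above, the following is a minimal sketch of how a micro-averaged F1 over extracted relations might be computed. It is not the authors' evaluation code: the `(head, relation_type, tail)` tuple encoding, the relation labels in the toy data, and the function name `micro_f1` are all assumptions made for illustration. Micro averaging pools true positives, false positives, and false negatives across all reports before computing precision and recall, so reports with more relations carry more weight.

```python
from typing import List, Set, Tuple

# Hypothetical encoding of one relation: (head entity, relation type, tail entity).
# The actual RadGraph annotation format may differ.
Relation = Tuple[str, str, str]


def micro_f1(gold_reports: List[Set[Relation]],
             pred_reports: List[Set[Relation]]) -> float:
    """Micro-averaged F1: pool counts over all reports, then compute P, R, F1."""
    if len(gold_reports) != len(pred_reports):
        raise ValueError("gold and predicted report lists must have equal length")

    tp = fp = fn = 0
    for gold, pred in zip(gold_reports, pred_reports):
        tp += len(gold & pred)   # relations found in both sets
        fp += len(pred - gold)   # predicted but not annotated
        fn += len(gold - pred)   # annotated but not predicted

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0.0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Toy data with made-up relation labels, for illustration only.
gold = [
    {("opacity", "located_at", "left lung"), ("opacity", "suggestive_of", "pneumonia")},
    {("effusion", "located_at", "pleural")},
]
pred = [
    {("opacity", "located_at", "left lung")},
    {("effusion", "located_at", "pleural"), ("small", "modify", "effusion")},
]

score = micro_f1(gold, pred)
print(f"micro F1 = {score:.3f}")
```

In this toy case the pooled counts are 2 true positives, 1 false positive, and 1 false negative, so precision and recall are both 2/3 and the micro F1 is 2/3. Exact-match comparison of tuples is a simplifying assumption; a real evaluation would also need to define how entity spans are matched.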


