MURA Dataset: Towards Radiologist-Level Abnormality Detection in Musculoskeletal Radiographs

by Pranav Rajpurkar, et al.
Stanford University

We introduce MURA, a large dataset of musculoskeletal radiographs containing 40,895 images from 14,982 studies, where each study is manually labeled by radiologists as either normal or abnormal. On this dataset, we train a 169-layer densely connected convolutional network to detect and localize abnormalities. To evaluate our model robustly and to get an estimate of radiologist performance, we collect additional labels from board-certified Stanford radiologists on the test set, consisting of 209 musculoskeletal studies. We compare our model and radiologists on the Cohen's kappa statistic, which expresses the agreement of our model and of each radiologist with the gold standard, defined as the majority vote of a disjoint group of radiologists. We find that our model achieves performance comparable to that of radiologists. Model performance is higher than the best radiologist performance in detecting abnormalities on finger studies and equivalent on wrist studies. However, model performance is lower than the best radiologist performance in detecting abnormalities on elbow, forearm, hand, humerus, and shoulder studies, indicating that the task is a good challenge for future research. To encourage advances, we have made our dataset freely available at
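The evaluation described above (a majority-vote gold standard, then Cohen's kappa between that gold standard and each rater or the model) can be illustrated with a minimal sketch. This is not the authors' evaluation code: the function names `majority_vote` and `cohens_kappa` and the toy labels are hypothetical, and the sketch assumes binary study-level labels (1 = abnormal, 0 = normal) and an odd number of gold-standard raters so that ties cannot occur.

```python
from collections import Counter


def majority_vote(labels_per_rater):
    """Per-study majority label across raters.

    labels_per_rater: list of equal-length label lists, one per rater.
    Assumes an odd number of raters so a binary vote cannot tie.
    """
    n_studies = len(labels_per_rater[0])
    gold = []
    for i in range(n_studies):
        votes = Counter(rater[i] for rater in labels_per_rater)
        gold.append(votes.most_common(1)[0][0])
    return gold


def cohens_kappa(a, b):
    """Cohen's kappa between two label sequences of equal length."""
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("label sequences must be non-empty and equal in length")
    n = len(a)
    # Observed agreement: fraction of studies where both labelers agree.
    p_o = sum(x == y for x, y in zip(a, b)) / n
    # Expected chance agreement from each labeler's marginal label frequencies.
    count_a, count_b = Counter(a), Counter(b)
    labels = set(count_a) | set(count_b)
    p_e = sum(count_a[k] * count_b[k] for k in labels) / (n * n)
    if p_e == 1.0:
        # Both labelers used one identical label everywhere; treat as full agreement.
        return 1.0
    return (p_o - p_e) / (1 - p_e)


# Toy data (hypothetical, not from MURA): three gold-standard raters, four studies.
gold_raters = [
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [1, 1, 0, 1],
]
gold = majority_vote(gold_raters)      # [1, 1, 0, 0]
model_predictions = [1, 0, 0, 0]       # hypothetical model output
kappa = cohens_kappa(model_predictions, gold)
print(f"kappa = {kappa:.2f}")          # prints "kappa = 0.50"
```

Kappa corrects raw agreement for the agreement expected by chance, so a labeler who always predicts the majority class scores near 0 rather than near the class prior.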




CheXpert: A Large Chest Radiograph Dataset with Uncertainty Labels and Expert Comparison

Large, labeled datasets have driven deep learning methods to achieve exp...

COUGH: A Challenge Dataset and Models for COVID-19 FAQ Retrieval

We present a large challenging dataset, COUGH, for COVID-19 FAQ retrieva...

VinDr-SpineXR: A deep learning framework for spinal lesions detection and classification from radiographs

Radiographs are used as the most important imaging tool for identifying ...

Cardiologist-Level Arrhythmia Detection with Convolutional Neural Networks

We develop an algorithm which exceeds the performance of board certified...

Evaluating NLP Models via Contrast Sets

Standard test sets for supervised learning evaluate in-distribution gene...

Code Repositories


An implementation of MURA Dataset Towards Radiologist-Level Abnormality Detection in Musculoskeletal Radiographs



Implementation of DenseNet model on MURA dataset using PyTorch

