Understanding Transfer Learning for Chest Radiograph Clinical Report Generation with Modified Transformer Architectures

05/05/2022
by Edward Vendrow, et al.

The image captioning task is increasingly prevalent in artificial intelligence applications for medicine. One important application is clinical report generation from chest radiographs. The clinical writing of unstructured reports is time-consuming and error-prone. An automated system could improve standardization, reduce errors, save clinicians' time, and broaden access to care. In this paper we demonstrate the importance of domain-specific pre-training and propose a modified transformer architecture for the medical image captioning task. To accomplish this, we train a series of modified transformers to generate clinical reports from chest radiograph input. These modified transformers include: a meshed-memory augmented transformer with a visual extractor using ImageNet pre-trained weights, a meshed-memory augmented transformer with a visual extractor using CheXpert pre-trained weights, and a meshed-memory augmented transformer whose encoder is passed the concatenated embeddings produced by both the ImageNet pre-trained and CheXpert pre-trained extractors. We use BLEU-1 through BLEU-4, ROUGE-L, CIDEr, and the clinical CheXbert F1 score to validate our models, and we demonstrate scores competitive with state-of-the-art models. We provide evidence that ImageNet pre-training is ill-suited to the medical image captioning task, especially for less frequent conditions (e.g., enlarged cardiomediastinum, lung lesion, pneumothorax). Furthermore, we demonstrate that the double feature model improves performance for specific medical conditions (edema, consolidation, pneumothorax, support devices) and improves the overall CheXbert F1 score, and it should be developed further in future work. Such a double feature model, combining ImageNet pre-training with domain-specific pre-training, could be applied to a wide range of image captioning models in medicine.
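As a rough illustration of the double feature idea described above, the sketch below concatenates feature maps from an ImageNet pre-trained backbone and a separately pre-trained (e.g., CheXpert) backbone before passing them as visual tokens to a report-generation encoder. This is a minimal sketch, not the authors' implementation: the ResNet-50 backbones, the checkpoint path, and the token layout are assumptions made for illustration.

```python
import torch
import torch.nn as nn
import torchvision

class DoubleFeatureExtractor(nn.Module):
    """Sketch of a dual-backbone visual extractor: features from an
    ImageNet pre-trained backbone and a CheXpert pre-trained backbone are
    concatenated along the channel dimension before being flattened into
    visual tokens for the transformer encoder."""

    def __init__(self, chexpert_ckpt=None):
        super().__init__()
        # ImageNet pre-trained backbone (classifier head and pooling removed).
        imagenet_model = torchvision.models.resnet50(weights="IMAGENET1K_V1")
        self.imagenet_backbone = nn.Sequential(*list(imagenet_model.children())[:-2])

        # Backbone intended to carry CheXpert pre-trained weights; the
        # checkpoint path is hypothetical and stands in for weights obtained
        # from domain-specific pre-training.
        chexpert_model = torchvision.models.resnet50(weights=None)
        if chexpert_ckpt is not None:
            chexpert_model.load_state_dict(torch.load(chexpert_ckpt), strict=False)
        self.chexpert_backbone = nn.Sequential(*list(chexpert_model.children())[:-2])

    def forward(self, images):
        # Each backbone yields a (B, 2048, H', W') feature map.
        f_img = self.imagenet_backbone(images)
        f_chx = self.chexpert_backbone(images)
        # Concatenate along channels, then flatten spatial positions into a
        # sequence of visual tokens: (B, H'*W', 4096).
        feats = torch.cat([f_img, f_chx], dim=1)
        b, c, h, w = feats.shape
        return feats.view(b, c, h * w).permute(0, 2, 1)

# Example: a batch of two 224x224 radiographs -> 49 visual tokens of dim 4096.
tokens = DoubleFeatureExtractor()(torch.randn(2, 3, 224, 224))
print(tokens.shape)  # torch.Size([2, 49, 4096])
```

The resulting token sequence would then be consumed by the meshed-memory augmented transformer encoder in place of single-backbone features; a projection layer may be needed if the encoder expects a smaller embedding dimension.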
