Vision Transformer for Efficient Chest X-ray and Gastrointestinal Image Classification

04/23/2023
by   Smriti Regmi, et al.

Medical image analysis is an active research area because of its usefulness in clinical applications such as early disease diagnosis and treatment. Convolutional neural networks (CNNs) have become the de facto standard for medical image analysis because they can learn complex features from the available datasets, which lets them surpass humans in many image-understanding tasks. Transformer architectures have also gained popularity for medical image analysis. Despite this progress, there is still room for improvement. This study applies several CNN and transformer-based methods with a wide range of data augmentation techniques. We evaluated them on three medical image datasets from different modalities and compared the vision transformer (ViT) model with other state-of-the-art (SOTA) pre-trained CNNs. On the Chest X-ray dataset, our ViT model achieved the highest F1 score of 0.9532, recall of 0.9533, Matthews correlation coefficient (MCC) of 0.9259, and ROC-AUC score of 0.97. Similarly, on the Kvasir dataset, we achieved an F1 score of 0.9436, recall of 0.9437, MCC of 0.9360, and ROC-AUC score of 0.97. On Kvasir-Capsule, a large-scale video capsule endoscopy (VCE) dataset, our ViT model achieved a weighted F1 score of 0.7156, recall of 0.7182, MCC of 0.3705, and ROC-AUC score of 0.57. We found that our transformer-based models were more effective than various CNN models at classifying different anatomical structures, findings, and abnormalities. Our model improved over the CNN-based approaches, which suggests that it could serve as a new benchmark for algorithm development.
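The abstract reports a weighted F1 score, recall, and MCC for each dataset. As a rough illustration of how these three metrics are defined, the sketch below computes them in plain Python on a small, invented, imbalanced label set. The function names (`weighted_recall_f1`, `multiclass_mcc`) and the toy labels are assumptions made for this example; this is not the authors' evaluation code, and the numbers it produces are unrelated to the paper's results.

```python
from collections import Counter
from math import sqrt


def weighted_recall_f1(y_true, y_pred):
    """Return (weighted recall, weighted F1); each class is weighted by its support."""
    labels = sorted(set(y_true) | set(y_pred))
    support = Counter(y_true)
    n = len(y_true)
    recall_sum = 0.0
    f1_sum = 0.0
    for c in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        weight = support[c] / n
        recall_sum += weight * recall
        f1_sum += weight * f1
    return recall_sum, f1_sum


def multiclass_mcc(y_true, y_pred):
    """Matthews correlation coefficient in its multiclass form."""
    n = len(y_true)
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    labels = set(true_counts) | set(pred_counts)
    cov_tp = correct * n - sum(true_counts[k] * pred_counts[k] for k in labels)
    cov_tt = n * n - sum(v * v for v in true_counts.values())
    cov_pp = n * n - sum(v * v for v in pred_counts.values())
    denom = sqrt(cov_tt * cov_pp)
    return cov_tp / denom if denom else 0.0


# Toy, invented labels: 90 "normal" and 10 "abnormal" images.
y_true = ["normal"] * 90 + ["abnormal"] * 10
# Hypothetical classifier: finds most normals but misses 8 of 10 abnormals.
y_pred = (["normal"] * 88 + ["abnormal"] * 2      # predictions for the 90 normals
          + ["normal"] * 8 + ["abnormal"] * 2)    # predictions for the 10 abnormals

w_recall, w_f1 = weighted_recall_f1(y_true, y_pred)
mcc = multiclass_mcc(y_true, y_pred)
print(f"weighted recall={w_recall:.4f}  weighted F1={w_f1:.4f}  MCC={mcc:.4f}")
```

On this toy data the weighted F1 comes out near 0.88 while the MCC is only about 0.27, because the support weighting lets the majority class dominate the F1 score. That is one possible reading of why a weighted F1 can sit well above the MCC on an imbalanced dataset; the abstract itself does not state the cause of the gap reported for Kvasir-Capsule.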


