CNN-based Local Vision Transformer for COVID-19 Diagnosis

07/05/2022
by Hongyan Xu, et al.

Deep learning can assist doctors in identifying COVID-19 infections quickly and accurately. Recently, the Vision Transformer (ViT) has shown great potential for image classification thanks to its global receptive field. However, because ViT lacks the inductive biases inherent to CNNs, ViT-based architectures suffer from limited feature richness and are difficult to train. In this paper, we propose a new structure called Transformer for COVID-19 (COVT) to improve the performance of ViT-based architectures on small COVID-19 datasets. It uses a CNN as a feature extractor to effectively extract local structural information, and introduces average pooling into ViT's Multilayer Perceptron (MLP) module to capture global information. Experiments on two COVID-19 datasets and the ImageNet dataset demonstrate the effectiveness of our method.
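The two ideas named in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not say how the CNN is built or how the pooled features are fused, so the single strided convolution (`conv_patch_features`), the additive fusion inside `mlp_with_avg_pool`, and all shapes and weights below are assumptions chosen only for illustration.

```python
import numpy as np


def gelu(x):
    # Tanh approximation of the GELU activation.
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x**3)))


def conv_patch_features(image, kernels, stride):
    """Hypothetical stand-in for a CNN feature extractor: one strided convolution.

    image:   (H, W) single-channel input
    kernels: (D, k, k) filters
    Returns tokens of shape (N, D), one token per spatial position.
    """
    k = kernels.shape[1]
    height, width = image.shape
    tokens = []
    for i in range(0, height - k + 1, stride):
        for j in range(0, width - k + 1, stride):
            patch = image[i:i + k, j:j + k]
            # Each filter responds to local structure inside the patch.
            tokens.append((kernels * patch).sum(axis=(1, 2)))
    return np.stack(tokens)


def mlp_with_avg_pool(tokens, w1, b1, w2, b2):
    """Token-wise MLP with an average-pooling branch (assumed additive fusion)."""
    hidden = gelu(tokens @ w1 + b1)               # (N, hidden_dim), per token
    pooled = hidden.mean(axis=0, keepdims=True)   # (1, hidden_dim), over all tokens
    hidden = hidden + pooled                      # assumption: add global context
    return hidden @ w2 + b2                       # (N, D)


# Illustrative shapes only; none of these values come from the paper.
rng = np.random.default_rng(0)
image = rng.standard_normal((32, 32))
kernels = rng.standard_normal((16, 4, 4))
w1 = 0.1 * rng.standard_normal((16, 32))
b1 = np.zeros(32)
w2 = 0.1 * rng.standard_normal((32, 16))
b2 = np.zeros(16)

tokens = conv_patch_features(image, kernels, stride=4)
out = mlp_with_avg_pool(tokens, w1, b1, w2, b2)
```

In a plain ViT the MLP acts on each token independently. With the pooling branch in this sketch, every output token also depends on the mean over all tokens, which is one simple way an MLP module could carry global information.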

research
12/10/2021

LCTR: On Awakening the Local Continuity of Transformer for Weakly Supervised Object Localization

Weakly supervised object localization (WSOL) aims to learn object locali...
research
03/12/2021

Vision Transformer for COVID-19 CXR Diagnosis using Chest X-ray Feature Corpus

Under the global COVID-19 crisis, developing robust diagnosis algorithm ...
research
06/02/2022

CVM-Cervix: A Hybrid Cervical Pap-Smear Image Classification Framework Using CNN, Visual Transformer and Multilayer Perceptron

Cervical cancer is the seventh most common cancer among all the cancers ...
research
02/18/2023

Hyneter: Hybrid Network Transformer for Object Detection

In this paper, we point out that the essential differences between CNN-b...
research
06/30/2022

PVT-COV19D: Pyramid Vision Transformer for COVID-19 Diagnosis

With the outbreak of COVID-19, a large number of relevant studies have e...
research
12/27/2021

Vision Transformer for Small-Size Datasets

Recently, the Vision Transformer (ViT), which applied the transformer st...
research
11/20/2022

Real-time Local Feature with Global Visual Information Enhancement

Local feature provides compact and invariant image representation for va...
