AViT: Adapting Vision Transformers for Small Skin Lesion Segmentation Datasets

by Siyi Du, et al.

Skin lesion segmentation (SLS) plays an important role in skin lesion analysis. Vision transformers (ViTs) are considered a promising solution for SLS, but they require more training data than convolutional neural networks (CNNs) because of their parameter-heavy structure and their lack of some inductive biases. To alleviate this issue, current approaches fine-tune pre-trained ViT backbones on SLS datasets, aiming to leverage the knowledge learned from a larger set of natural images to reduce the amount of skin training data needed. However, fully fine-tuning all parameters of large backbones is computationally expensive and memory intensive. In this paper, we propose AViT, a novel efficient strategy that mitigates ViTs' data-hunger by transferring any pre-trained ViT to the SLS task. Specifically, we integrate lightweight modules (adapters) within the transformer layers, which modulate the feature representation of a ViT without updating its pre-trained weights. In addition, we employ a shallow CNN as a prompt generator to create a prompt embedding from the input image, which captures fine-grained information and a CNN's inductive biases to guide the segmentation task on small datasets. Our quantitative experiments on four skin lesion datasets demonstrate that AViT achieves competitive, and at times superior, performance to SOTA but with significantly fewer trainable parameters. Our code is available at https://github.com/siyi-wind/AViT.
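
To make the "fewer trainable parameters" argument concrete, the following back-of-the-envelope sketch counts the parameters of a frozen ViT encoder against those of bottleneck adapters inserted into each layer. All sizes are assumptions chosen for illustration (a ViT-B-like depth of 12 and width of 768, a bottleneck width of 64, two adapters per layer); they are not taken from the AViT paper, and the prompt generator and segmentation decoder are left out of the count.

```python
# Hypothetical sizes for illustration only; not the configuration used by AViT.

def linear_params(n_in, n_out):
    """Weights plus biases of one fully connected layer."""
    return n_in * n_out + n_out

def transformer_layer_params(d, mlp_ratio=4):
    """Parameters in one standard ViT encoder layer of width d."""
    attention = 4 * linear_params(d, d)  # Q, K, V and output projections
    mlp = linear_params(d, mlp_ratio * d) + linear_params(mlp_ratio * d, d)
    layer_norms = 2 * 2 * d              # two LayerNorms, scale and shift each
    return attention + mlp + layer_norms

def adapter_params(d, r):
    """Bottleneck adapter: down-projection d -> r, then up-projection r -> d."""
    return linear_params(d, r) + linear_params(r, d)

depth, d, r = 12, 768, 64   # ViT-B-like depth and width; bottleneck r is assumed
adapters_per_layer = 2      # assumed placement: after attention and after the MLP

# Backbone weights stay frozen; only the adapters would receive gradients.
frozen = depth * transformer_layer_params(d)
trainable = depth * adapters_per_layer * adapter_params(d, r)

print(f"frozen backbone layers: {frozen:,}")
print(f"trainable adapters:     {trainable:,}")
print(f"trainable / frozen:     {trainable / frozen:.1%}")
```

Under these assumed sizes the adapters amount to a few percent of the encoder's weights, which is the kind of saving the abstract refers to; the exact figures for AViT depend on its actual adapter design and are reported in the paper.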


Large-kernel Attention for Efficient and Robust Brain Lesion Segmentation

Vision transformers are effective deep learning models for vision tasks,...

MDViT: Multi-domain Vision Transformer for Small Medical Image Segmentation Datasets

Despite its clinical utility, medical image segmentation (MIS) remains a...

Investigating and Exploiting Image Resolution for Transfer Learning-based Skin Lesion Classification

Skin cancer is among the most common cancer types. Dermoscopic image ana...

Remote Sensing Change Detection With Transformers Trained from Scratch

Current transformer-based change detection (CD) approaches either employ...

EPVT: Environment-aware Prompt Vision Transformer for Domain Generalization in Skin Lesion Recognition

Skin lesion recognition using deep learning has made remarkable progress...

Boundary-aware Transformers for Skin Lesion Segmentation

Skin lesion segmentation from dermoscopy images is of great importance f...

Fine-tuning of explainable CNNs for skin lesion classification based on dermatologists' feedback towards increasing trust

In this paper, we propose a CNN fine-tuning method which enables users t...
