SkinDistilViT: Lightweight Vision Transformer for Skin Lesion Classification

Skin cancer is a treatable disease if discovered early. We provide a production-specific solution to the skin cancer classification problem that matches human performance in melanoma identification by training a vision transformer on melanoma medical images annotated by experts. Since inference cost, in both time and memory, is important in practice, we employ knowledge distillation to obtain a model that retains 98.33% of the teacher's multi-class accuracy at a fraction of the cost. Memory-wise, our model is 49.60% smaller [...] GPU and 97.96% [...] the transformer and employing a cascading distillation process, we improve the balanced multi-class accuracy of the base model by 2.1% [...] models of various sizes but comparable performance. We provide the code at https://github.com/Longman-Stan/SkinDistilVit.
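For readers unfamiliar with knowledge distillation, the following is a minimal sketch of a common formulation: a weighted sum of the usual cross-entropy on the expert label and a temperature-softened KL divergence between teacher and student predictions. The abstract does not state which loss the authors use, so the function names, the `temperature` and `alpha` parameters, and their default values are illustrative assumptions, not the paper's method.

```python
import math


def log_softmax(logits, temperature=1.0):
    """Numerically stable log-softmax of temperature-scaled logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    log_total = m + math.log(sum(math.exp(z - m) for z in scaled))
    return [z - log_total for z in scaled]


def distillation_loss(student_logits, teacher_logits, label,
                      temperature=2.0, alpha=0.5):
    """Hypothetical distillation objective (assumed, not from the paper).

    alpha weights the hard-label cross-entropy; (1 - alpha) weights the
    KL divergence from the softened teacher to the softened student.
    """
    # Hard term: cross-entropy of the student against the expert label.
    hard = -log_softmax(student_logits)[label]

    # Soft term: KL(teacher || student) on temperature-softened outputs.
    log_t = log_softmax(teacher_logits, temperature)
    log_s = log_softmax(student_logits, temperature)
    soft = sum(math.exp(lt) * (lt - ls) for lt, ls in zip(log_t, log_s))

    # The temperature**2 factor is the conventional gradient rescaling.
    return alpha * hard + (1.0 - alpha) * temperature ** 2 * soft
```

With `alpha=0.0` the loss depends only on how closely the student matches the teacher, and it vanishes when the two sets of logits are identical. In a cascading setup such as the one the abstract mentions, a loss of this kind could in principle be applied to several classification heads, but how the authors combine them is not specified here.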
