Distilling Self-Supervised Vision Transformers for Weakly-Supervised Few-Shot Classification Segmentation

07/07/2023
by Dahyun Kang, et al.

We address the task of weakly-supervised few-shot image classification and segmentation by leveraging a Vision Transformer (ViT) pretrained with self-supervision. Our method takes token representations from the self-supervised ViT and exploits their correlations, via self-attention, to produce classification and segmentation predictions through separate task heads. The model learns to perform both classification and segmentation without pixel-level labels during training, using only image-level labels. To do this, it uses attention maps, created from tokens generated by the self-supervised ViT backbone, as pixel-level pseudo-labels. We also explore a practical setup with “mixed” supervision, where a small number of training images have ground-truth pixel-level labels and the remaining images have only image-level labels. For this mixed setup, we propose to improve the pseudo-labels using a pseudo-label enhancer trained on the available ground-truth pixel-level labels. Experiments on Pascal-5^i and COCO-20^i demonstrate significant performance gains in a variety of supervision settings, in particular when few or no pixel-level labels are available.
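
The idea of turning token correlations into pixel-level pseudo-labels can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes cosine similarity between a reference token and each patch token as a stand-in for attention scores, min-max normalization, and a fixed threshold. The names `pseudo_mask`, `ref_token`, `patch_tokens`, and `threshold` are hypothetical, and the toy 2-D tokens replace real ViT features.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + 1e-8)


def pseudo_mask(ref_token, patch_tokens, grid_hw, threshold=0.5):
    """Build a binary pseudo-label grid from token correlations.

    Assumption: similarity to `ref_token` plays the role of an
    attention score; the paper's actual attention maps may differ.
    """
    h, w = grid_hw
    if len(patch_tokens) != h * w:
        raise ValueError("number of patch tokens must equal h * w")

    # Correlation of every patch token with the reference token.
    scores = [cosine(ref_token, t) for t in patch_tokens]

    # Min-max normalize to [0, 1] so the threshold is scale-free.
    lo, hi = min(scores), max(scores)
    span = hi - lo
    norm = [(s - lo) / span if span > 0 else 0.0 for s in scores]

    # Threshold, then reshape the flat list into an h x w grid.
    flat = [1 if s >= threshold else 0 for s in norm]
    return [flat[r * w:(r + 1) * w] for r in range(h)]


# Toy example: a 2x2 patch grid with 2-D "token" features.
patch_tokens = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
ref_token = [1.0, 0.0]
mask = pseudo_mask(ref_token, patch_tokens, (2, 2))
print(mask)
```

In this toy case the top row of patches aligns with the reference token and is marked foreground, while the bottom row is marked background. A mask of this kind would then act as the pixel-level training target in place of a ground-truth annotation.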


