Semi-supervised Vision Transformers at Scale

08/11/2022
by Zhaowei Cai, et al.

We study semi-supervised learning (SSL) for vision transformers (ViT), an under-explored topic despite the wide adoption of ViT architectures across different tasks. To tackle this problem, we propose a new SSL pipeline consisting of three stages: un/self-supervised pre-training, then supervised fine-tuning, and finally semi-supervised fine-tuning. At the semi-supervised fine-tuning stage, we adopt an exponential moving average (EMA)-Teacher framework instead of the popular FixMatch, since the former is more stable and delivers higher accuracy for semi-supervised vision transformers. In addition, we propose a probabilistic pseudo mixup mechanism that interpolates unlabeled samples and their pseudo labels for improved regularization, which is important for training ViTs because of their weak inductive bias. Our proposed method, dubbed Semi-ViT, achieves comparable or better performance than its CNN counterparts in the semi-supervised classification setting. Semi-ViT also enjoys the scalability benefits of ViTs: it can be readily scaled up to large models, with accuracy increasing as model size grows. For example, Semi-ViT-Huge achieves an impressive 80% top-1 accuracy on ImageNet using only 1% labels, which is comparable with Inception-v4 using 100% ImageNet labels.
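The two ingredients named in the abstract, an EMA teacher and mixup over pseudo-labeled samples, can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the function names, the momentum of 0.999, the confidence threshold of 0.7, and the Beta(0.8, 0.8) mixing ratio are all assumptions chosen for illustration, and the mixing shown is plain mixup rather than the probabilistic variant the paper proposes.

```python
import random


def ema_update(teacher, student, momentum=0.999):
    """Move every teacher weight a small step toward the student weight.

    Both arguments are dicts mapping a parameter name to a float
    (a stand-in for real tensors). The teacher is updated in place.
    """
    for name, student_value in student.items():
        teacher[name] = momentum * teacher[name] + (1.0 - momentum) * student_value
    return teacher


def pseudo_label(teacher_probs, threshold=0.7):
    """Turn teacher class probabilities into a one-hot pseudo label.

    Returns None when the teacher is not confident enough.
    The threshold value is an assumption, not taken from the paper.
    """
    confidence = max(teacher_probs)
    if confidence < threshold:
        return None
    best = teacher_probs.index(confidence)
    return [1.0 if i == best else 0.0 for i in range(len(teacher_probs))]


def mixup(x_a, y_a, x_b, y_b, lam):
    """Interpolate two samples and their (pseudo) labels with ratio lam."""
    mixed_x = [lam * a + (1.0 - lam) * b for a, b in zip(x_a, x_b)]
    mixed_y = [lam * a + (1.0 - lam) * b for a, b in zip(y_a, y_b)]
    return mixed_x, mixed_y


if __name__ == "__main__":
    rng = random.Random(0)
    lam = rng.betavariate(0.8, 0.8)  # hypothetical mixing distribution

    # Two toy unlabeled "images" with teacher predictions over two classes.
    y_a = pseudo_label([0.9, 0.1])
    y_b = pseudo_label([0.2, 0.8])
    mixed_x, mixed_y = mixup([0.0, 0.0], y_a, [2.0, 2.0], y_b, lam)
    print(mixed_x, mixed_y)

    # After each student optimization step, the teacher follows slowly.
    teacher = {"w": 1.0}
    student = {"w": 0.0}
    print(ema_update(teacher, student))
```

Because the teacher changes slowly, its pseudo labels stay stable from step to step, which fits the stability argument the abstract makes for the EMA-Teacher framework. Mixing pseudo-labeled samples gives the student soft, interpolated targets in place of hard one-hot labels.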

research
06/16/2022

Censer: Curriculum Semi-supervised Learning for Speech Recognition Based on Self-supervised Pre-training

Recent studies have shown that the benefits provided by self-supervised ...
research
11/22/2021

Semi-Supervised Vision Transformers

We study the training of Vision Transformers for semi-supervised image c...
research
09/15/2022

On the Surprising Effectiveness of Transformers in Low-Labeled Video Recognition

Recently vision transformers have been shown to be competitive with conv...
research
01/04/2023

Semi-MAE: Masked Autoencoders for Semi-supervised Vision Transformers

Vision Transformer (ViT) suffers from data scarcity in semi-supervised l...
research
06/26/2021

OffRoadTranSeg: Semi-Supervised Segmentation using Transformers on OffRoad environments

We present OffRoadTranSeg, the first end-to-end framework for semi-super...
research
05/30/2020

Semi-Supervised Fine-Tuning for Deep Learning Models in Remote Sensing Applications

A combinatory approach of two well-known fields: deep learning and semi ...
research
05/16/2023

NightHazeFormer: Single Nighttime Haze Removal Using Prior Query Transformer

Nighttime image dehazing is a challenging task due to the presence of mu...
