SemMAE: Semantic-Guided Masking for Learning Masked Autoencoders

06/21/2022
by   Gang Li, et al.
0

Recently, significant progress has been made in masked image modeling to catch up to masked language modeling. However, unlike words in NLP, the lack of semantic decomposition of images still makes masked autoencoding (MAE) different between vision and language. In this paper, we explore a potential visual analogue of words, i.e., semantic parts, and we integrate semantic information into the training process of MAE by proposing a Semantic-Guided Masking strategy. Compared to widely adopted random masking, our masking strategy can gradually guide the network to learn various information, i.e., from intra-part patterns to inter-part relations. In particular, we achieve this in two steps. 1) Semantic part learning: we design a self-supervised part learning method to obtain semantic parts by leveraging and refining the multi-head attention of a ViT-based encoder. 2) Semantic-guided MAE (SemMAE) training: we design a masking strategy that varies from masking a portion of patches in each part to masking a portion of (whole) parts in an image. Extensive experiments on various vision tasks show that SemMAE can learn better image representation by integrating semantic information. In particular, SemMAE achieves 84.5 vanilla MAE by 1.4 tasks, SemMAE also brings significant improvements and yields the state-of-the-art performance.

READ FULL TEXT

page 3

page 4

page 8

research
11/28/2022

Good helper is around you: Attention-driven Masked Image Modeling

It has been witnessed that masked image modeling (MIM) has shown a huge ...
research
03/23/2022

What to Hide from Your Students: Attention-Guided Masked Image Modeling

Transformers and masked language modeling are quickly being adopted and ...
research
11/01/2022

RGMIM: Region-Guided Masked Image Modeling for COVID-19 Detection

Self-supervised learning has developed rapidly and also advances compute...
research
04/27/2022

Self-Supervised Learning of Object Parts for Semantic Segmentation

Progress in self-supervised learning has brought strong general image re...
research
02/28/2023

Efficient Masked Autoencoders with Self-Consistency

Inspired by masked language modeling (MLM) in natural language processin...
research
09/03/2021

Information Symmetry Matters: A Modal-Alternating Propagation Network for Few-Shot Learning

Semantic information provides intra-class consistency and inter-class di...
research
02/10/2016

Simple Search Algorithms on Semantic Networks Learned from Language Use

Recent empirical and modeling research has focused on the semantic fluen...

Please sign up or login with your details

Forgot password? Click here to reset