Improving Masked Autoencoders by Learning Where to Mask

03/12/2023
by Haijian Chen, et al.

Masked image modeling is a promising self-supervised learning method for visual data. It is typically built upon image patches with random masks, which largely ignores the variation of information density between them. The question is: Is there a better masking strategy than random sampling and how can we learn it? We empirically study this problem and initially find that introducing object-centric priors in mask sampling can significantly improve the learned representations. Inspired by this observation, we present AutoMAE, a fully differentiable framework that uses Gumbel-Softmax to interlink an adversarially-trained mask generator and a mask-guided image modeling process. In this way, our approach can adaptively find patches with higher information density for different images, and further strike a balance between the information gain obtained from image reconstruction and its practical training difficulty. In our experiments, AutoMAE is shown to provide effective pretraining models on standard self-supervised benchmarks and downstream tasks.
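
To make the Gumbel-Softmax step more concrete, here is a minimal forward-pass sketch of sampling a patch mask from per-patch scores. It is an illustration under stated assumptions, not the authors' implementation: the function name `gumbel_softmax_mask`, the top-k discretization, the 0.75 mask ratio, and the temperature `tau` are all hypothetical choices that the abstract does not specify. It uses only the Python standard library, so it carries no gradients; in an autodiff framework the relaxed `soft` values would be what lets gradients reach the mask generator.

```python
import math
import random


def gumbel_softmax_mask(logits, mask_ratio=0.75, tau=1.0, rng=None):
    """Sample a patch mask from per-patch logits (forward pass only).

    logits: list of floats, one hypothetical mask-generator score per patch.
    Returns (soft, hard): `soft` is a relaxed distribution over patches that
    sums to 1; `hard` is a 0/1 list where 1 marks a masked patch.
    """
    rng = rng or random.Random()

    # Perturb each logit with Gumbel(0, 1) noise, g = -log(-log(u)),
    # then divide by the temperature tau.
    noisy = []
    for logit in logits:
        u = max(rng.random(), 1e-12)  # guard against log(0)
        noisy.append((logit - math.log(-math.log(u))) / tau)

    # Numerically stable softmax over the perturbed logits.
    peak = max(noisy)
    exps = [math.exp(v - peak) for v in noisy]
    total = sum(exps)
    soft = [e / total for e in exps]

    # Assumed discretization: mask the top-k patches by relaxed probability.
    num_masked = round(mask_ratio * len(logits))
    ranked = sorted(range(len(logits)), key=lambda i: soft[i], reverse=True)
    hard = [0] * len(logits)
    for i in ranked[:num_masked]:
        hard[i] = 1
    return soft, hard
```

For a 14 x 14 grid of patches with uniform scores, `gumbel_softmax_mask([0.0] * 196)` masks 147 of the 196 patches, and which patches are chosen varies from call to call. Raising the score of a patch makes it more likely to be masked, which is how a learned generator could steer masking toward regions of higher information density.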


