Delving Deeper into Data Scaling in Masked Image Modeling

05/24/2023
by Cheng-Ze Lu et al.

Understanding whether self-supervised learning methods can scale with unlimited data is crucial for training large-scale models. In this work, we conduct an empirical study of the scaling capability of masked image modeling (MIM) methods (e.g., MAE) for visual recognition. Unlike most previous works, which rely on the widely used ImageNet dataset (manually curated and object-centric), we investigate this problem in a more practical setting. Specifically, we use the web-collected COYO-700M dataset: we randomly sample varying numbers of training images from it and construct a series of sub-datasets containing 0.5M, 1M, 5M, 10M, and 100M images for pre-training. Our goal is to investigate how downstream performance changes as data and model sizes scale. The study reveals that: 1) MIM is an effective way to improve model capacity when the training data is relatively small in scale; 2) strong reconstruction targets endow models with greater capacity on downstream tasks; 3) MIM pre-training is data-agnostic in most scenarios, meaning the strategy for sampling pre-training data is non-critical. We hope these observations provide valuable insights for future research on MIM.
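Below is a minimal sketch (not the authors' code) of the two ingredients the abstract describes: drawing fixed-size random subsets from a large web-collected image list, and the MAE-style random patch masking used in MIM pre-training. Names such as `coyo_image_paths` and the 75% mask ratio are illustrative assumptions, not details taken from the paper.

```python
import numpy as np
import torch

def sample_subset(coyo_image_paths, subset_size, seed=0):
    """Uniformly sample `subset_size` image paths without replacement."""
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(coyo_image_paths), size=subset_size, replace=False)
    return [coyo_image_paths[i] for i in idx]

def random_masking(patch_tokens, mask_ratio=0.75):
    """MAE-style masking: keep a random subset of patch tokens per image.

    patch_tokens: (batch, num_patches, dim) tensor of embedded patches.
    Returns the visible tokens and a binary mask (1 = masked) for the loss.
    """
    b, n, d = patch_tokens.shape
    num_keep = int(n * (1 - mask_ratio))
    noise = torch.rand(b, n, device=patch_tokens.device)  # per-patch random score
    ids_shuffle = torch.argsort(noise, dim=1)              # random permutation
    ids_keep = ids_shuffle[:, :num_keep]
    visible = torch.gather(
        patch_tokens, 1, ids_keep.unsqueeze(-1).expand(-1, -1, d))
    mask = torch.ones(b, n, device=patch_tokens.device)
    mask.scatter_(1, ids_keep, 0)                           # 0 = visible patch
    return visible, mask

# Example: build the sub-dataset sizes studied in the paper
# (0.5M, 1M, 5M, 10M, and 100M images), assuming `all_paths` lists the
# full web-collected dataset.
# subset_sizes = [500_000, 1_000_000, 5_000_000, 10_000_000, 100_000_000]
# subsets = {s: sample_subset(all_paths, s, seed=s) for s in subset_sizes}
```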
