CropMix: Sampling a Rich Input Distribution via Multi-Scale Cropping

by Junlin Han, et al.
The University of Adelaide
Australian National University

We present CropMix, a simple method for producing a rich input distribution from the original dataset distribution. Unlike single random cropping, which may inadvertently capture only limited or irrelevant information, such as pure background or unrelated objects, we crop an image multiple times using distinct crop scales, thereby ensuring that multi-scale information is captured. The new input distribution, which serves as training data for a number of vision tasks, is then formed by simply mixing the multiple cropped views. We first demonstrate that CropMix can be seamlessly applied to virtually any training recipe and neural network architecture for classification tasks. CropMix is shown to improve the performance of image classifiers on several benchmark tasks across the board without sacrificing computational simplicity or efficiency. Moreover, we show that CropMix benefits both contrastive learning and masked image modeling, yielding more powerful representations that achieve better results when transferred to downstream tasks. Code is available at GitHub.
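The crop-then-mix idea described in the abstract can be illustrated with a minimal, dependency-free sketch. This is not the authors' implementation. It assumes a grayscale image stored as a list of rows, one random crop per scale, nearest-neighbour resizing to a common size, and a weighted pixel-wise average as the mixing step; the abstract does not specify the mixing operation, so the averaging is an assumption. The names `random_crop`, `resize_nearest`, and `crop_mix` are hypothetical.

```python
import random


def random_crop(image, scale, rng):
    """Return a random window covering roughly `scale` (0 < scale <= 1) of the image area."""
    height, width = len(image), len(image[0])
    side = scale ** 0.5  # same fraction on both axes keeps the aspect ratio
    crop_h = max(1, round(height * side))
    crop_w = max(1, round(width * side))
    top = rng.randint(0, height - crop_h)
    left = rng.randint(0, width - crop_w)
    return [row[left:left + crop_w] for row in image[top:top + crop_h]]


def resize_nearest(image, out_h, out_w):
    """Resize with nearest-neighbour sampling so all views share one shape."""
    in_h, in_w = len(image), len(image[0])
    return [
        [image[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
        for r in range(out_h)
    ]


def crop_mix(image, scales, out_size, rng, weights=None):
    """Crop once per scale, resize each view, and mix them pixel-wise.

    Assumption: mixing is a weighted average of the views (uniform by default).
    """
    if weights is None:
        weights = [1.0 / len(scales)] * len(scales)
    views = [
        resize_nearest(random_crop(image, s, rng), out_size, out_size)
        for s in scales
    ]
    return [
        [sum(w * v[r][c] for w, v in zip(weights, views)) for c in range(out_size)]
        for r in range(out_size)
    ]


# Usage: mix a small, a medium, and a full-image crop of an 8x8 gradient.
demo_image = [[r * 8 + c for c in range(8)] for r in range(8)]
demo_mixed = crop_mix(demo_image, scales=[0.25, 0.5, 1.0], out_size=4,
                      rng=random.Random(0))
```

Using distinct scales means the mixed input always combines a tight view with a wide one, which is the property the abstract contrasts with a single random crop. In a real training pipeline the crops and the mixing would normally be done with a tensor library rather than nested lists.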



You Only Cut Once: Boosting Data Augmentation with a Single Cut

We present You Only Cut Once (YOCO) for performing data augmentations. Y...

Multi-scale frequency separation network for image deblurring

Image deblurring aims to restore the detailed texture information or str...

Multimodal Masked Autoencoders Learn Transferable Representations

Building scalable models to learn from diverse, multimodal data remains ...

Matryoshka Representations for Adaptive Deployment

Learned representations are a central component in modern ML systems, se...

Improvements to Self-Supervised Representation Learning for Masked Image Modeling

This paper explores improvements to the masked image modeling (MIM) para...

Multi-Scale Representation Learning for Spatial Feature Distributions using Grid Cells

Unsupervised text encoding models have recently fueled substantial progr...

MTNeuro: A Benchmark for Evaluating Representations of Brain Structure Across Multiple Levels of Abstraction

There are multiple scales of abstraction from which we can describe the ...

Code Repositories


Code of CropMix: Sampling a Rich Input Distribution via Multi-Scale Cropping
