
MoPro: Webly Supervised Learning with Momentum Prototypes

by Junnan Li, et al.

We propose a webly-supervised representation learning method that does not suffer from the annotation unscalability of supervised learning, nor the computation unscalability of self-supervised learning. Most existing works on webly-supervised representation learning adopt a vanilla supervised learning method without accounting for the prevalent noise in the training data, whereas most prior methods for learning with label noise are less effective on real-world large-scale noisy data. We propose momentum prototypes (MoPro), a simple contrastive learning method that achieves online label noise correction, out-of-distribution sample removal, and representation learning. MoPro achieves state-of-the-art performance on WebVision, a weakly-labeled noisy dataset. MoPro also shows superior performance when the pretrained model is transferred to downstream image classification and detection tasks. It outperforms the ImageNet supervised pretrained model by +10.5 on 1-shot classification on VOC, and outperforms the best self-supervised pretrained model by +17.3 when finetuned on 1% of ImageNet labeled samples. Furthermore, MoPro is more robust to distribution shifts. Code and pretrained models are available at
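The core ideas the abstract names can be sketched briefly: each class keeps a prototype that is a momentum-updated moving average of its samples' normalized embeddings, and a sample's web label is kept, corrected, or the sample flagged as out-of-distribution depending on how confidently its embedding matches the prototypes. The class, method names, momentum value, and confidence threshold below are illustrative assumptions, not the paper's exact formulation:

```python
import numpy as np

def normalize(v):
    """L2-normalize a vector so cosine similarity is a dot product."""
    return v / np.linalg.norm(v)

class MomentumPrototypes:
    """Minimal sketch of prototype-based label-noise handling.

    Assumed hyperparameters: `momentum` for the prototype moving average
    and `threshold` for the confidence cutoff; both are illustrative.
    """

    def __init__(self, num_classes, dim, momentum=0.99, threshold=0.8):
        rng = np.random.default_rng(0)
        # start from random unit-norm prototypes, one per class
        self.prototypes = np.stack(
            [normalize(rng.standard_normal(dim)) for _ in range(num_classes)]
        )
        self.m = momentum
        self.tau = threshold

    def update(self, z, label):
        # momentum (moving-average) update of the labeled class's prototype
        p = self.m * self.prototypes[label] + (1 - self.m) * z
        self.prototypes[label] = normalize(p)

    def correct(self, z, label):
        """Return (possibly corrected label, is_out_of_distribution)."""
        sims = self.prototypes @ z                  # cosine similarities
        probs = np.exp(sims) / np.exp(sims).sum()   # softmax over prototypes
        k = int(np.argmax(probs))
        if probs[k] >= self.tau:
            return k, False        # confident match: keep or correct the label
        return label, True         # no confident prototype: treat as OOD
```

In an online training loop, `correct` would be called on each mini-batch embedding to decide whether to trust the noisy web label, and `update` would then refresh the prototype of the accepted class, so noise correction and representation learning proceed together.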




Code Repositories

MoPro: Webly Supervised Learning