Massively Scaling Heteroscedastic Classifiers

01/30/2023
by Mark Collier, et al.

Heteroscedastic classifiers, which learn a multivariate Gaussian distribution over prediction logits, have been shown to perform well on image classification problems with hundreds to thousands of classes. However, compared to standard classifiers, they introduce extra parameters that scale linearly with the number of classes, which makes them infeasible to apply to larger-scale problems. In addition, heteroscedastic classifiers introduce a critical temperature hyperparameter that must be tuned. We propose HET-XL, a heteroscedastic classifier whose additional parameter count, relative to a standard classifier, scales independently of the number of classes. In our large-scale settings, we show that the temperature hyperparameter no longer needs to be tuned, because it can be learned directly on the training data. On large image classification datasets with up to 4B images and 30k classes, our method requires 14X fewer additional parameters, does not require tuning the temperature on a held-out set, and performs consistently better than the baseline heteroscedastic classifier. HET-XL also improves ImageNet zero-shot classification in a multimodal contrastive learning setup, which can be viewed as a 3.5 billion class classification problem.
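
To make the role of the Gaussian over logits and the temperature concrete, the following is a minimal sketch of how a heteroscedastic classifier might form its prediction: sample noisy logits, divide by a temperature, apply softmax, and average. It is not the HET-XL implementation. The function names, the diagonal covariance, and the fixed `temperature` argument are assumptions made for brevity; in the setting described above, the temperature would be a parameter learned on the training data rather than a constant passed in.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def heteroscedastic_predict(mean_logits, noise_std, temperature=1.0,
                            num_samples=1000, seed=0):
    """Monte Carlo estimate of E[softmax((mu + eps) / tau)].

    Assumptions (illustrative, not taken from the paper):
      - eps ~ N(0, diag(noise_std^2)), i.e. a diagonal covariance;
      - mean_logits and noise_std would normally be network outputs
        for one input, here they are plain lists;
      - temperature (tau) is passed in rather than learned.
    """
    rng = random.Random(seed)
    probs = [0.0] * len(mean_logits)
    for _ in range(num_samples):
        # Draw one noisy logit vector and scale it by the temperature.
        noisy = [(mu + rng.gauss(0.0, s)) / temperature
                 for mu, s in zip(mean_logits, noise_std)]
        # Accumulate the running average of the softmax outputs.
        for k, p in enumerate(softmax(noisy)):
            probs[k] += p / num_samples
    return probs


if __name__ == "__main__":
    mu = [2.0, 0.0, -1.0]
    print(heteroscedastic_predict(mu, [0.0, 0.0, 0.0]))  # no noise
    print(heteroscedastic_predict(mu, [1.0, 1.0, 1.0], temperature=5.0))
```

With zero noise and a temperature of 1 the estimate reduces to an ordinary softmax. Increasing the temperature flattens the predicted distribution, which is why its value matters for a classifier of this kind.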

