Automatically Discovering and Learning New Visual Categories with Ranking Statistics

02/13/2020
by   Kai Han, et al.
13

We tackle the problem of discovering novel classes in an image collection given labelled examples of other classes. This setting is similar to semi-supervised learning, but significantly harder because there are no labelled examples for the new classes. The challenge, then, is to leverage the information contained in the labelled images in order to learn a general-purpose clustering model and use the latter to identify the new classes in the unlabelled data. In this work we address this problem by combining three ideas: (1) we suggest that the common approach of bootstrapping an image representation using the labeled data only introduces an unwanted bias, and that this can be avoided by using self-supervised learning to train the representation from scratch on the union of labelled and unlabelled data; (2) we use rank statistics to transfer the model's knowledge of the labelled classes to the problem of clustering the unlabelled images; and, (3) we train the data representation by optimizing a joint objective function on the labelled and unlabelled subsets of the data, improving both the supervised classification of the labelled data, and the clustering of the unlabelled data. We evaluate our approach on standard classification benchmarks and outperform current methods for novel category discovery by a significant margin.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
06/29/2021

AutoNovel: Automatically Discovering and Learning Novel Visual Categories

We tackle the problem of discovering novel classes in an image collectio...
research
07/07/2021

Novel Visual Category Discovery with Dual Ranking Statistics and Mutual Knowledge Distillation

In this paper, we tackle the problem of novel visual category discovery,...
research
07/01/2021

The Spotlight: A General Method for Discovering Systematic Errors in Deep Learning Models

Supervised learning models often make systematic errors on rare subsets ...
research
12/10/2021

The Large Labelled Logo Dataset (L3D): A Multipurpose and Hand-Labelled Continuously Growing Dataset

In this work, we present the Large Labelled Logo Dataset (L3D), a multip...
research
05/21/2019

Semi-Supervised Learning with Scarce Annotations

While semi-supervised learning (SSL) algorithms provide an efficient way...
research
08/26/2019

Learning to Discover Novel Visual Categories via Deep Transfer Clustering

We consider the problem of discovering novel object categories in an ima...
research
04/14/2023

CiPR: An Efficient Framework with Cross-instance Positive Relations for Generalized Category Discovery

We tackle the issue of generalized category discovery (GCD). GCD conside...

Please sign up or login with your details

Forgot password? Click here to reset