Coreset Sampling from Open-Set for Fine-Grained Self-Supervised Learning

03/20/2023
by   Sungnyun Kim, et al.

Deep learning in general domains has been steadily extended to domain-specific tasks that require recognizing fine-grained characteristics. However, real-world applications of fine-grained tasks face two challenges: a high reliance on expert knowledge for annotation, and the need for a versatile model that serves various downstream tasks in a specific domain (e.g., prediction of categories, bounding boxes, or pixel-wise annotations). Fortunately, recent self-supervised learning (SSL) is a promising approach for pretraining a model without annotations, providing an effective initialization for any downstream task. Because SSL does not rely on annotations, it generally uses a large-scale unlabeled dataset, referred to as an open-set. In this sense, we introduce a novel Open-Set Self-Supervised Learning problem under the assumption that a large-scale unlabeled open-set, as well as the fine-grained target dataset, is available during the pretraining phase. In our problem setup, it is crucial to consider the distribution mismatch between the open-set and the target dataset. Hence, we propose the SimCore algorithm to sample a coreset, the subset of an open-set that has the minimum distance to the target dataset in the latent space. We demonstrate that SimCore significantly improves representation learning performance across extensive experimental settings, including eleven fine-grained datasets and seven open-sets in various downstream tasks.
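The idea of picking open-set samples that lie close to the target dataset in a latent space can be illustrated with a minimal sketch. This is not the SimCore algorithm itself, whose selection procedure the abstract does not specify. It is a simplified stand-in that scores each open-set embedding by its highest cosine similarity to any target embedding and keeps the top `budget` samples. The function names, the fixed `budget` parameter, the choice of cosine similarity, and the toy two-dimensional embeddings are all assumptions made for illustration.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def select_coreset(open_set_embeddings, target_embeddings, budget):
    """Hypothetical helper: return indices of the `budget` open-set samples
    that are most similar to their nearest target sample.

    Assumes both inputs are non-empty lists of embedding vectors produced
    by some encoder (not shown here).
    """
    scores = []
    for index, candidate in enumerate(open_set_embeddings):
        # Similarity to the closest target sample in the latent space.
        best = max(cosine_similarity(candidate, t) for t in target_embeddings)
        scores.append((best, index))
    # Highest similarity first; ties broken by the lower index.
    scores.sort(key=lambda pair: (-pair[0], pair[1]))
    return sorted(index for _, index in scores[:budget])


# Toy embeddings (illustrative values only).
target = [[1.0, 0.0], [0.9, 0.1]]
open_set = [[1.0, 0.05], [0.0, 1.0], [-1.0, 0.0], [0.8, 0.2]]

coreset_indices = select_coreset(open_set, target, budget=2)
print(coreset_indices)
```

In this toy setup the two open-set vectors pointing in roughly the same direction as the target vectors are selected, while the orthogonal and opposite ones are left out. A fixed budget is the simplest possible stopping rule and is used here only to keep the sketch short.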


