Not All Instances Contribute Equally: Instance-adaptive Class Representation Learning for Few-Shot Visual Recognition

by Mengya Han, et al.

Few-shot visual recognition refers to recognizing novel visual concepts from a few labeled instances. Many few-shot visual recognition methods adopt the metric-based meta-learning paradigm, comparing the query representation with class representations to predict the category of the query instance. However, current metric-based methods generally treat all instances equally and consequently often obtain biased class representations, because not all instances are equally significant when instance-level representations are summarized into a class-level representation. For example, some instances may contain unrepresentative information, such as too much background or information about unrelated concepts, which skews the results. To address these issues, we propose a novel metric-based meta-learning framework, termed the instance-adaptive class representation learning network (ICRL-Net), for few-shot visual recognition. Specifically, we develop an adaptive instance revaluing network that addresses the biased representation issue when generating the class representation, by learning and assigning adaptive weights to different instances according to their relative significance in the support set of the corresponding class. Additionally, we design an improved bilinear instance representation and incorporate two novel structural losses, i.e., an intra-class instance clustering loss and an inter-class representation distinguishing loss, to further regulate the instance revaluation process and refine the class representation. We conduct extensive experiments on four commonly adopted few-shot benchmarks: miniImageNet, tieredImageNet, CIFAR-FS, and FC100. Comparisons with state-of-the-art approaches demonstrate the superiority of our ICRL-Net.
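The contrast between an equally weighted class representation and an instance-weighted one can be illustrated with a small sketch. This is not the ICRL-Net revaluing network, which is learned. As an assumption for illustration only, each support instance is scored by its cosine similarity to the mean of the other support instances of the same class, and the scores are turned into weights with a softmax. The function names, the temperature value, and the toy 2-D embeddings are all hypothetical.

```python
import math

def mean_vector(vectors):
    # Equal-weight class representation: the plain average of the embeddings.
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na > 0 and nb > 0 else 0.0

def softmax(scores, temperature=1.0):
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp((s - m) / temperature) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def instance_weights(support, temperature=0.1):
    # Hand-crafted stand-in for a learned revaluing network (assumption):
    # an instance that agrees with the rest of its class gets a larger weight.
    if len(support) == 1:
        return [1.0]
    scores = []
    for i, v in enumerate(support):
        others = support[:i] + support[i + 1:]
        scores.append(cosine(v, mean_vector(others)))
    return softmax(scores, temperature)

def weighted_prototype(support, weights):
    # Instance-adaptive class representation: a weighted sum of the embeddings.
    return [sum(w * x for w, x in zip(weights, col)) for col in zip(*support)]

# Toy support set for one class: two consistent instances and one outlier.
support = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
weights = instance_weights(support)
proto = weighted_prototype(support, weights)
uniform = mean_vector(support)
```

In this toy example the outlier receives the smallest weight, so the weighted representation stays closer to the two consistent instances than the plain mean does.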




Depth Guided Adaptive Meta-Fusion Network for Few-shot Video Recognition

Humans can easily recognize actions with only a few examples given, whil...

MGIMN: Multi-Grained Interactive Matching Network for Few-shot Text Classification

Text classification struggles to generalize to unseen classes with very ...

ECKPN: Explicit Class Knowledge Propagation Network for Transductive Few-shot Learning

Recently, the transductive graph-based methods have achieved great succe...

Adaptive Prototypical Networks with Label Words and Joint Representation Learning for Few-Shot Relation Classification

Relation classification (RC) task is one of fundamental tasks of informa...

Query Adaptive Few-Shot Object Detection with Heterogeneous Graph Convolutional Networks

Few-shot object detection (FSOD) aims to detect never-seen objects using...

Contextualizing Multiple Tasks via Learning to Decompose

One single instance could possess multiple portraits and reveal diverse ...

An Enhanced Span-based Decomposition Method for Few-Shot Sequence Labeling

Few-Shot Sequence Labeling (FSSL) is a canonical solution for the taggin...
