Adaptive Cross-Modal Few-Shot Learning

02/19/2019
by Chen Xing, et al.

Metric-based meta-learning techniques have successfully been applied to few-shot classification problems. However, leveraging cross-modal information in a few-shot setting has yet to be explored. When the support from visual information is limited in few-shot image classification, semantic representations (learned from unsupervised text corpora) can provide strong prior knowledge and context to aid learning. Based on this intuition, we design a model that leverages both visual and semantic features in the context of few-shot classification. We propose an adaptive mechanism that effectively combines the two modalities conditioned on categories. Through a series of experiments, we show that our method boosts the performance of metric-based approaches by effectively exploiting language structure. Using this extra modality, our model outperforms current unimodal state-of-the-art methods by a large margin on two important benchmarks: mini-ImageNet and tiered-ImageNet. The improvement in performance is particularly large when the number of shots is small.
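
The abstract describes the adaptive mechanism only at a high level: a category-conditioned way of mixing visual features with semantic label embeddings. Below is a minimal, hypothetical PyTorch sketch of one such mechanism, a learned convex combination of a visual class prototype and a projected word embedding. The module and all names in it (`AdaptiveModalityMixer`, `project`, `gate`) are illustrative assumptions, not the authors' released implementation.

```python
# Hypothetical sketch: adaptively mixing a visual prototype with a semantic
# (word-embedding) prototype via a learned, per-category coefficient.
import torch
import torch.nn as nn

class AdaptiveModalityMixer(nn.Module):
    def __init__(self, visual_dim: int, semantic_dim: int):
        super().__init__()
        # Project label word embeddings into the visual feature space.
        self.project = nn.Linear(semantic_dim, visual_dim)
        # Small network producing a per-category mixing coefficient in (0, 1).
        self.gate = nn.Sequential(
            nn.Linear(visual_dim, visual_dim), nn.ReLU(),
            nn.Linear(visual_dim, 1), nn.Sigmoid(),
        )

    def forward(self, visual_proto: torch.Tensor, word_emb: torch.Tensor) -> torch.Tensor:
        # visual_proto: (num_classes, visual_dim), prototypes from support images
        # word_emb:     (num_classes, semantic_dim), label embeddings from text
        semantic_proto = self.project(word_emb)
        lam = self.gate(semantic_proto)  # (num_classes, 1), conditioned on the category
        # Convex combination: with few shots the gate can lean on semantics.
        return lam * visual_proto + (1.0 - lam) * semantic_proto

# Usage: mix 5-way visual prototypes with 300-d GloVe-style label embeddings.
mixer = AdaptiveModalityMixer(visual_dim=512, semantic_dim=300)
mixed = mixer(torch.randn(5, 512), torch.randn(5, 300))
print(mixed.shape)  # torch.Size([5, 512])
```

The mixed prototypes would then replace the purely visual ones in a metric-based classifier (e.g., nearest-prototype matching), which is consistent with the abstract's claim that semantics help most when visual support is scarce.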
