Text-Video Retrieval plays an important role in multi-modal understandin...
Generalized Zero-Shot Learning (GZSL) aims to recognize new categories w...
Cross-modal correlation provides an inherent supervision for video
unsup...
Generalized Zero-Shot Learning (GZSL) targets recognizing new categories...
Transductive Zero-shot learning (ZSL) targets to recognize the unseen
ca...
Bilinear pooling achieves great success in fine-grained visual recogniti...
Recent methods focus on learning a unified semantic-aligned visual
repre...
Zero-Shot Learning (ZSL) seeks to recognize a sample from either seen or...
Learning-based methods suffer from limited clean annotations, especially...