Content-aware visual-textual presentation layout aims at arranging spati...
Existing audio analysis methods generally first transform the audio stre...
Video temporal grounding aims to pinpoint a video segment that matches t...
Fine-grained visual categorization (FGVC) aims at recognizing objects fr...
Existing video copy detection methods generally measure video similarity...
Content-based video retrieval aims to find videos from a large video dat...
Cross-media retrieval is to return the results of various media types co...
Fine-grained image classification is to recognize hundreds of subcategor...
Discriminative localization is essential for fine-grained image classifi...
Fine-grained image classification is a challenging task due to the large...
Fine-grained image classification is to recognize hundreds of subcategor...