We propose Encyclopedic-VQA, a large-scale visual question answering (VQ...
Videos depict the change of complex dynamical systems over time in the f...
Coreference resolution aims at identifying words and phrases which refer...
In this paper, we present the details of Women in Computer Vision Worksh...
In this paper, we present the details of Women in Computer Vision Worksh...
Acquiring accurate labels on large-scale datasets is both time-consuming...
Scene graph generation (SGG) aims to capture a wide variety of interacti...
Automatically generating natural language descriptions from an image is ...
In particular, the lack of sufficient amounts of domain-specific data ca...
Socially intelligent agents are of growing interest in artificial intell...
People naturally understand the emotions of, and often also empathize wit...