The world is filled with articulated objects that are difficult to deter...
Spatio-temporal scene-graph approaches to video-based reasoning tasks su...
Recent advances in generative adversarial networks (GANs) have led to re...
In previous work, we have proposed the Audio-Visual Scene-Aware Dialog (...
In this paper, we present InSeGAN, an unsupervised 3D generative adversa...
Modern face alignment methods have become quite accurate at predicting t...
Generating video descriptions automatically is a challenging task that i...
This paper introduces the Eighth Dialog System Technology Challenge. In ...
We introduce the task of scene-aware dialog. Given a follow-up question ...
This paper introduces the Seventh Dialog System Technology Challenges (D...
Dialog systems need to understand dynamic visual scenes in order to have...
Scene-aware dialog systems will be able to have conversations with users...
In recent years, it is common practice to extract fully-connected layer ...
Currently successful methods for video description are based on encoder-...
Face alignment, which is the task of finding the locations of a set of f...