Dialogue systems are frequently updated to accommodate new services, but...
There is growing interest in searching for information from large video
...
Recent years have seen an increasing trend in the volume of personal med...
People capture photos and videos to relive and share memories of persona...
As task-oriented dialog systems are becoming increasingly popular in our...
While deep learning through empirical risk minimization (ERM) has succee...
We present a new corpus for the Situated and Interactive Multimodal
Conv...
A video-grounded dialogue system is required to understand both dialogue...
This paper introduces the Ninth Dialog System Technology Challenge (DSTC...
Next generation virtual assistants are envisioned to handle multimodal i...
Several recent works have found the emergence of grounded compositional
...
Visual Dialog is a multimodal task of answering a sequence of questions
...
Visual dialog entails answering a series of questions grounded in an ima...
A number of recent works have proposed techniques for end-to-end learnin...
We introduce the first goal-driven training for visual question answerin...
In this paper, we study the problem of designing objective functions for...
We introduce the task of Visual Dialog, which requires an AI agent to ho...
We propose a model to learn visually grounded word embeddings (vis-w2v) ...