A scene graph is a structured semantic representation that can be modeled as...
Knowledge-based visual question answering (QA) aims to answer a question...
We aim to develop an AI agent that can watch video clips and have a conv...
Developing video understanding intelligence is quite challenging because...
Despite recent progress in computer vision and natural language processi...
Conventional sequential learning methods such as Recurrent Neural Networ...
Video understanding is emerging as a new paradigm for studying human-lik...
While conventional methods for sequential learning focus on interaction ...
Goal-oriented dialogue has attracted attention for its numerous applicat...