HySTER: A Hybrid Spatio-Temporal Event Reasoner

01/17/2021
by Theophile Sautory, et al.

The task of Video Question Answering (VideoQA) consists of answering natural language questions about a video and serves as a proxy for evaluating a model's performance in scene sequence understanding. Most VideoQA methods to date are end-to-end deep learning architectures, which struggle with complex temporal and causal reasoning and provide limited transparency in their reasoning steps. We present HySTER, a Hybrid Spatio-Temporal Event Reasoner that reasons over physical events in videos. Our model combines the strength of deep learning methods in extracting information from video frames with the reasoning capabilities and explainability of symbolic artificial intelligence in an answer set programming framework. We define a method based on general temporal, causal and physics rules that can be transferred across tasks. We apply our model to the CLEVRER dataset and demonstrate state-of-the-art results in question answering accuracy. This work lays the foundations for incorporating inductive logic programming in the field of VideoQA.
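To make the hybrid idea concrete, the following is a minimal illustrative sketch, not the authors' implementation. It assumes a perception module has already produced per-frame object centers, and it replaces the answer set programming layer with plain Python rules for brevity. The `tracks` data, the `CONTACT_DISTANCE` threshold, and the helper names `collision_events` and `moving_after` are all hypothetical.

```python
from itertools import combinations
from math import hypot

# Hypothetical perception output: object id -> (x, y) center per frame.
# In a real pipeline these facts would come from a neural detector/tracker.
tracks = {
    "red_cube": [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (2.0, 0.0)],
    "blue_sphere": [(4.0, 0.0), (3.5, 0.0), (2.5, 0.0), (3.5, 0.0)],
    "green_cylinder": [(0.0, 5.0), (0.0, 5.0), (0.0, 5.0), (0.0, 5.0)],
}

# Assumed contact threshold (scene units); not a value from the paper.
CONTACT_DISTANCE = 0.6


def collision_events(tracks, contact_distance=CONTACT_DISTANCE):
    """Physics-style rule: two objects collide at the first frame in which
    their centers come within contact_distance of each other."""
    events = []
    for a, b in combinations(sorted(tracks), 2):
        in_contact = False
        for frame, ((ax, ay), (bx, by)) in enumerate(zip(tracks[a], tracks[b])):
            close = hypot(ax - bx, ay - by) <= contact_distance
            if close and not in_contact:
                events.append((frame, a, b))
            in_contact = close
    return sorted(events)


def moving_after(tracks, obj, frame):
    """Temporal rule: an object moves after `frame` if its position changes
    between any two consecutive frames from `frame` onward."""
    positions = tracks[obj]
    return any(p != q for p, q in zip(positions[frame:], positions[frame + 1:]))


events = collision_events(tracks)
for frame, a, b in events:
    # A simple descriptive query: which participants keep moving afterwards?
    movers = [o for o in (a, b) if moving_after(tracks, o, frame)]
    print(f"frame {frame}: {a} collides with {b}; still moving: {movers}")
```

Because the rules are explicit, each answer can be traced back to the facts and the rule that produced it, which is the kind of transparency the abstract attributes to the symbolic component. An answer set programming solver would express the same rules declaratively rather than as loops.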


