We propose a novel multimodal video benchmark - the Perception Test - to...
We introduce a challenging decision-making task that we call active
acqu...
Most successful self-supervised learning methods are trained to align th...
We introduce a class of causal video understanding models that aims to
i...
We introduce gvnn, a neural network library in Torch aimed towards bridg...
We are interested in automatic scene understanding from geometric cues. ...