3D visual grounding is the task of localizing the object in a 3D scene w...
Despite their simpler information fusion designs compared with Vision
Tr...
With autonomous industries on the rise, domain adaptation of the visual
...
We propose a novel domain adaptive action detection approach and a new
a...
Current methods for spatiotemporal action tube detection often extend a
...
We address the problem of face anti-spoofing which aims to make the face...
We present an approach for encoding visual task relationships to improve...
Humans approach driving in a holistic fashion which entails, in particul...
Training deep networks for semantic segmentation requires large amounts ...
Multi-task networks are commonly utilized to alleviate the need for a la...
In this paper, we propose Two-Stream AMTnet, which leverages recent adva...
Nowadays, the increasingly growing number of mobile and computing device...
In this work, we present a method to predict an entire `action tube' (a ...
Current state-of-the-art methods solve spatiotemporal action localisatio...
We present the new Road Event and Activity Detection (READ) dataset, des...
In this paper, we present an unsupervised learning approach for analyzin...
Current state-of-the-art human action recognition is focused on the
clas...
Current state-of-the-art action detection systems are tailored for offli...
We present a deep-learning framework for real-time multiple spatio-tempo...
In this work, we propose an approach to the spatiotemporal localisation
...
In August 2011, Linux entered its third decade. Ten years before, Chou e...