Recent incremental learning for action recognition usually stores repres...
The task of Human-Object Interaction (HOI) detection targets fine-graine...
Current methods for LIDAR semantic segmentation are not robust enough fo...
Spatial convolutions are widely used in numerous deep video models. It f...
The existence of noisy data is prevalent in both the training and testin...
Current approaches for video grounding propose various complex architec...
The central idea of contrastive learning is to discriminate between diff...
Existing video copy detection methods generally measure video similarity...
Temporal action localization aims to localize starting and ending time w...
The Weakly-Supervised Temporal Action Localization (WS-TAL) task aims to rec...
This technical report presents our solution for temporal action detectio...
This paper presents our solution to the AVA-Kinetics Crossover Challenge...
This technical report analyzes an egocentric video action detection meth...
With the recent surge in the research of vision transformers, they have ...
Content-based video retrieval aims to find videos from a large video dat...