DETR-based object detectors have achieved remarkable performance but are...
Vision-Language (V-L) models trained with contrastive learning to align ...
This paper is on Few-Shot Object Detection (FSOD), where given a few tem...
Despite the impressive progress of self-supervised learning (SSL), its a...
Prompt tuning provides an efficient mechanism to adapt large vision-lang...
This work is on training a generative action/video recognition model who...
This paper tackles the problem of efficient video recognition. In this a...
Learning visual representations through self-supervision is an extremely...
Existing knowledge distillation methods mostly focus on distillation of ...
Self-attention based models such as vision transformers (ViTs) have emer...
Learning an egocentric action recognition model from video data is chall...
This report presents the technical details of our submission to the EPIC...
This paper is on video recognition using Transformers. Very recent attem...
Temporal action localization (TAL) is a fundamental yet challenging task...
Few-shot action recognition aims to recognize action classes with few tr...
Many video analysis tasks require temporal localization thus detection o...
Network binarization is a promising hardware-aware direction for creatin...
Lipreading has witnessed a lot of progress due to the resurgence of neur...
We present the submission of Samsung AI Centre Cambridge to the CVPR2020...
Attentive video modeling is essential for action recognition in unconstr...
This paper shows how to train binary networks to within a few percent po...
This paper addresses the problem of model compression via knowledge dist...
This paper proposes Binary ArchitecTure Search (BATS), a framework that ...
Lip-reading has attracted a lot of research attention lately thanks to a...
Action recognition has seen a dramatic performance improvement in the la...
Automatic continuous time, continuous value assessment of a patient's pa...
Linear regression is a fundamental building block in many face detection...
This paper introduces a novel real-time algorithm for facial landmark tr...