Deep Neural Networks can be easily fooled by small and imperceptible
per...
This work dedicates to continuous sign language recognition (CSLR), whic...
This work focuses on sign language retrieval-a recently proposed task fo...
Sign languages are visual languages which convey information by signers'...
Deep supervision, which involves extra supervisions to the intermediate
...
This paper presents a new framework for open-vocabulary semantic segment...
Masked image modeling (MIM) performs strongly in pre-training large visi...
Image token removal is an efficient augmentation strategy for reducing t...
Sign languages are visual languages using manual articulations and non-m...
In this paper, we are interested in Detection Transformer (DETR), an
end...
Zero-shot learning (ZSL) aims to recognize classes that do not have samp...
Contrastive vision-language models like CLIP have shown great progress i...
Prior works on action representation learning mainly focus on designing
...
Recently, vision-language pre-training shows great potential in
open-voc...
This paper proposes a simple transfer learning baseline for sign languag...
Recently, zero-shot image classification by vision-language pre-training...
Semi-supervised action recognition is a challenging but important task d...
For human action understanding, a popular research direction is to analy...
We introduce MixTraining, a new training paradigm for object detection t...
Due to the limited and even imbalanced data, semi-supervised semantic
se...
The recent progress of CNN has dramatically improved face alignment
perf...
Domain adaptation for semantic segmentation enables to alleviate the nee...
This paper presents an end-to-end semi-supervised object detection appro...
Image-level contrastive representation learning has proven to be highly
...
Cycle consistency is widely used for face editing. However, we observe t...
The Non-Local Network (NLNet) presents a pioneering approach for capturi...
Existing object detection frameworks are usually built on a single forma...
Few-shot learning has recently emerged as a new challenge in the deep
le...
A recent approach for object detection and human pose estimation is to
r...
We consider universal adversarial patches for faces - small visual eleme...
The Non-Local Network (NLNet) presents a pioneering approach for capturi...
This paper presents a review of the 2018 WIDER Challenge on Face and
Ped...