Vision Transformers (ViTs) have achieved remarkable success in computer
...
Weakly supervised object localization (WSOL) remains challenging when
le...
This work studies how to transform an album to vivid and coherent storie...
Large vision Transformers (ViTs) driven by self-supervised pre-training
...
Visible-infrared person re-identification (VI-ReID) aims to match specif...
Unsupervised domain adaptation person re-identification (Re-ID) aims to
...
Camouflaged objects are seamlessly blended in with their surroundings, w...
Adapting object detectors learned with sufficient supervision to novel
c...
Pedestrian detection in the wild remains a challenging problem especiall...
Pedestrian detection in the wild remains a challenging problem especiall...
In this paper, we present an integral pre-training framework based on ma...
Contrastive self-supervised learning (CSL) based on instance discriminat...
Masked image modeling has demonstrated great potential to eliminate the
...
In this paper, we propose multi-agent automated machine learning (MA2ML)...
Few-shot class-incremental learning (FSCIL) faces challenges of memorizi...
Masked image modeling (MIM) has demonstrated impressive results in
self-...
Recently, masked image modeling (MIM) has offered a new methodology of
s...
Modern object detectors have taken the advantages of pre-trained vision
...
Most of the existing work in one-stage referring expression comprehensio...
The past year has witnessed a rapid development of masked image modeling...
Point-based object localization (POL), which pursues high-performance ob...
Recently, automatic video captioning has attracted increasing attention,...
Semi-supervised object detection (SSOD) has achieved substantial progres...
Bounding-box annotation form has been the most frequently used method fo...
The existing neural architecture search algorithms are mostly working on...
Gating modules have been widely explored in dynamic network pruning to r...
In this paper, we propose a self-supervised visual representation learni...
Recognizing images with long-tailed distributions remains a challenging
...
Exploiting relations among 2D joints plays a crucial role yet remains
se...
A resource-adaptive supernet adjusts its subnets for inference to fit th...
Language instruction plays an essential role in the natural language gro...
Unsupervised person re-identification (re-ID) remains a challenging task...
Conventional gradient descent methods compute the gradients for multiple...
Encouraging progress in few-shot semantic segmentation has been made by
...
Channel pruning and tensor decomposition have received extensive attenti...
Within Convolutional Neural Network (CNN), the convolution operations ar...
Despite the substantial progress of active learning for image recognitio...
Few-shot class-incremental learning (FSCIL), which targets at continuous...
Weakly supervised object localization (WSOL) is a challenging problem wh...
Few-shot object detection has made substantial progressby representing n...
Popular network pruning algorithms reduce redundant information by optim...
With only bounding-box annotations in the spatial domain, existing video...
Conventional networks for object skeleton detection are usually hand-cra...
The 1st Tiny Object Detection (TOD) Challenge aims toencourage research ...
Few-shot segmentation is challenging because objects within the support ...
In this paper, we present a large-scale Diverse Real-world image
Super-R...
Unsupervised domain adaptation (UDA) aims to address the domain-shift pr...
The search cost of neural architecture search (NAS) has been largely red...
In unsupervised feature learning, sample specificity based methods ignor...
Often the best performing deep neural models are ensembles of multiple
b...