Long-tailed visual recognition has received increasing attention in rece...
Neural architecture search (NAS) proves to be among the effective approa...
The recent detection transformer (DETR) has advanced object detection, b...
Facial Expression Recognition (FER) in the wild is an extremely challeng...
The large pre-trained vision transformers (ViTs) have demonstrated remar...
No-reference image quality assessment (NR-IQA) aims to quantify how huma...
Vision transformer has emerged as a new paradigm in computer vision, sho...
Binary Neural Networks (BNNs) show great promise for real-world embedded...
Humans can continuously learn new knowledge. However, machine learning m...
Optical flow estimation is a classical yet challenging task in computer
...
Current semi-supervised semantic segmentation methods mainly focus on
de...
Few-shot segmentation (FSS) aims to segment novel categories given scarc...
The networks trained on the long-tailed dataset vary remarkably, despite...
Recently, it has attracted more and more attentions to fuse multi-scale
...
Facial expression recognition plays an important role in human-computer
...
Energy-based latent variable models (EBLVMs) are more expressive than
co...
This paper presents a new vision Transformer, named Iwin Transformer, wh...
In this paper, we propose an effective and efficient method for
Human-Ga...
Research on the generalization ability of deep neural networks (DNNs) ha...
Recently, Transformers have shown promising performance in various visio...
Recently, Transformers have shown promising performance in various visio...
A human's attention can intuitively adapt to corrupted areas of an image...
Real-time point cloud processing is fundamental for lots of computer vis...
Facial age estimation is an important yet very challenging problem in
co...
Face image animation from a single image has achieved remarkable progres...
Facial expression recognition (FER) has received increasing interest in
...
The threat of 3D masks to face recognition systems is increasingly serio...
Bicubic downscaling is a prevalent technique used to reduce the video st...
Efficiently modeling spatial-temporal information in videos is crucial f...
Transformers have shown impressive performance in various natural langua...
We present a versatile model, FaceAnime, for various video generation ta...
The objective of this paper is to learn context- and depth-aware feature...
Unmanned Aerial Vehicle (UAV) offers lots of applications in both commer...
Recently, context reasoning using image regions beyond local convolution...
Traditional neural architecture search (NAS) has a significant impact in...
Modern CNN-based object detectors focus on feature configuration during
...
The objective of this paper is self-supervised representation learning, ...
Conventional learning methods simplify the bilinear model by regarding t...
We have witnessed rapid advances in both face presentation attack models...
Regardless of the usage of deep learning and handcrafted methods, the dy...
In the era of multimedia and Internet, people are eager to obtain inform...
Training 1-bit deep convolutional neural networks (DCNNs) is one of the ...
Small object tracking becomes an increasingly important task, which howe...
In last few decades, a lot of progress has been made in the field of fac...
Pedestrian detection has achieved significant progress with the availabi...
Binarized convolutional neural networks (BCNNs) are widely used to impro...
Deep convolutional neural networks (DCNNs) have dominated the recent
dev...
Scene parsing is challenging as it aims to assign one of the semantic
ca...
The ChaLearn large-scale gesture recognition challenge has been run twic...
The multi-domain image-to-image translation is received increasing atten...