Effective image restoration with large-size corruptions, such as blind i...
Existing methods for interactive segmentation in radiance fields entail ...
Recent interactive segmentation methods iteratively take a source image, u...
Although extensive research has been conducted on 3D point cloud segment...
This paper aims to solve the video object segmentation (VOS) task in a s...
People may perform diverse gestures affected by various mental and physi...
Despite the remarkable success of existing methods for few-shot segmenta...
Vision Transformers have achieved remarkable progress, among which Swi...
Few-shot learning (FSL) targets generalization of vision models towar...
The traditional homography estimation pipeline consists of four main ste...
The crux of text-to-image synthesis stems from the difficulty of preserv...
Typical methods for blind image super-resolution (SR) focus on dealing w...
While fine-tuning based methods for few-shot object detection have achie...
Existing few-shot segmentation methods have achieved great progress base...
Most existing methods for few-shot object detection follow the fine-t...
The key challenge of sequence representation learning is to capture the ...
While the research on image background restoration from regular size of ...
Typical text spotters follow the two-stage spotting strategy: detect the...
The crux of long-term tracking lies in the difficulty of tracking the ta...
Few-shot classification aims to adapt classifiers to novel classes with ...
While Transformer has achieved remarkable performance in various high-le...
Typical methods for pedestrian detection focus on either tackling mutual...
Real-world data usually present long-tailed distributions. Training on i...
The crux of single-channel speech separation is how to encode the mixtur...
Most existing methods for image inpainting focus on learning the intra-i...
Generating conversational gestures from speech audio is challenging due ...
Most existing trackers based on deep learning perform tracking in a holi...
The current Siamese network based on region proposal network (RPN) has a...
Restoring the clean background from the superimposed images containing a...
Partially supervised instance segmentation aims to perform learning on l...
Typical methods for text-to-image synthesis seek to design effective gen...
Current massive datasets demand light-weight access for analysis. Discre...
State-of-the-art image captioning methods mostly focus on improving visu...
Person re-identification aims to identify whether pairs of images belong...
Typical methods for supervised sequence modeling are built upon the recu...
Typical techniques for video captioning follow the encoder-decoder frame...
Traditional machine learning models have problems with handling sequence...
The main challenges of age estimation from facial expression videos lie ...
Capturing the temporal dynamics of user preferences over items is import...
Typical techniques for sequence classification are designed for well-seg...
Traditional techniques for measuring similarities between time series ar...
We present a new model for time series classification, called the hidden...