Multi-Label Image Recognition (MLIR) is a challenging task that aims to
...
Autonomous agents empowered by Large Language Models (LLMs) have undergo...
Image classification is a longstanding problem in computer vision and ma...
Semi-Supervised image classification is one of the most fundamental prob...
Virtual try-on is a critical image synthesis task that aims to transfer
...
Software engineering is a domain characterized by intricate decision-mak...
Molecular property prediction has gained significant attention due to it...
Synthesizing high-fidelity head avatars is a central problem for compute...
We investigate the potential of GPT-4~\cite{gpt4} to perform Neural
Arch...
Synthetic data has emerged as a promising source for 3D human research a...
We propose a robust method for learning neural implicit functions that c...
Tried-and-true flapping wing robot simulation is essential in developing...
Recent advances in modeling 3D objects mostly rely on synthetic datasets...
Implicit regularization is an important way to interpret neural networks...
Federated learning (FL) allows multiple clients to cooperatively train mode...
Data generated at the network edge can be processed locally by leveragin...
Recent deep learning is superior in providing high-quality images and
ul...
In this work, we propose a Physics-Informed Deep Diffusion magnetic reso...
Semantic segmentation is an important and prevalent task, but severely
s...
Poisson surface reconstruction (PSR) remains a popular technique for
rec...
This paper investigates the task of 2D whole-body human pose estimation,...
Human pose estimation aims to accurately estimate a wide variety of huma...
Estimating 3D interacting hand pose from a single RGB image is essential...
Existing works on 2D pose estimation mainly focus on a certain category,...
Weakly supervised point cloud segmentation, i.e. semantically segmenting...
Morphable models are essential for the statistical modeling of 3D faces....
Recently, the community has paid increasing attention to model scaling and
c...
Vision transformers (ViTs) are usually considered to be less light-weigh...
Conventional knowledge distillation (KD) methods for object detection ma...
Generic event boundary detection (GEBD) is an important yet challenging ...
We present an efficient approach for Masked Image Modeling (MIM) with
hi...
Unlike existing knowledge distillation methods that focus on the baseline
set...
Unconditional human image generation is an important task in vision and
...
This work aims to use a general deep learning framework to synthesi...
This paper focuses on the weakly-supervised audio-visual video parsing t...
In this work, we tackle the problem of real-world fluid animation from a...
Vision transformers have achieved great successes in many computer visio...
Lane detection is a challenging task that requires predicting complex
to...
Context-aware decision support in the operating room can foster surgical...
Self-supervised learning (SSL) has made enormous progress and largely
na...
Searching for a more compact network width recently serves as an effecti...
Driving 3D characters to dance following a piece of music is highly
chal...
Structural re-parameterization (Rep) methods achieve noticeable improvem...
Self-supervised Learning (SSL) including the mainstream contrastive lear...
Learning with few labeled data has been a longstanding problem in the
co...
Localizing keypoints of an object is a basic visual problem. However,
su...
We propose near-optimal overlay networks based on d-regular expander gra...
Generic event boundary detection is an important yet challenging task in...
Deep learning has shown astonishing performance in accelerated magnetic
...
Priors play an important role in providing plausible constraints on h...