Large-scale pre-trained Vision-Language Models (VLMs), such as CLIP and
...
Recently, large-scale pre-trained vision-language models (e.g. CLIP and
...
Large-scale pre-trained Vision-Language Models (VLMs), such as CLIP,
est...
One central challenge in source-free unsupervised domain adaptation (UDA...
Given an unknown dynamical system, what is the minimum number of samples...
A central challenge in human pose estimation, as well as in many other
m...
Fully test-time adaptation aims to adapt the network model based on
sequ...
A fundamental challenge in deep metric learning is the generalization
ca...
We observe that human poses exhibit strong group-wise structural correla...
Deep neural networks recognize objects by analyzing local image details ...
Effective defense of deep neural networks against adversarial attacks re...
We observe that the human trajectory is not only forward predictable, bu...
In this work, we develop a joint sample discovery and iterative model
ev...
The inference structures and computational complexity of existing deep n...
The task of multi-person human pose estimation in natural scenes is quit...
While deeper and wider neural networks are actively pushing the performa...
Virtual Learning Environments (VLEs) are spaces designed to educate stud...
Human pose estimation using deep neural networks aims to map input image...
Currently, the state-of-the-art image classification algorithms outperfo...