Vision Transformers have shown great promise recently for many vision ta...
Open-vocabulary semantic segmentation aims to segment an image into sema...
The technology of hyperspectral imaging (HSI) records the visual informa...
Fully exploiting the learning capacity of neural networks requires
overp...
Sign language is commonly used by deaf or mute people to communicate but...
Sign language is a visual language that is used by deaf or speech impair...
Learning visual knowledge from massive weakly-labeled web videos has
att...
This paper makes a selective survey on the recent development of the fac...
The recent flourish of deep learning in various tasks is largely accredi...
Image-text matching has been a hot research topic bridging the vision an...
In this paper, we propose a residual non-local attention network for
hig...
Person re-identification (re-ID) has recently been tremendously boosted ...
Convolutional neural network (CNN) depth is of crucial importance for im...
Weakly supervised learning with only coarse labels can obtain visual
exp...