Inspired by masked language modeling (MLM) in natural language processin...
Autoregressive language modeling (ALM) has been successfully used in
se...
We present a simple yet effective end-to-end Video-language Pre-training...
Visual tasks vary a lot in their output formats and concerned contents,
...
Self-supervised learning (SSL) holds promise in leveraging large amounts...
In person re-identification (ReID), very recent research has validate...
Transformer has achieved great success in computer vision, while how to ...
Transformer has been widely used for self-supervised pre-training in Nat...
To address the problem of long-tail distribution for the large vocabular...
Recent work attempts to improve semantic segmentation performance by
exp...
The region-based Convolutional Neural Network (CNN) detectors such as Fa...