
-
A Multiplexed Network for End-to-End, Multilingual OCR
Recent advances in OCR have shown that an end-to-end (E2E) training pipe...
-
KGSynNet: A Novel Entity Synonyms Discovery Framework with Knowledge Graph
Entity synonyms discovery is crucial for entity-leveraging applications....
-
img2pose: Face Alignment and Detection via 6DoF, Face Pose Estimation
We propose real-time, six degrees of freedom (6DoF), 3D face pose estima...
-
TAP: Text-Aware Pre-training for Text-VQA and Text-Caption
In this paper, we propose Text-Aware Pre-training (TAP) for Text-VQA and...
-
VIVO: Surpassing Human Performance in Novel Object Captioning with Visual Vocabulary Pre-Training
It is highly desirable yet challenging to generate image captions that c...
-
Hashing-based Non-Maximum Suppression for Crowded Object Detection
In this paper, we propose an algorithm, named hashing-based non-maximum ...
-
Oscar: Object-Semantics Aligned Pre-training for Vision-Language Tasks
Large-scale pre-training methods of learning cross-modal representations...
-
FAN: Feature Adaptation Network for Surveillance Face Recognition and Normalization
This paper studies face recognition (FR) and normalization in surveillan...
-
Gait Recognition via Disentangled Representation Learning
Gait, the walking pattern of individuals, is one of the most important b...
-
Feature Transfer Learning for Deep Face Recognition with Long-Tail Data
Real-world face recognition datasets exhibit long-tail characteristics, ...
-
Illuminating Pedestrians via Simultaneous Detection & Segmentation
Pedestrian detection is a critical problem in computer vision with signi...
-
Representation Learning by Rotating Your Faces
The large pose discrepancy between two face images is one of the fundame...
-
Joint Multi-Leaf Segmentation, Alignment and Tracking from Fluorescence Plant Videos
This paper proposes a novel framework for fluorescence plant video proce...