Multi-task learning has emerged as a powerful paradigm to solve a range ...
We present a simple and effective framework, named Point2Seq, for 3D obj...
This paper presents a large-scale Chinese cross-modal dataset for
benchm...
Unsupervised large-scale vision-language pre-training has shown promisin...
We present a flexible and high-performance framework, named Pyramid R-CN...
We present Voxel Transformer (VoTr), a novel and effective voxel-based
T...
Current perception models in autonomous driving have become notorious fo...
User response prediction is a crucial component for personalized informa...
Introducing consumed items as users' implicit feedback in matrix
factori...