Audio visual segmentation (AVS) aims to segment the sounding objects for...
Monocular 3D lane detection is a challenging task due to its lack of dep...
As one of the fundamental functions of autonomous driving system, freesp...
Lane detection is a challenging task that requires predicting complex
to...
Recently proposed fine-grained 3D visual grounding is an essential and
c...
Given a natural language expression and an image/video, the goal of refe...
Language-queried video actor segmentation aims to predict the pixel-leve...
Learning to capture dependencies between spatial positions is essential ...
Referring image segmentation aims to predict the foreground mask of the
...
Referring image segmentation aims at segmenting the foreground masks of ...