Robust 2D/3D Vehicle Parsing in CVIS

03/11/2021
by   Hui Miao, et al.

We present a novel approach to robustly detect and perceive vehicles in different camera views as part of a cooperative vehicle-infrastructure system (CVIS). Our formulation is designed for arbitrary camera views and makes no assumptions about intrinsic or extrinsic parameters. First, to deal with the scarcity of multi-view data, we propose a part-assisted novel view synthesis algorithm for data augmentation: we train a part-based texture inpainting network in a self-supervised manner, then render the textured model into the background image at the target 6-DoF pose. Second, to handle varying camera parameters, we present a new method that produces dense mappings between image pixels and 3D points to perform robust 2D/3D vehicle parsing. Third, we build the first CVIS dataset for benchmarking, with annotations for more than 1540 images (14017 instances) from real-world traffic scenarios. We combine these algorithms and the dataset to develop a robust approach to 2D/3D vehicle parsing for CVIS. In practice, our approach outperforms state-of-the-art (SOTA) methods on 2D detection, instance segmentation, and 6-DoF pose estimation, by 4.5 respectively. More details and results are included in the supplement. To facilitate future research, we will release the source code and the dataset on GitHub.
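To make the term "6-DoF pose" concrete, the following minimal sketch shows how a pose (three rotation angles plus a 3D translation) places a model point in front of a camera and where that point lands in the image. It is an illustration only, not the paper's renderer: the pinhole camera model, the yaw/pitch/roll convention, and all names (`rotation_from_euler`, `project_point`, the focal lengths and principal point) are assumptions introduced here.

```python
import math

def matmul3(a, b):
    """Multiply two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def rotation_from_euler(yaw, pitch, roll):
    """Hypothetical convention: R = Rz(yaw) * Ry(pitch) * Rx(roll), radians."""
    cz, sz = math.cos(yaw), math.sin(yaw)
    cy, sy = math.cos(pitch), math.sin(pitch)
    cx, sx = math.cos(roll), math.sin(roll)
    rz = [[cz, -sz, 0.0], [sz, cz, 0.0], [0.0, 0.0, 1.0]]
    ry = [[cy, 0.0, sy], [0.0, 1.0, 0.0], [-sy, 0.0, cy]]
    rx = [[1.0, 0.0, 0.0], [0.0, cx, -sx], [0.0, sx, cx]]
    return matmul3(matmul3(rz, ry), rx)

def project_point(p_obj, rotation, translation, fx, fy, cx, cy):
    """Map an object-frame 3D point to pixel coordinates (assumed pinhole model)."""
    # Object frame -> camera frame: p_cam = R * p_obj + t
    x, y, z = (
        sum(rotation[i][j] * p_obj[j] for j in range(3)) + translation[i]
        for i in range(3)
    )
    if z <= 0:
        raise ValueError("point is behind the camera")
    # Perspective division followed by the intrinsic scaling and offset
    return (fx * x / z + cx, fy * y / z + cy)

# Example: a model placed 10 units in front of the camera, no rotation.
R_identity = rotation_from_euler(0.0, 0.0, 0.0)
t = [0.0, 0.0, 10.0]
center_px = project_point([0.0, 0.0, 0.0], R_identity, t, 1000.0, 1000.0, 640.0, 360.0)
side_px = project_point([1.0, 0.0, 0.0], R_identity, t, 1000.0, 1000.0, 640.0, 360.0)

# The same side point after a 90-degree yaw of the model.
R_yaw = rotation_from_euler(math.pi / 2, 0.0, 0.0)
turned_px = project_point([1.0, 0.0, 0.0], R_yaw, t, 1000.0, 1000.0, 640.0, 360.0)
```

Rendering a textured model at a target pose amounts to applying this kind of transform to every visible surface point. The dense pixel-to-3D mapping described in the abstract runs in the opposite direction, from pixels to points on the vehicle.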


