Weakly Supervised Training of Monocular 3D Object Detectors Using Wide Baseline Multi-view Traffic Camera Data

10/21/2021
by   Matthew Howe, et al.

Accurate 7DoF prediction of vehicles at an intersection is an important task for assessing potential conflicts between road users. In principle, this could be achieved by a single camera system capable of detecting the pose of each vehicle, but this would require a large, accurately labelled dataset with which to train the detector. Although large vehicle pose datasets exist (ostensibly developed for autonomous vehicles), we find that training on these datasets is inadequate. These datasets contain images from a ground-level viewpoint, whereas an ideal view for intersection observation would be elevated well above the road surface. We develop an alternative approach using a weakly supervised method of fine-tuning 3D object detectors for traffic observation cameras, showing in the process that large existing autonomous vehicle datasets can be leveraged for pre-training. To fine-tune the monocular 3D object detector, our method utilises multiple 2D detections from overlapping, wide-baseline views and a loss that encodes the underlying geometric consistency. Our method achieves vehicle 7DoF pose prediction accuracy on our dataset comparable to that of the top-performing monocular 3D object detectors on autonomous vehicle datasets. We present our training methodology, multi-view reprojection loss, and dataset.
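The abstract names a multi-view reprojection loss but does not give its form, so the following is only a minimal sketch of the general idea and not the authors' formulation. It assumes a 7DoF pose means a 3D centre, three box dimensions and a heading angle, that each camera is described by a known 3x4 projection matrix, and that each view supplies an axis-aligned 2D detection box. All function names (`box_corners`, `project`, `enclosing_box`, `multiview_reprojection_loss`, `look_at_camera`) and all numeric values are hypothetical.

```python
import numpy as np

def box_corners(center, dims, yaw):
    """Return the 8 corners (8x3) of a box with centre, (l, w, h) and heading about z."""
    l, w, h = dims
    x = np.array([1, 1, 1, 1, -1, -1, -1, -1]) * l / 2.0
    y = np.array([1, 1, -1, -1, 1, 1, -1, -1]) * w / 2.0
    z = np.array([1, -1, 1, -1, 1, -1, 1, -1]) * h / 2.0
    c, s = np.cos(yaw), np.sin(yaw)
    R = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
    return (R @ np.vstack([x, y, z])).T + np.asarray(center, dtype=float)

def project(P, pts):
    """Project Nx3 world points with a 3x4 camera matrix, returning Nx2 pixels."""
    homo = np.hstack([pts, np.ones((len(pts), 1))])
    uvw = homo @ P.T
    return uvw[:, :2] / uvw[:, 2:3]

def enclosing_box(uv):
    """Axis-aligned box (u_min, v_min, u_max, v_max) around Nx2 pixel points."""
    return np.concatenate([uv.min(axis=0), uv.max(axis=0)])

def multiview_reprojection_loss(center, dims, yaw, cameras, boxes_2d):
    """Mean L1 gap between the projected 3D box and the 2D detection in each view."""
    corners = box_corners(center, dims, yaw)
    errs = [np.abs(enclosing_box(project(P, corners)) - np.asarray(b)).mean()
            for P, b in zip(cameras, boxes_2d)]
    return float(np.mean(errs))

def look_at_camera(eye, target, f=1000.0, cx=640.0, cy=360.0):
    """Hypothetical helper: pinhole camera at `eye` looking at `target`, world z up."""
    eye, target = np.asarray(eye, dtype=float), np.asarray(target, dtype=float)
    forward = (target - eye) / np.linalg.norm(target - eye)
    right = np.cross(forward, [0.0, 0.0, 1.0])
    right /= np.linalg.norm(right)
    down = np.cross(forward, right)
    R = np.vstack([right, down, forward])      # world -> camera rotation
    K = np.array([[f, 0.0, cx], [0.0, f, cy], [0.0, 0.0, 1.0]])
    return K @ np.hstack([R, (-R @ eye).reshape(3, 1)])

# Two elevated, wide-baseline views of the same (made-up) vehicle.
cams = [look_at_camera([0, -20, 8], [0, 0, 0]), look_at_camera([20, 0, 8], [0, 0, 0])]
true_pose = ([1.0, 2.0, 0.75], (4.5, 1.8, 1.5), 0.3)
dets = [enclosing_box(project(P, box_corners(*true_pose))) for P in cams]

loss_true = multiview_reprojection_loss(*true_pose, cams, dets)
loss_off = multiview_reprojection_loss([1.0, 2.0, 0.75], (4.5, 1.8, 1.5), 0.9, cams, dets)
print(loss_true, loss_off)
```

In this toy setup the loss vanishes for the pose that generated the 2D boxes and grows when the heading is perturbed, which is the kind of signal a detector could be fine-tuned on without 3D labels. A real training loop would compute the same quantity in a differentiable framework so that gradients reach the detector's pose outputs.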


