Segmentation-Based Bounding Box Generation for Omnidirectional Pedestrian Detection

by   Masato Tamura, et al.

We propose a segmentation-based bounding box generation method for omnidirectional pedestrian detection, which enables detectors to tightly fit bounding boxes to pedestrians without omnidirectional images for training. Because the appearance of pedestrians in omnidirectional images may be rotated to any angle, the performance of common pedestrian detectors is likely to be substantially degraded. Existing methods mitigate this issue by transforming images during inference or training detectors with omnidirectional images. However, the first approach substantially degrades the inference speed, and the second approach requires laborious annotations. To overcome these drawbacks, we leverage an existing large-scale dataset, whose segmentation annotations can be utilized, to generate tightly fitted bounding box annotations. We also develop a pseudo-fisheye distortion augmentation method, which further enhances the performance. Extensive analysis shows that our detector successfully fits bounding boxes to pedestrians and demonstrates substantial performance improvement.


page 3

page 6


We don't need no bounding-boxes: Training object class detectors using only human verification

Training object class detectors typically requires a large set of images...

Efficient Pedestrian Detection in Top-View Fisheye Images Using Compositions of Perspective View Patches

Pedestrian detection in images is a topic that has been studied extensiv...

Box-level Segmentation Supervised Deep Neural Networks for Accurate and Real-time Multispectral Pedestrian Detection

Effective fusion of complementary information captured by multi-modal se...

Associative embeddings for large-scale knowledge transfer with self-assessment

We propose a method for knowledge transfer between semantically related ...

DocReader: Bounding-Box Free Training of a Document Information Extraction Model

Information extraction from documents is a ubiquitous first step in many...

SelfText Beyond Polygon: Unconstrained Text Detection with Box Supervision and Dynamic Self-Training

Although a polygon is a more accurate representation than an upright bou...

Visible Feature Guidance for Crowd Pedestrian Detection

Heavy occlusion and dense gathering in crowd scene make pedestrian detec...