FoveaNet: Perspective-aware Urban Scene Parsing

08/08/2017
by   Xin Li, et al.
0

Parsing urban scene images benefits many applications, especially self-driving. Most of the current solutions employ generic image parsing models that treat all scales and locations in the images equally and do not consider the geometry property of car-captured urban scene images. Thus, they suffer from heterogeneous object scales caused by perspective projection of cameras on actual scenes and inevitably encounter parsing failures on distant objects as well as other boundary and recognition errors. In this work, we propose a new FoveaNet model to fully exploit the perspective geometry of scene images and address the common failures of generic parsing models. FoveaNet estimates the perspective geometry of a scene image through a convolutional network which integrates supportive evidence from contextual objects within the image. Based on the perspective geometry information, FoveaNet "undoes" the camera perspective projection analyzing regions in the space of the actual scene, and thus provides much more reliable parsing results. Furthermore, to effectively address the recognition errors, FoveaNet introduces a new dense CRFs model that takes the perspective geometry as a prior potential. We evaluate FoveaNet on two urban scene parsing datasets, Cityspaces and CamVid, which demonstrates that FoveaNet can outperform all the well-established baselines and provide new state-of-the-art performance.

READ FULL TEXT

page 1

page 3

page 4

page 6

page 7

page 8

research
03/06/2023

Traffic Scene Parsing through the TSP6K Dataset

Traffic scene parsing is one of the most important tasks to achieve inte...
research
01/09/2017

Information Pursuit: A Bayesian Framework for Sequential Scene Parsing

Despite enormous progress in object detection and classification, the pr...
research
08/30/2022

Boosting Night-time Scene Parsing with Learnable Frequency

Night-Time Scene Parsing (NTSP) is essential to many vision applications...
research
12/01/2016

Video Scene Parsing with Predictive Feature Learning

In this work, we address the challenging video scene parsing problem by ...
research
05/24/2019

Multi-Scale Dual-Branch Fully Convolutional Network for Hand Parsing

Recently, fully convolutional neural networks (FCNs) have shown signific...
research
01/16/2021

GeoSim: Realistic Video Simulation via Geometry-Aware Composition for Self-Driving

Scalable sensor simulation is an important yet challenging open problem ...
research
02/04/2023

CLiNet: Joint Detection of Road Network Centerlines in 2D and 3D

This work introduces a new approach for joint detection of centerlines b...

Please sign up or login with your details

Forgot password? Click here to reset