
Bounding Boxes Are All We Need: Street View Image Classification via Context Encoding of Detected Buildings

by Kun Zhao, et al.

Street view images are increasingly used in tasks such as urban land use classification and urban functional zone portrayal. Street view image classification is difficult because class labels such as commercial area are concepts at a higher level of abstraction than those in general visual tasks. As a result, classification models that use only visual features often fail to achieve satisfactory performance. We believe that an efficient representation of significant objects and their context relations in street view images is the key to solving this problem. In this paper, we propose a novel approach based on a detector-encoder-classifier framework. Unlike common image-level end-to-end models, our approach does not use visual features of the whole image directly. The proposed framework obtains the bounding boxes of buildings in street view images from a detector. Their contextual information, such as building classes and positions, is then encoded into metadata and finally classified by a recurrent neural network (RNN). To verify our approach, we built a dataset of 19,070 street view images and 38,857 buildings based on the BIC_GSV dataset through a combination of automatic label acquisition and expert annotation. The dataset can be used not only for street view image classification aimed at urban land use analysis, but also for multi-class building detection. Experiments show that the proposed approach achieves a 12.65% improvement over models based on end-to-end convolutional neural networks (CNNs). Our code and dataset are available at
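The encoder and classifier stages described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the building class names, the 8-value metadata layout (one-hot class plus normalized box center and size), the left-to-right ordering, the three output classes, and the helper names `Box`, `encode_boxes`, and `rnn_classify` are all assumptions made for illustration. The RNN uses fixed random weights and is untrained, so it shows only the data flow from detected boxes to class probabilities.

```python
import math
import random
from dataclasses import dataclass

# Hypothetical building classes; the abstract does not list the label set.
BUILDING_CLASSES = ["apartment", "house", "retail", "office"]


@dataclass
class Box:
    label: str    # building class reported by the detector
    x_min: float  # pixel coordinates of the bounding box
    y_min: float
    x_max: float
    y_max: float


def encode_boxes(boxes, img_w, img_h):
    """Encode detections as a left-to-right sequence of metadata vectors."""
    seq = []
    for b in sorted(boxes, key=lambda box: box.x_min):
        one_hot = [1.0 if b.label == c else 0.0 for c in BUILDING_CLASSES]
        cx = (b.x_min + b.x_max) / (2 * img_w)  # normalized box center
        cy = (b.y_min + b.y_max) / (2 * img_h)
        w = (b.x_max - b.x_min) / img_w         # normalized box size
        h = (b.y_max - b.y_min) / img_h
        seq.append(one_hot + [cx, cy, w, h])
    return seq


def rnn_classify(seq, n_hidden=8, n_out=3, seed=0):
    """Forward pass of an untrained Elman RNN; returns class probabilities."""
    rng = random.Random(seed)
    n_in = len(BUILDING_CLASSES) + 4

    def mat(rows, cols):
        return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

    w_in, w_h, w_out = mat(n_hidden, n_in), mat(n_hidden, n_hidden), mat(n_out, n_hidden)
    h = [0.0] * n_hidden
    for x in seq:
        # The hidden state mixes the current box with everything seen so far.
        h = [
            math.tanh(
                sum(a * b for a, b in zip(w_in[j], x))
                + sum(a * b for a, b in zip(w_h[j], h))
            )
            for j in range(n_hidden)
        ]
    logits = [sum(a * b for a, b in zip(row, h)) for row in w_out]
    top = max(logits)
    exps = [math.exp(v - top) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Two detections in a hypothetical 640x480 street view image.
boxes = [Box("retail", 300, 120, 520, 400), Box("apartment", 20, 60, 280, 410)]
seq = encode_boxes(boxes, img_w=640, img_h=480)
probs = rnn_classify(seq)
```

Only the box metadata reaches the classifier in this sketch; no pixel features are used, which is the distinction the abstract draws against image-level end-to-end models.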



