Addressing the Invisible: Street Address Generation for Developing Countries with Deep Learning

11/10/2018 ∙ by Ilke Demir, et al.

More than half of the world's roads lack adequate street addressing systems. Lack of addresses is even more visible in daily lives of people in developing countries. We would like to object to the assumption that having an address is a luxury, by proposing a generative address design that maps the world in accordance with streets. The addressing scheme is designed considering several traditional street addressing methodologies employed in the urban development scenarios around the world. Our algorithm applies deep learning to extract roads from satellite images, converts the road pixel confidences into a road network, partitions the road network to find neighborhoods, and labels the regions, roads, and address units using graph- and proximity-based algorithms. We present our results on a sample US city, and several developing cities, compare travel times of users using current ad hoc and new complete addresses, and contrast our addressing solution to current industrial and open geocoding alternatives.




1 Introduction

Street addresses enable a precise physical presence and increase connectivity around the world. Currently, 75% of the roads in the world are not mapped w3w , and this number is increasing in developing countries. According to the United Nations, 4 billion people in the world are invisible due to the addressing problem in developing countries. This problem is even more critical in disaster zones, areas with limited resources, and geographically challenging locations. Although remote sensing technology has improved significantly over the past decade, the organic growth of urban and suburban areas outruns the deployment of addressing schemes. We want to address the invisible. We imagine an algorithm that automatically creates meaningful addresses for unmapped areas, that is, areas with no street names or addresses.

In order to realize our goal, we designed and implemented a generative addressing system that bridges the gap between grid-based digital addressing schemes and traditional street addresses. We (i) design a physical addressing scheme that is linear, hierarchical, flexible, intuitive, perceptible, and robust; (ii) propose a segmentation method to obtain road segments and regions from satellite imagery, using deep learning and graph partitioning; (iii) implement a labeling method to name urban elements based on current addressing schemes and distance fields; and (iv) develop ready-to-deploy prototype applications supporting forward and inverse geoqueries.

2 Related Work

Geocoding approaches: Popular geocoding solutions are either not in human-readable form (e.g., GooglePlaceID and OkHi), or tend to de-correlate from the topological structure (e.g., w3w , Zippr and MapTags), and they lack essential properties of a street addressing system, such as linearity and hierarchy. Those properties as well as their human intersection become even more crucial in developing countries  mitblog .

Procedural generation: Automating the generation of maps is extensively studied in the urban procedural modeling world  chen2008 ; vanegas2012 ; parish2001 ; aliaga2008 ; sun2002 , creating detailed and structurally realistic models, but none of these methods is applicable as a synthesis method on real-world data. On the other hand, inverse procedural modeling approaches aliaga2016 process real-world data to obtain generative representations. We follow this last path and rely on satellite imagery as the input of our synthesis approach.

Remote sensing: Several approaches automate the extraction of geospatial information using already existing data resources wang2013 ; skoumas2016 ; mattyus2015 ; deeproadmapper ; zeng2017 . Similar approaches extract road networks using neural networks wang2015 ; zhao2012 ; li2016 ; xu2014 ; peteri2003 ; poullis2008 ; demir18 . Processing the road topology has also been studied by clustering and graph partitioning approaches wegner2013 ; alshehhi2017 ; anwar2017 .

Addresses around the world: Traditional street addresses and names are usually the result of cultural dynamics, politics, economies, and other long-term processes adopted by urban authorities. We studied current addressing methods such as the London postal code system London , South Korea street naming, Berlin house numbering, and more streetbook .

3 The Street Address Format

Semantic properties emphasize user-friendly features, implying linear, hierarchical, universal, and memorable addresses. Structural properties enable the format of street addresses to be computer friendly, necessitating linear, hierarchical, compressible, robust, extendible, and queryable codes. Following these design principles, the address format (Figure 1) can be read from the last field to the first: the last field indicates the version, the fourth field contains the country and state information when applicable, and the third field contains the city information. The second field contains the road name, which starts with the region label, followed by the road number. Lastly, the first field is composed of the meter marker along the road and the block letter giving the offset from the road, which play the roles of the house number and the apartment number, respectively.

Figure 1: Street Address Format: house number, road name, city, state (if applicable), country, followed by the version year.
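The field layout described above can be illustrated with a minimal parsing sketch. It is not the authors' code: the class name `StreetAddress`, the function `parse_address`, the dot separator (inferred from the example address 715D.NE127.Dhule.MhIn reported in the results), the regular expressions, and the treatment of the version field as optional are all assumptions made for illustration.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class StreetAddress:
    """Hypothetical container for the fields of a generated address."""
    meter_marker: int               # first field, numeric part: position along the road
    block_letter: str               # first field, letter part: offset from the road
    region: str                     # second field, letter part: region label
    road_number: int                # second field, numeric part: road number in the region
    city: str                       # third field
    state_country: str              # fourth field: state (if applicable) and country
    version: Optional[str] = None   # last field, assumed optional here

    def __str__(self) -> str:
        fields = [
            f"{self.meter_marker}{self.block_letter}",
            f"{self.region}{self.road_number}",
            self.city,
            self.state_country,
        ]
        if self.version is not None:
            fields.append(self.version)
        return ".".join(fields)


# Assumed patterns: digits then capital letters, and capital letters then digits.
_UNIT = re.compile(r"^(\d+)([A-Z]+)$")
_ROAD = re.compile(r"^([A-Z]+)(\d+)$")


def parse_address(text: str) -> StreetAddress:
    """Split a dot-separated address into its fields (assumed layout)."""
    parts = text.split(".")
    if len(parts) not in (4, 5):
        raise ValueError(f"expected 4 or 5 fields, got {len(parts)}")
    unit = _UNIT.match(parts[0])
    road = _ROAD.match(parts[1])
    if unit is None or road is None:
        raise ValueError(f"malformed address: {text!r}")
    version = parts[4] if len(parts) == 5 else None
    return StreetAddress(
        meter_marker=int(unit.group(1)),
        block_letter=unit.group(2),
        region=road.group(1),
        road_number=int(road.group(2)),
        city=parts[2],
        state_country=parts[3],
        version=version,
    )


address = parse_address("715D.NE127.Dhule.MhIn")
print(address.region, address.road_number, address.meter_marker, address.block_letter)
```

Because the fields go from the most specific (address unit) to the most general (country), truncating fields from the left yields progressively coarser locations, which is one way to read the hierarchy property.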

4 Our Generative Addressing System

The segmentation step extracts roads, breaks them into road segments and clusters them into regions. The labeling step names the regions, road segments, and place markers and assigns block letters to individual addressable units. Details of each module can be found in our extended journal paper ijgi .

Predicting Road Pixels: The first step of our approach creates binary road prediction images from three-channel satellite images of 0.5 m resolution and of size 19K × 19K pixels. Both training and testing are done with patches of . The training set includes 4–16 tiles per country, and the test set includes all the rest of the tiles, manually spatially distributed to sample all areas, keeping the ratios mostly at 70% to 30%. We use a modified version of SegNet segnet , which consists of the first 13 convolutional layers of the VGG16 network for the encoder, having a corresponding decoder layer for each encoder. We modify the last soft-max layer to change the multi-class structure to have binary classes for road detection, by substituting it with a convolutional layer. We experimented with architectures (Figure 2) such as VGG vgg , U-Net unet , and ResNet resnet variations; however, we achieved the best result with the SegNet model trained on dense and diverse tiles, resulting in 72.6% precision and 57.2% recall. We also experimented with DeepLab deeplab variations and achieved 75.4% precision and 75.9% recall; however, the model showed signs of overfitting after epoch 30.
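For reference, precision and recall for a binary road mask can be computed as in the sketch below. The text does not state whether the reported figures are pixel-wise, so the pixel-wise counting, the function name `precision_recall`, and the toy masks are assumptions made for illustration.

```python
def precision_recall(pred, truth):
    """Pixel-wise precision and recall for two binary masks of equal shape.

    pred and truth are nested lists (rows of 0/1 values); 1 marks a road pixel.
    """
    tp = fp = fn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1          # road predicted, road present
            elif p and not t:
                fp += 1          # road predicted, no road present
            elif t and not p:
                fn += 1          # road missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Toy 2 x 3 masks: two true positives, one false positive, one false negative.
predicted = [[1, 1, 0],
             [0, 1, 0]]
ground_truth = [[1, 0, 0],
                [0, 1, 1]]
print(precision_recall(predicted, ground_truth))
```

Precision penalizes non-road pixels labeled as road, while recall penalizes missed road pixels, so a model with high precision and lower recall tends to produce clean but incomplete road masks.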

Figure 2: Comparison of NN Models. An example (a) satellite image and (b) ground truth; and road predictions using (c) VGG; (d) U-Net; (e) ResNet50; (f) ResNet101; (g) SegNet; and (h) DeepLab.

Creating the road graph: The post-processing includes binarizing the image with thresholding, running a depth-first search to join connected roads using the confidences, applying an orientation-based adaptive median filtering on the road end points, bucketing the road segments based on their orientations, and processing intersections. Then we convert the road segments into a road graph, with intersections as nodes, road segments as edges, and segment lengths as edge weights.
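The final conversion step can be sketched as follows, assuming each road segment is already available as a polyline whose first and last points are intersections or road ends. The authors mention networkx; this sketch uses a plain dictionary instead so that it stays dependency-free, and the names `segment_length` and `build_road_graph` are hypothetical.

```python
import math
from collections import defaultdict


def segment_length(polyline):
    """Sum of Euclidean distances between consecutive points of a polyline."""
    return sum(math.dist(a, b) for a, b in zip(polyline, polyline[1:]))


def build_road_graph(segments):
    """Build an undirected weighted graph from road segments.

    segments: list of polylines, each a list of (x, y) tuples in meters.
    Returns {node: {neighbor: length}} where nodes are segment end points.
    If two segments join the same pair of nodes, the shorter one is kept
    (a simplification chosen for this sketch).
    """
    graph = defaultdict(dict)
    for polyline in segments:
        start, end = polyline[0], polyline[-1]
        length = segment_length(polyline)
        if end not in graph[start] or length < graph[start][end]:
            graph[start][end] = length
            graph[end][start] = length
    return dict(graph)


# Two segments meeting at the intersection (3, 4).
roads = [
    [(0, 0), (3, 4)],
    [(3, 4), (3, 7), (3, 10)],
]
for node, neighbors in build_road_graph(roads).items():
    print(node, neighbors)
```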

Defining regions: We partition the road graph into communities that have maximum intraconnectivity (within a community) and minimum interconnectivity (between communities). We experimented with normalized min-cut mincut , Newman–Girvan newman , and optimal modularity-based modularity graph partitioning approaches, and settled on normalized min-cut mincut , as it is accurate and significantly more efficient.
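To make the partitioning objective concrete, the sketch below evaluates the standard normalized cut value, Ncut(A, B) = cut(A, B) / assoc(A, V) + cut(A, B) / assoc(B, V), and finds the best two-way split of a tiny graph by brute force. This is only an illustration of the criterion: the exhaustive search is exponential and is not the spectral method used in practice, the edge weights are assumed to be connection strengths (here 1 per road segment) rather than distances, and the names `ncut` and `best_bipartition` are hypothetical.

```python
from itertools import combinations


def ncut(graph, part_a):
    """Normalized cut value of the split (part_a, rest) of a weighted graph.

    graph: {node: {neighbor: weight}} with symmetric weights.
    """
    part_a = set(part_a)
    part_b = set(graph) - part_a
    # Total weight of edges crossing the split.
    cut = sum(w for u in part_a for v, w in graph[u].items() if v in part_b)
    # Total weight of edges attached to each side (to any node in the graph).
    assoc_a = sum(w for u in part_a for w in graph[u].values())
    assoc_b = sum(w for u in part_b for w in graph[u].values())
    if assoc_a == 0 or assoc_b == 0:
        return float("inf")
    return cut / assoc_a + cut / assoc_b


def best_bipartition(graph):
    """Exhaustively search all two-way splits (tiny graphs only)."""
    nodes = sorted(graph)
    anchor, rest = nodes[0], nodes[1:]
    best_score, best_part = float("inf"), None
    # Fix one node on side A so each split is visited once.
    for size in range(len(rest)):
        for extra in combinations(rest, size):
            part_a = {anchor, *extra}
            score = ncut(graph, part_a)
            if score < best_score:
                best_score, best_part = score, part_a
    return best_score, best_part


# Two triangles of roads joined by a single connecting road (c-d).
toy = {
    "a": {"b": 1, "c": 1},
    "b": {"a": 1, "c": 1},
    "c": {"a": 1, "b": 1, "d": 1},
    "d": {"c": 1, "e": 1, "f": 1},
    "e": {"d": 1, "f": 1},
    "f": {"d": 1, "e": 1},
}
print(best_bipartition(toy))
```

The normalization by each side's total attachment is what discourages cutting off a single dead-end road, which a plain minimum cut would favor.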

Labeling regions, roads, and address units: We compute the densest region by averaging the number of roads per unit area, and we name this region “CA” for the city center. We divide all other regions into four orientations, N(orth), S(outh), W(est), and E(ast), and assign letters in that specific order, following the spiral pattern of the London post code system. The roads in each region are divided into two main directions, with odd numbers for north–south bound roads and even numbers for east–west bound roads, and are then numbered according to their order. For each road segment, we place a virtual meter marker every 5 m on the left and right sides of the road, using even and odd numbering. Finally, we compute a distance field of the roads and discretize that field with a 5 m step size to obtain the block letter.
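The numbering rules above can be sketched as small helper functions. Only the 5 m step and the odd/even conventions come from the text; the exact mappings are assumptions made for illustration: bearings are measured in degrees clockwise from north, a road within 45° of the north–south axis counts as north–south bound, the left side receives even marker numbers and the right side odd ones, and block letters start at "A" for the first 5 m band. All function names are hypothetical.

```python
STEP_M = 5.0  # marker spacing and distance-field step, as stated in the text


def road_number(order_index, bearing_deg):
    """Road number from its 0-based order within its direction class.

    North-south bound roads get odd numbers, east-west bound roads even ones.
    The 45-degree split between the two classes is an assumption.
    """
    folded = bearing_deg % 180.0
    north_south = folded < 45.0 or folded >= 135.0
    return 2 * order_index + 1 if north_south else 2 * order_index + 2


def meter_marker(distance_along_m, side):
    """Marker number for a point a given distance along the road.

    Assumed mapping: marker k covers [5k, 5k + 5) meters; the left side
    gets the even number 2k and the right side the odd number 2k + 1.
    """
    if side not in ("left", "right"):
        raise ValueError("side must be 'left' or 'right'")
    k = int(distance_along_m // STEP_M)
    return 2 * k if side == "left" else 2 * k + 1


def block_letter(distance_from_road_m):
    """Block letter from the distance field, discretized in 5 m bands.

    Assumed mapping: 0-5 m is 'A', 5-10 m is 'B', and so on up to 'Z'.
    """
    k = int(distance_from_road_m // STEP_M)
    if k >= 26:
        raise ValueError("distance exceeds the single-letter range of this sketch")
    return chr(ord("A") + k)


# A unit 12 m along the right side of a road and 17.5 m away from it.
print(f"{meter_marker(12.0, 'right')}{block_letter(17.5)}")
```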

5 Results and Applications

Our system is written in Python and C++; the implementation is on the CPU (except road prediction). We use Chainer chainer , networkx, and sci-kit libraries. We used our approach to process more than 10 cities, totaling up to more than 16 K km. The source code to convert .osm files and geotiffs to street addresses is available on our repository.

We compare the road predictions with ground truth for the extracted roads of an unmapped suburban area. Our SegNet model and post-processing approach were able to extract 90.51% of the roads. This success ratio was close to 80% on average per city.

Figure 3: We show (a) input satellite tile; (b) extracted roads; (c) created regions; and (d) generated map; compared to (e) OpenStreetMap (OSM) of the same area.

We evaluate the usefulness of our generative maps with treasure hunt-like user experiments, comparing the travel times using the old and new addressing schemes. Overall, the travel times decreased by 21.7%, with our system producing a 52.4 s improvement on average and shortening the last mile of activity, which supports the accuracy of our addresses.

We compare street segments dictated by the traditional addresses and our generative addresses (Figure 3). Comparing the road segments, we extracted 95% of the roads in that particular city tile. Comparing the addresses, although traditional addresses are more established, our addresses are easier to remember and support intuitive self-location and navigation.

Figure 4: Satellite image, extracted roads, labeled regions and roads, and meter markers and blocks of three example developing cities.

However, given our motivation of providing street addresses to the approximately 4 billion unconnected people, our results in fact shine for developing countries. Figure 4 shows our generative maps in the same format, on three different unmapped developing cities. We automatically addressed more than 80% of the populated areas, which significantly improved map coverage.

We compare our maps to other popular addressing solutions. For the same point on earth, w3w outputs three random words. Google Maps contains some unlabelled roads; however, it outputs Green Park for a couple of kilometers around the point. OSM does not contain roads, and the point can be reached only by its latitude and longitude. In contrast, our approach extracts the roads almost completely and assigns a unique address, 715D.NE127.Dhule.MhIn.

6 Conclusions

We demonstrated that deep learning can be used in a world-wide system for automatically creating street addresses from satellite imagery for developing countries in the world. Physically connecting the unconnected populations should increase the economic, juridical, and life-sustaining involvement of people all around the world. It improves the outreach of businesses and the economy, as well as the accuracy and efficiency of providing first aid in disaster zones. More evaluation, results, and details about determinism and complexity analysis of submodules, constructing a global address space, versioning and updating, handling missing city boundaries, overflowing regions, and 3D roads can be found in our previously published works ijgi ; cvpr17 .


  • (1) What is the right addressing scheme for india? Accessed: 2017-12-01.
  • (2) D. G. Aliaga, I. Demir, B. Benes, and M. Wand. Inverse procedural modeling of 3d models for virtual worlds. In ACM SIGGRAPH 2016 Courses, SIGGRAPH ’16, pages 16:1–16:316, New York, NY, USA, 2016. ACM.
  • (3) D. G. Aliaga, C. A. Vanegas, and B. Benes. Interactive example-based urban layout synthesis. ACM Trans. Graph., 27(5):160:1–160:10, Dec. 2008.
  • (4) R. Alshehhi and P. R. Marpu. Hierarchical graph-based segmentation for extracting road networks from high-resolution satellite images. ISPRS Journal of Photogrammetry and Remote Sensing, 126:245 – 260, 2017.
  • (5) T. Anwar, C. Liu, H. L. Vu, and C. Leckie. Partitioning road networks using density peak graphs: Efficiency vs. accuracy. Information Systems, 64:22 – 40, 2017.
  • (6) V. Badrinarayanan, A. Kendall, and R. Cipolla. Segnet: A deep convolutional encoder-decoder architecture for image segmentation. CoRR, abs/1511.00561, 2015.
  • (7) V. D. Blondel, J.-L. Guillaume, R. Lambiotte, and E. Lefebvre. Fast unfolding of communities in large networks. Journal of Statistical Mechanics: Theory and Experiment, (10):P10008+, Oct. 2008.
  • (8) G. Chen, G. Esch, P. Wonka, P. Müller, and E. Zhang. Interactive procedural street modeling. ACM Trans. Graph., 27(3):103:1–103:10, Aug. 2008.
  • (9) L. C. Chen, G. Papandreou, I. Kokkinos, K. Murphy, and A. L. Yuille. Deeplab: Semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected crfs. IEEE Transactions on Pattern Analysis and Machine Intelligence, PP(99):1–1, 2017.
  • (10) I. Demir, F. Hughes, A. Raj, K. Dhruv, S. M. Muddala, S. Garg, B. Doo, and R. Raskar. Generative street addresses from satellite imagery. ISPRS International Journal of Geo-Information, 7(3), 2018.
  • (11) I. Demir, F. Hughes, A. Raj, K. Tsourides, D. Ravichandran, S. Murthy, K. Dhruv, S. Garg, J. Malhotra, B. Doo, G. Kermani, and R. Raskar. Robocodes: Towards generative street addresses from satellite imagery. In IEEE International Conference on Computer Vision and Pattern Recognition Workshops, July 2017.
  • (12) I. Demir, K. Koperski, D. Lindenbaum, G. Pang, J. Huang, S. Basu, F. Hughes, D. Tuia, and R. Raskar. Deepglobe 2018: A challenge to parse the earth through satellite images. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, June 2018.
  • (13) C. Farvacque-Vitković and W. Bank. Street addressing and the management of cities / Catherine Farvacque-Vitkovic … [et al.]. World Bank Washington, D.C, 2005.
  • (14) M. Girvan and M. E. Newman. Community structure in social and biological networks. Proceedings of the national academy of sciences, 99(12):7821–7826, 2002.
  • (15) K. He, X. Zhang, S. Ren, and J. Sun. Deep residual learning for image recognition. In 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 770–778, June 2016.
  • (16) G. R. Jones. Human friendly coordinates. GeoInformatics, 18(5):10–12, Jul 2015.
  • (17) P. Li, Y. Zang, C. Wang, J. Li, M. Cheng, L. Luo, and Y. Yu. Road network extraction via deep learning and line integral convolution. In 2016 IEEE International Geoscience and Remote Sensing Symposium (IGARSS), pages 1599–1602, July 2016.
  • (18) G. Mattyus, W. Luo, and R. Urtasun. Deeproadmapper: Extracting road topology from aerial images. In The IEEE International Conference on Computer Vision (ICCV), Oct 2017.
  • (19) G. Mattyus, S. Wang, S. Fidler, and R. Urtasun. Enhancing road maps by parsing aerial images around the world. In 2015 IEEE International Conference on Computer Vision (ICCV), pages 1689–1697, Dec 2015.
  • (20) Y. I. H. Parish and P. Müller. Procedural modeling of cities. In Proceedings of the 28th Annual Conference on Computer Graphics and Interactive Techniques, SIGGRAPH ’01, pages 301–308, New York, NY, USA, 2001. ACM.
  • (21) R. Peteri, J. Celle, and T. Ranchin. Detection and extraction of road networks from high resolution satellite images. In Proceedings 2003 International Conference on Image Processing (Cat. No.03CH37429), volume 1, pages I–301–4 vol.1, Sept 2003.
  • (22) C. Poullis, S. You, and U. Neumann. A vision-based system for automatic detection and extraction of road networks. In 2008 IEEE Workshop on Applications of Computer Vision, pages 1–8, Jan 2008.
  • (23) O. Ronneberger, P. Fischer, and T. Brox. U-net: Convolutional networks for biomedical image segmentation. CoRR, abs/1505.04597, 2015.
  • (24) K. Simonyan and A. Zisserman. Very deep convolutional networks for large-scale image recognition. CoRR, abs/1409.1556, 2014.
  • (25) G. Skoumas, D. Pfoser, A. Kyrillidis, and T. Sellis. Location estimation using crowdsourced spatial relations. ACM Trans. Spatial Algorithms Syst., 2(2):5:1–5:23, June 2016.
  • (26) J. Sun, X. Yu, G. Baciu, and M. Green. Template-based generation of road networks for virtual city modeling. In Proceedings of the ACM Symposium on Virtual Reality Software and Technology, VRST ’02, pages 33–40, New York, NY, USA, 2002. ACM.
  • (27) The City of London. London postal code system, 2016.
  • (28) S. Tokui, K. Oono, S. Hido, and J. Clayton. Chainer: a next-generation open source framework for deep learning. In Proceedings of Workshop on Machine Learning Systems (LearningSys) in The Twenty-ninth Annual Conference on Neural Information Processing Systems (NIPS), 2015.
  • (29) C. A. Vanegas, T. Kelly, B. Weber, J. Halatsch, D. G. Aliaga, and P. Muller. Procedural generation of parcels in urban modeling. Comput. Graph. Forum, 31(2pt3):681–690, May 2012.
  • (30) J. Wang, J. Song, M. Chen, and Z. Yang. Road network extraction: a neural-dynamic framework based on deep learning and a finite state machine. International Journal of Remote Sensing, 36(12):3144–3169, 2015.
  • (31) Y. Wang, X. Liu, H. Wei, G. Forman, and Y. Zhu. Crowdatlas: Self-updating maps for cloud and personal use. In Proceeding of the 11th Annual International Conference on Mobile Systems, Applications, and Services, MobiSys ’13, pages 469–470, New York, NY, USA, 2013. ACM.
  • (32) J. D. Wegner, J. A. Montoya-Zegarra, and K. Schindler. A higher-order crf model for road network extraction. In 2013 IEEE Conference on Computer Vision and Pattern Recognition, pages 1698–1705, June 2013.
  • (33) L. Xu, T. Jun, Y. Xiang, C. JianJie, and G. LiQian. The rapid method for road extraction from high-resolution satellite images based on usm algorithm. In 2012 International Conference on Image Analysis and Signal Processing, pages 1–6, Nov 2012.
  • (34) S. X. Yu and J. Shi. Multiclass spectral clustering. In Proceedings of the Ninth IEEE International Conference on Computer Vision - Volume 2, ICCV ’03, pages 313–, Washington, DC, USA, 2003. IEEE Computer Society.
  • (35) D. Zeng, T. Zhang, R. Fang, W. Shen, and Q. Tian. Neighborhood geometry based feature matching for geostationary satellite remote sensing image. Neurocomputing, 236:65 – 72, 2017. Good Practices in Multimedia Modeling.
  • (36) J. Zhao and S. You. Road network extraction from airborne lidar data using scene context. In 2012 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops, pages 9–16, June 2012.