Mining Points of Interest via Address Embeddings: An Unsupervised Approach

09/09/2021
by Abhinav Ganesan, et al.

Digital maps are widely used for exploring places that users are interested in, commonly referred to as points of interest (PoIs). In online food delivery platforms, PoIs can represent any major private compound from which customers order, such as hospitals, residential complexes, office complexes, educational institutes, and hostels. In this work, we propose an end-to-end unsupervised system design for obtaining polygon representations of PoIs (PoI polygons) from address locations and address texts. We preprocess the address texts using locality names and generate embeddings for them using a deep learning-based architecture, viz. RoBERTa, trained on our internal address dataset. The PoI candidates are identified by jointly clustering the anonymised customer phone GPS locations (obtained during address onboarding) and the embeddings of the address texts. The final list of PoI polygons is obtained from these PoI candidates using novel post-processing steps. This algorithm identified 74.8% more PoIs than those obtained using the Mummidi-Krumm baseline algorithm run on our internal dataset. The proposed algorithm achieves a median area precision of 98% and a median area recall of 8%. To improve the recall of the algorithmic polygons, we post-process them using building footprint polygons from the OpenStreetMap (OSM) database. The post-processing algorithm involves reshaping the algorithmic polygon using intersecting polygons and closed private roads from the OSM database, and accounting for intersection with public roads in the OSM database. We achieve a median area recall of 70% on these post-processed polygons.
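
The abstract reports median area precision and median area recall without defining them. The following is a minimal sketch, assuming the usual overlap-based definitions (precision is the overlap area divided by the algorithmic polygon's area; recall is the overlap area divided by the reference polygon's area). For brevity it uses axis-aligned rectangles in place of real polygons. The function names and sample coordinates are hypothetical and are not taken from the paper.

```python
from statistics import median

# Assumption: each shape is an axis-aligned rectangle
# (min_x, min_y, max_x, max_y). Real PoI polygons would need a
# general polygon-intersection routine instead.

def area(rect):
    """Area of a rectangle; inverted (empty) rectangles have area 0."""
    width = max(0.0, rect[2] - rect[0])
    height = max(0.0, rect[3] - rect[1])
    return width * height

def intersection(a, b):
    """Overlap of two rectangles (may be inverted if they do not overlap)."""
    return (max(a[0], b[0]), max(a[1], b[1]),
            min(a[2], b[2]), min(a[3], b[3]))

def area_precision_recall(predicted, reference):
    """Assumed definitions: overlap/predicted area and overlap/reference area."""
    if area(predicted) == 0 or area(reference) == 0:
        raise ValueError("both shapes must have non-zero area")
    overlap = area(intersection(predicted, reference))
    return overlap / area(predicted), overlap / area(reference)

# Hypothetical (predicted, reference) pairs, for illustration only.
pairs = [
    ((2, 2, 4, 4), (0, 0, 10, 10)),  # small box fully inside the reference
    ((1, 1, 3, 3), (0, 0, 5, 4)),    # fully inside a smaller reference
    ((0, 0, 2, 2), (1, 0, 5, 5)),    # partial overlap
]
scores = [area_precision_recall(p, r) for p, r in pairs]
median_precision = median(s[0] for s in scores)
median_recall = median(s[1] for s in scores)
print(median_precision, median_recall)
```

Under these assumed definitions, a small algorithmic polygon lying entirely inside the true PoI scores high on precision and low on recall. That matches the pattern reported before OSM post-processing.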
