Learning Neighborhood Representation from Multi-Modal Multi-Graph: Image, Text, Mobility Graph and Beyond

05/06/2021
by Tianyuan Huang, et al.

Recent urbanization has coincided with the enrichment of geotagged data, such as street view imagery and points of interest (POIs). Region embeddings enhanced by these richer data modalities have enabled researchers and city administrators to better understand the built environment, socioeconomics, and the dynamics of cities. While some efforts have been made to use multi-modal inputs simultaneously, existing methods can be improved by incorporating different measures of 'proximity' in the same embedding space, leveraging not only the data that characterize the regions (e.g., street view, local business patterns) but also those that depict the relationships between regions (e.g., trips, road networks). To this end, we propose a novel approach that integrates multi-modal geotagged inputs as either node or edge features of a multi-graph, based on their relations with the neighborhood region (e.g., tiles, census blocks, ZIP code regions). We then learn the neighborhood representation from the multi-graph using a contrastive-sampling scheme. Specifically, we use street view images and POI features to characterize neighborhoods (nodes) and human mobility to characterize the relationships between neighborhoods (directed edges). We show the effectiveness of the proposed method with quantitative downstream tasks as well as qualitative analysis of the embedding space: the embeddings we trained outperform those using only unimodal data as regional inputs.
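
As a rough illustration of contrastive sampling on a directed mobility graph, the sketch below draws (anchor, positive, negative) triplets from trip edges and scores neighborhood embeddings with a margin-based triplet loss. It is not the authors' implementation: the abstract does not specify the loss, the sampler, or the encoder, so the triplet loss, the uniform negative sampling, and the names `sample_triplets` and `triplet_loss` are all assumptions made for illustration.

```python
import numpy as np


def sample_triplets(edges, num_nodes, rng):
    """Build (anchor, positive, negative) triplets from directed edges.

    Assumption: a trip edge (src, dst) marks dst as 'close' to src, and any
    node that src has no outgoing edge to can serve as a negative.
    """
    neighbors = {}
    for src, dst in edges:
        neighbors.setdefault(src, set()).add(dst)

    triplets = []
    for src, dst in edges:
        candidates = [
            n for n in range(num_nodes)
            if n != src and n not in neighbors[src]
        ]
        if candidates:  # skip anchors that are connected to every other node
            triplets.append((src, dst, int(rng.choice(candidates))))
    return triplets


def triplet_loss(embeddings, triplets, margin=1.0):
    """Mean hinge loss max(0, d(a, p) - d(a, n) + margin), Euclidean distance."""
    anchors = embeddings[[t[0] for t in triplets]]
    positives = embeddings[[t[1] for t in triplets]]
    negatives = embeddings[[t[2] for t in triplets]]
    d_pos = np.linalg.norm(anchors - positives, axis=1)
    d_neg = np.linalg.norm(anchors - negatives, axis=1)
    return float(np.mean(np.maximum(0.0, d_pos - d_neg + margin)))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Hypothetical toy graph: 4 neighborhoods, 3 directed trip edges.
    edges = [(0, 1), (1, 2), (2, 0)]
    # Hypothetical node features, standing in for image and POI features.
    features = rng.normal(size=(4, 8))
    # Hypothetical linear encoder mapping features to 2-D embeddings.
    weights = rng.normal(size=(8, 2))
    embeddings = features @ weights
    triplets = sample_triplets(edges, num_nodes=4, rng=rng)
    print(triplets, triplet_loss(embeddings, triplets))
```

In a full pipeline the encoder weights would be trained to minimize this loss, so that neighborhoods linked by trips end up closer in the embedding space than unlinked ones; node-level modalities such as street view and POI features could be handled with analogous triplets.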

research
01/29/2020

Urban2Vec: Incorporating Street View Imagery and POIs for Multi-Modal Urban Neighborhood Embedding

Understanding intrinsic patterns and predicting spatiotemporal character...
research
01/04/2023

Detecting Neighborhood Gentrification at Scale via Street-level Visual Data

Neighborhood gentrification plays a significant role in shaping the soci...
research
08/20/2018

Learning from #Barcelona Instagram data what Locals and Tourists post about its Neighbourhoods

Massive tourism is becoming a big problem for some cities, such as Barce...
research
08/22/2023

Ceci n'est pas une pomme: Adversarial Illusions in Multi-Modal Embeddings

Multi-modal encoders map images, sounds, texts, videos, etc. into a sing...
research
03/08/2022

Multi-Modal Mixup for Robust Fine-tuning

Pre-trained large-scale models provide a transferable embedding, and the...
research
06/07/2023

On the Generalization of Multi-modal Contrastive Learning

Multi-modal contrastive learning (MMCL) has recently garnered considerab...
research
07/26/2023

Plug and Pray: Exploiting off-the-shelf components of Multi-Modal Models

The rapid growth and increasing popularity of incorporating additional m...
