node2bits: Compact Time- and Attribute-aware Node Representations for User Stitching

by Di Jin, et al.

Identity stitching, the task of identifying and matching various online references (e.g., sessions over different devices and timespans) to the same user in real-world web services, is crucial for personalization and recommendations. However, traditional user stitching approaches, such as grouping or blocking, require quadratic pairwise comparisons between a massive number of user activities, which poses both computational and storage challenges. Recent works, which are often application-specific, heuristically seek to reduce the number of comparisons, but they suffer from low precision and recall. To solve the problem in an application-independent way, we take a heterogeneous network-based approach in which users (nodes) interact with content (e.g., sessions, websites), and may have attributes (e.g., location). We propose node2bits, an efficient framework that represents multi-dimensional features of node contexts with binary hashcodes. node2bits leverages feature-based temporal walks to encapsulate short- and long-term interactions between nodes in heterogeneous web networks, and adopts SimHash to obtain compact, binary representations and avoid the quadratic complexity of similarity search. Extensive experiments on large-scale real networks show that node2bits outperforms traditional techniques and existing works that generate real-valued embeddings by up to 5.16% in F1 score on user stitching, while taking only up to 1.56% as much storage.
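
To make the hashing step more concrete, here is a minimal sketch of generic random-hyperplane SimHash, assuming each node's context has already been summarized as a fixed-length numeric feature vector. It is not the authors' implementation: the function names (`make_hyperplanes`, `simhash`, `hamming`, `bucket_by_code`), the Gaussian hyperplanes, and the toy vectors are all illustrative assumptions. It shows how binary codes let candidate matches be found by grouping on the code instead of comparing every pair.

```python
import random
from collections import defaultdict


def make_hyperplanes(num_bits, dim, seed=0):
    """Draw num_bits random Gaussian hyperplanes in dim dimensions (assumed setup)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_bits)]


def simhash(vector, hyperplanes):
    """Return an integer whose bits are the signs of vector projected on each hyperplane."""
    bits = 0
    for plane in hyperplanes:
        dot = sum(p * v for p, v in zip(plane, vector))
        bits = (bits << 1) | (1 if dot >= 0 else 0)
    return bits


def hamming(a, b):
    """Number of differing bits between two hashcodes."""
    return bin(a ^ b).count("1")


def bucket_by_code(features, hyperplanes):
    """Group node ids by hashcode so only nodes sharing a bucket need comparing."""
    buckets = defaultdict(list)
    for node, vec in features.items():
        buckets[simhash(vec, hyperplanes)].append(node)
    return buckets


if __name__ == "__main__":
    planes = make_hyperplanes(num_bits=16, dim=3, seed=1)
    # Hypothetical feature vectors; u1 and u2 point in the same direction.
    features = {"u1": [1.0, 2.0, 0.5], "u2": [2.0, 4.0, 1.0], "u3": [-1.0, -2.0, -0.5]}
    for code, nodes in bucket_by_code(features, planes).items():
        print(format(code, "016b"), nodes)
```

Vectors pointing in similar directions tend to agree on most bits, so a small Hamming distance serves as a cheap proxy for high cosine similarity. In this sketch a single exact-match bucket is used; a practical system would more likely hash several bit subsets into separate tables to tolerate a few differing bits.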



Relation-aware Heterogeneous Graph for User Profiling

User profiling has long been an important problem that investigates user...

Heterogeneous Attributed Network Embedding with Graph Convolutional Networks

Network embedding which assigns nodes in networks to low-dimensional repr...

HAHE: Hierarchical Attentive Heterogeneous Information Network Embedding

Given the intractability of large scale HIN, network embedding which lea...

A Multi-Semantic Metapath Model for Large Scale Heterogeneous Network Representation Learning

Network Embedding has been widely studied to model and manage data in a ...

Search Efficient Binary Network Embedding

Traditional network embedding primarily focuses on learning a dense vect...

BL-MNE: Emerging Heterogeneous Social Network Embedding through Broad Learning with Aligned Autoencoder

Network embedding aims at projecting the network data into a low-dimensi...