Unsupervised Learning via Network-Aware Embeddings

Data clustering, the task of grouping observations according to their similarity, is a key component of unsupervised learning, with real-world applications in diverse fields such as biology, medicine, and social science. In these fields, the data often comes with complex interdependencies between the dimensions of analysis: for instance, the various characteristics and opinions people can have live on a complex social network. Current clustering methods are ill-suited to tackle this complexity: deep learning can approximate these dependencies, but it cannot take their explicit map as the input of the analysis. In this paper, we aim to fix this blind spot in the unsupervised learning literature. We create network-aware embeddings by estimating the network distance between numeric node attributes via the generalized Euclidean distance. Unlike all methods in the literature that we know of, we do not cluster the nodes of the network, but rather its node attributes. In our experiments, we show that having these network embeddings is always beneficial for the learning task; that our method scales to large networks; and that we can provide actionable insights in applications in a variety of fields such as marketing, economics, and political science. Our method is fully open source, and data and code are available to reproduce all results in the paper.
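To make the idea of a network distance between node attributes concrete, here is a minimal sketch. The abstract does not spell out the formula, so the code assumes the usual form of the generalized Euclidean distance, sqrt((a - b)^T L^+ (a - b)), where a and b are numeric attribute vectors with one value per node and L^+ is the Moore-Penrose pseudoinverse of the graph Laplacian. The function names (`laplacian_pinv`, `attribute_distance_matrix`) are hypothetical and are not taken from the authors' released code.

```python
import numpy as np


def laplacian_pinv(adj):
    """Pseudoinverse of the combinatorial Laplacian L = D - A (assumed form)."""
    adj = np.asarray(adj, dtype=float)
    lap = np.diag(adj.sum(axis=1)) - adj
    return np.linalg.pinv(lap)


def attribute_distance_matrix(attrs, adj):
    """Pairwise network distance between node attributes.

    attrs: array of shape (n_nodes, n_attributes), one column per attribute.
    adj:   symmetric adjacency matrix of shape (n_nodes, n_nodes).
    Returns an (n_attributes, n_attributes) matrix whose (i, j) entry is
    sqrt((a_i - a_j)^T L^+ (a_i - a_j)).
    """
    attrs = np.asarray(attrs, dtype=float)
    lp = laplacian_pinv(adj)
    # Gram matrix under L^+; squared distances follow from the identity
    # (x - y)^T M (x - y) = x^T M x + y^T M y - 2 x^T M y for symmetric M.
    gram = attrs.T @ lp @ attrs
    diag = np.diag(gram)
    sq = diag[:, None] + diag[None, :] - 2.0 * gram
    # Clip tiny negative values caused by floating-point error.
    return np.sqrt(np.clip(sq, 0.0, None))


# Toy example: an unweighted path graph 0 - 1 - 2 - 3.
adj = np.array([
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
])
# Three attributes, each concentrated on a single node (nodes 0, 1, and 3).
attrs = np.array([
    [1, 0, 0],
    [0, 1, 0],
    [0, 0, 0],
    [0, 0, 1],
])
dist = attribute_distance_matrix(attrs, adj)
print(dist)
```

In this toy case each attribute sits on a single node, so the squared distance reduces to the effective resistance between the two nodes: the attributes on adjacent nodes 0 and 1 are at distance 1, while those on the endpoints 0 and 3 are at distance sqrt(3). The resulting matrix is indexed by attributes rather than nodes, which matches the abstract's point that the attributes, not the nodes, are what gets clustered; any clustering method that accepts a precomputed distance matrix could take it as input.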


