Reflexive Regular Equivalence for Bipartite Data

02/16/2017
by   Aaron Gerow, et al.

Bipartite data is common in data engineering and brings unique challenges, particularly for clustering tasks that impose strong structural assumptions. This work presents an unsupervised method for assessing similarity in bipartite data. Like some co-clustering methods, it is based on regular equivalence in graphs. The algorithm uses spectral properties of a bipartite adjacency matrix to estimate similarity in both dimensions. The method is reflexive in that similarity in one dimension is used to inform similarity in the other. Reflexive regular equivalence can also use the structure of transitivities (in a network sense), the contribution of which is controlled by the algorithm's only free parameter, α. The method is completely unsupervised and can be used to validate assumptions of co-similarity, which are required but often untested in co-clustering analyses. Three variants of the method with different normalizations are tested on synthetic data. The method is found to be robust to noise and well suited to asymmetric co-similar structure, making it particularly informative for cluster analysis and recommendation in bipartite data of unknown structure. In experiments, the convergence and speed of the algorithm are found to be stable across different levels of noise. Real-world data from a network of malaria genes are analyzed, where the similarity produced by the reflexive method is shown to outperform other measures at correctly classifying genes.
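The reflexive idea described above, where similarity among rows informs similarity among columns and vice versa, can be illustrated with a minimal sketch. This is not the authors' algorithm: the abstract does not give the update rule, so the form used here (rows become similar when they connect to similar columns, and the reverse), the Frobenius normalization, the stopping rule, and the name `reflexive_similarity` are all assumptions made for illustration. The transitivity term controlled by α is not modelled.

```python
import numpy as np


def reflexive_similarity(A, n_iter=100, tol=1e-9):
    """Illustrative sketch only (assumed update rule, not the paper's).

    A is an (n_rows x n_cols) bipartite adjacency matrix. Returns a
    row-by-row and a column-by-column similarity matrix.
    """
    A = np.asarray(A, dtype=float)
    n_rows, n_cols = A.shape

    # Start from the assumption that each node is similar only to itself.
    S_rows = np.eye(n_rows)
    S_cols = np.eye(n_cols)

    for _ in range(n_iter):
        # Rows are similar if they link to similar columns (assumed form).
        new_rows = A @ S_cols @ A.T
        new_rows /= np.linalg.norm(new_rows) or 1.0  # Frobenius norm; assumption

        # Columns are similar if they are linked from similar rows.
        new_cols = A.T @ S_rows @ A
        new_cols /= np.linalg.norm(new_cols) or 1.0

        # Largest element-wise change across both matrices.
        delta = max(
            np.abs(new_rows - S_rows).max(),
            np.abs(new_cols - S_cols).max(),
        )
        S_rows, S_cols = new_rows, new_cols
        if delta < tol:
            break

    return S_rows, S_cols


if __name__ == "__main__":
    # Toy bipartite graph with two disconnected, identical blocks.
    A = np.array([
        [1, 1, 0, 0],
        [1, 1, 0, 0],
        [0, 0, 1, 1],
        [0, 0, 1, 1],
    ])
    S_rows, S_cols = reflexive_similarity(A)
    print(np.round(S_rows, 3))
```

On the toy matrix, rows in the same block end up with positive similarity and rows in different blocks with zero similarity. The paper's three normalization variants and its handling of α may behave differently from this simplified scheme.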


research
05/02/2018

A spectral version of the Moore problem for bipartite regular graphs

Let b(k,θ) be the maximum order of a connected bipartite k-regular graph...
research
04/14/2023

Strong Consistency Guarantees for Clustering High-Dimensional Bipartite Graphs with the Spectral Method

In this work, we focus on the Bipartite Stochastic Block Model (BiSBM), ...
research
04/02/2020

Motif-Based Spectral Clustering of Weighted Directed Networks

Clustering is an essential technique for network analysis, with applicat...
research
02/17/2022

Occupation similarity through bipartite graphs

Similarity between occupations is a crucial piece of information when ma...
research
07/05/2014

Homophilic Clustering by Locally Asymmetric Geometry

Clustering is indispensable for data analysis in many scientific discipl...
research
10/29/2022

A Two Step Approach to Weighted Bipartite Link Recommendations

Many real world person-person or person-product relationships can be mod...
research
06/24/2021

Fund2Vec: Mutual Funds Similarity using Graph Learning

Identifying similar mutual funds with respect to the underlying portfoli...
