Link Prediction in the Stochastic Block Model with Outliers

by Solenne Gaucher, et al.

The Stochastic Block Model is a popular model for network analysis in the presence of community structure. In many applications, however, the assumptions underlying this classical model are violated by the behaviour of a small number of outlier nodes, such as hubs, nodes with mixed membership profiles, or corrupted nodes. In addition, real-life networks are likely to be incomplete because of non-response or machine failures. We introduce a new algorithm to estimate the connection probabilities in a network, which is robust to both outlier nodes and missing observations. Under fairly general assumptions, this method detects the outliers and achieves the best known error for the estimation of connection probabilities at polynomial computational cost. In addition, we prove sub-linear convergence of our algorithm. We provide a simulation study which demonstrates the good behaviour of the method in terms of outlier selection and prediction of the missing links.
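To make the setting concrete, the following is a minimal Python sketch of the kind of data the abstract describes: a network whose inlier nodes follow a stochastic block model, a few outlier nodes, and a mask of observed entries. It is not the algorithm of the paper. Every name and parameter here (`simulate_sbm_with_outliers`, `block_average_estimate`, `hub_prob`, `obs_prob`) is hypothetical, the outliers are modeled as hubs purely for illustration, and the estimator is a naive oracle baseline that assumes the community labels and the outlier set are already known.

```python
import random


def simulate_sbm_with_outliers(n_inliers, n_outliers, block_probs,
                               hub_prob, obs_prob, seed=0):
    """Sample a symmetric adjacency matrix and a symmetric observation mask.

    Inliers (indices 0..n_inliers-1) follow a stochastic block model with
    connection probabilities block_probs[a][b]. Outliers are modeled as hubs:
    any pair involving an outlier is linked with probability hub_prob
    (an illustrative assumption, not taken from the paper).
    """
    rng = random.Random(seed)
    k = len(block_probs)
    n = n_inliers + n_outliers
    labels = [rng.randrange(k) for _ in range(n_inliers)]
    adj = [[0] * n for _ in range(n)]
    mask = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if i < n_inliers and j < n_inliers:
                p = block_probs[labels[i]][labels[j]]
            else:
                p = hub_prob  # at least one endpoint is an outlier
            adj[i][j] = adj[j][i] = int(rng.random() < p)
            # mask[i][j] == 1 means the pair (i, j) was observed
            mask[i][j] = mask[j][i] = int(rng.random() < obs_prob)
    return adj, mask, labels


def block_average_estimate(adj, mask, labels, k):
    """Oracle baseline: average the observed entries within each block pair.

    Uses only inlier pairs and the true labels, so it shows what
    "estimating connection probabilities" means rather than how to do it
    when labels and outliers are unknown.
    """
    edges = [[0] * k for _ in range(k)]
    counts = [[0] * k for _ in range(k)]
    n_in = len(labels)
    for i in range(n_in):
        for j in range(i + 1, n_in):
            if not mask[i][j]:
                continue  # unobserved pair: carries no information
            a, b = labels[i], labels[j]
            edges[a][b] += adj[i][j]
            counts[a][b] += 1
            if a != b:
                edges[b][a] += adj[i][j]
                counts[b][a] += 1
    return [[edges[a][b] / counts[a][b] if counts[a][b] else None
             for b in range(k)] for a in range(k)]
```

Calling `simulate_sbm_with_outliers(200, 5, [[0.8, 0.1], [0.1, 0.8]], 0.9, 0.7)` and passing the result to `block_average_estimate` gives block averages close to the chosen probabilities. The difficulty addressed in the abstract is that, in practice, neither the labels nor the identity of the outliers is available.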



Robust and computationally feasible community detection in the presence of arbitrary outlier nodes

Community detection, which aims to cluster N nodes in a given graph into...

Spectral clustering in the dynamic stochastic block model

In the present paper, we studied a Dynamic Stochastic Block Model (DSBM)...

Bipartite mixed membership stochastic blockmodel

Mixed membership problem for undirected network has been well studied in...

An Approach for Link Prediction in Directed Complex Networks based on Asymmetric Similarity-Popularity

Complex networks are graphs representing real-life systems that exhibit ...

Simple robust genomic prediction and outlier detection for a multi-environmental field trial

The aim of plant breeding trials is often to identify germplasms that ar...

Detecting Network Soft-failures with the Network Link Outlier Factor (NLOF)

In this paper, we describe and experimentally evaluate the performance o...

On the impact of outliers in loss reserving

The sensitivity of loss reserving techniques to outliers in the data or ...