missSBM: An R Package for Handling Missing Values in the Stochastic Block Model

06/28/2019
by   Timothée Tabouy, et al.

The Stochastic Block Model (SBM) is a popular probabilistic model for random graphs. It is commonly used to cluster network data by grouping nodes that share similar connectivity patterns into blocks. When fitting an SBM to a partially observed network, it is important to account for the underlying process that generates the missing values; otherwise the inference may be biased. This paper introduces missSBM, an R package for fitting the SBM when the network is partially observed, i.e., when the adjacency matrix contains not only 1 or 0, encoding the presence or absence of an edge, but also NA, encoding missing information between pairs of nodes. It implements a series of algorithms for the binary SBM, with the possibility of accounting for covariates if needed, by performing variational inference for several sampling mechanisms, the methodology of which is detailed in Tabouy, Barbillon, and Chiquet (2019). Our implementation automatically explores different numbers of blocks and selects the most relevant one according to the Integrated Classification Likelihood (ICL) criterion. The ICL criterion can also help determine which sampling mechanism fits the data best. Finally, missSBM can be used to impute missing entries in the adjacency matrix. We illustrate the package on a network data set consisting of interactions between blogs sampled during the French presidential election in 2007.
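
To make the data format described in the abstract concrete, here is a minimal sketch in plain Python (not the package's R interface). It simulates a small binary SBM and then hides some dyads. The function names `simulate_sbm` and `hide_dyads` are hypothetical, `None` stands in for R's `NA`, and hiding each dyad independently with a fixed probability is assumed here only as one simple sampling mechanism, not necessarily one the package implements in this form.

```python
import random

def simulate_sbm(n, block_props, connect_probs, seed=0):
    """Draw block memberships and a symmetric binary adjacency matrix."""
    rng = random.Random(seed)
    # Each node is assigned to a block according to the block proportions.
    blocks = rng.choices(range(len(block_props)), weights=block_props, k=n)
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            # The edge probability depends only on the blocks of the two nodes.
            p = connect_probs[blocks[i]][blocks[j]]
            edge = 1 if rng.random() < p else 0
            adj[i][j] = adj[j][i] = edge
    return blocks, adj

def hide_dyads(adj, observe_prob, seed=1):
    """Return a copy of adj where each dyad is kept with probability observe_prob.

    Unobserved dyads are set to None (the stand-in for NA); the diagonal is left at 0.
    """
    rng = random.Random(seed)
    n = len(adj)
    observed = [row[:] for row in adj]
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() >= observe_prob:
                observed[i][j] = observed[j][i] = None
    return observed

# Two blocks of roughly equal size, dense within blocks and sparse between them.
blocks, adjacency = simulate_sbm(30, [0.5, 0.5], [[0.6, 0.05], [0.05, 0.5]])
partial = hide_dyads(adjacency, observe_prob=0.7)
n_missing = sum(v is None for row in partial for v in row) // 2
print("missing dyads:", n_missing)
```

A matrix like `partial` is the kind of input the abstract refers to: observed entries are 0 or 1, and the remaining entries carry no information about the edge.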

research
07/26/2018

Block models for multipartite networks. Applications in ecology and ethnobiology

Modeling relations between individuals is a classical question in social...
research
03/28/2019

Consistency and Asymptotic Normality of Stochastic Block Models Estimators from Sampled Data

Statistical analysis of network is an active research area and the liter...
research
05/12/2020

Functions and eigenvectors of partially known matrices with applications to network analysis

Matrix functions play an important role in applied mathematics. In netwo...
research
12/23/2019

Missing data analysis and imputation via latent Gaussian Markov random fields

In this paper we recast the problem of missing values in the covariates ...
research
02/16/2022

Self-Organizing Maps for Exploration of Partially Observed Data and Imputation of Missing Values

The self-organizing map is an unsupervised neural network which is widel...
research
09/19/2022

Embedded Topics in the Stochastic Block Model

Communication networks such as emails or social networks are now ubiquit...
research
11/05/2021

Optimality of variational inference for stochastic block model with missing links

Variational methods are extremely popular in the analysis of network dat...
