Submatrix localization via message passing

by Bruce Hajek, et al.

The principal submatrix localization problem deals with recovering a K×K principal submatrix of elevated mean μ in a large n×n symmetric matrix subject to additive standard Gaussian noise. This problem serves as a prototypical example for community detection, in which the community corresponds to the support of the submatrix. The main result of this paper is that in the regime Ω(√(n)) ≤ K ≤ o(n), the support of the submatrix can be weakly recovered (with o(K) misclassification errors on average) by an optimized message passing algorithm if the signal-to-noise ratio λ = μ^2K^2/n exceeds 1/e. This extends a result by Deshpande and Montanari previously obtained for K = Θ(√(n)). In addition, the algorithm can be extended to provide exact recovery whenever it is information-theoretically possible, and it achieves the information limit of exact recovery as long as K ≥ (n/log n)(1/(8e) + o(1)). The total running time of the algorithm is O(n^2 log n). Another version of the submatrix localization problem, known as noisy biclustering, aims to recover a K_1×K_2 submatrix of elevated mean μ in a large n_1×n_2 Gaussian matrix. The optimized message passing algorithm and its analysis are adapted to the bicluster problem assuming Ω(√(n_i)) ≤ K_i ≤ o(n_i) and K_1 ≍ K_2. A sharp information-theoretic condition for the weak recovery of both clusters is also identified.
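
To make the model in the abstract concrete, the following is a minimal sketch that samples a planted instance and evaluates the signal-to-noise ratio λ = μ^2K^2/n against the 1/e threshold. It is not the paper's message passing algorithm. The function names (`snr`, `mu_at_threshold`, `planted_instance`) are hypothetical, and the choice to elevate the diagonal entries of the planted block is an assumption made only for illustration.

```python
import math
import random


def snr(mu, K, n):
    """Signal-to-noise ratio lambda = mu^2 K^2 / n, as defined in the abstract."""
    return mu ** 2 * K ** 2 / n


def mu_at_threshold(K, n):
    """Mean elevation mu at which lambda equals 1/e exactly."""
    return math.sqrt(n / math.e) / K


def planted_instance(n, K, mu, seed=0):
    """Sample a symmetric n x n matrix with standard Gaussian noise and
    mean mu on a K x K principal submatrix.

    Assumption: diagonal entries of the block are elevated as well.
    Returns the matrix (list of lists) and the planted index set.
    """
    rng = random.Random(seed)
    community = set(rng.sample(range(n), K))
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            mean = mu if (i in community and j in community) else 0.0
            W[i][j] = W[j][i] = rng.gauss(mean, 1.0)
    return W, community


if __name__ == "__main__":
    n, K, mu = 400, 40, 1.0
    W, community = planted_instance(n, K, mu)
    lam = snr(mu, K, n)
    print("lambda =", lam, "exceeds 1/e:", lam > 1 / math.e)
```

With n = 400, K = 40 and μ = 1, the ratio is λ = 1600/400 = 4, well above 1/e. Note that the threshold in the abstract is an asymptotic statement for Ω(√(n)) ≤ K ≤ o(n), so a single finite instance only illustrates the quantities involved.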



Recovering a Hidden Community Beyond the Spectral Limit in O(|E| log^*|V|) Time

Community detection is considered for a stochastic block model graph of ...

A Message Passing based Adaptive PDA Algorithm for Robust Radio-based Localization and Tracking

We present a message passing algorithm for localization and tracking in ...

Robust Group Synchronization via Cycle-Edge Message Passing

We propose a general framework for group synchronization with adversaria...

Inference via Message Passing on Partially Labeled Stochastic Block Models

We study the community detection and recovery problem in partially-label...

Statistical and computational thresholds for the planted k-densest sub-hypergraph problem

Recovering a planted signal perturbed by noise is a fundamental problem in...

Rank-one matrix estimation with groupwise heteroskedasticity

We study the problem of estimating a rank-one matrix from Gaussian obser...

Distributed Reconstruction of Noisy Pooled Data

In the pooled data problem we are given a set of n agents, each of which...