Submatrix localization via message passing

10/30/2015 ∙ by Bruce Hajek, et al. ∙ 0

The principal submatrix localization problem deals with recovering a K× K principal submatrix of elevated mean μ in a large n× n symmetric matrix subject to additive standard Gaussian noise. This problem serves as a prototypical example for community detection, in which the community corresponds to the support of the submatrix. The main result of this paper is that in the regime Ω(√(n)) ≤ K ≤ o(n), the support of the submatrix can be weakly recovered (with o(K) misclassification errors on average) by an optimized message passing algorithm if λ = μ^2K^2/n, the signal-to-noise ratio, exceeds 1/e. This extends a result by Deshpande and Montanari previously obtained for K=Θ(√(n)). In addition, the algorithm can be extended to provide exact recovery whenever information-theoretically possible and achieve the information limit of exact recovery as long as K ≥n/ n (1/8e + o(1)). The total running time of the algorithm is O(n^2 n). Another version of the submatrix localization problem, known as noisy biclustering, aims to recover a K_1× K_2 submatrix of elevated mean μ in a large n_1× n_2 Gaussian matrix. The optimized message passing algorithm and its analysis are adapted to the bicluster problem assuming Ω(√(n_i)) ≤ K_i ≤ o(n_i) and K_1 K_2. A sharp information-theoretic condition for the weak recovery of both clusters is also identified.



There are no comments yet.


page 1

page 2

page 3

page 4

This week in AI

Get the week's most popular data science and artificial intelligence research sent straight to your inbox every Saturday.