Information Recovery from Pairwise Measurements
This paper is concerned with jointly recovering n node-variables { x_i}_1≤ i≤ n from a collection of pairwise difference measurements. Imagine we acquire a few observations taking the form of x_i-x_j; the observation pattern is represented by a measurement graph G with an edge set E such that x_i-x_j is observed if and only if (i,j)∈E. To account for noisy measurements in a general manner, we model the data acquisition process by a set of channels with given input/output transition measures. Employing information-theoretic tools applied to channel decoding problems, we develop a unified framework to characterize the fundamental recovery criterion, which accommodates general graph structures, alphabet sizes, and channel transition measures. In particular, our results isolate a family of minimumchannel divergence measures to characterize the degree of measurement corruption, which together with the size of the minimum cut of G dictates the feasibility of exact information recovery. For various homogeneous graphs, the recovery condition depends almost only on the edge sparsity of the measurement graph irrespective of other graphical metrics; alternatively, the minimum sample complexity required for these graphs scales like minimum sample complexity n n/Hel_1/2^ for certain information metric Hel_1/2^ defined in the main text, as long as the alphabet size is not super-polynomial in n. We apply our general theory to three concrete applications, including the stochastic block model, the outlier model, and the haplotype assembly problem. Our theory leads to order-wise tight recovery conditions for all these scenarios.
READ FULL TEXT