Statistical Problems with Planted Structures: Information-Theoretical and Computational Limits

05/31/2018
by Yihong Wu, et al.

Over the past few years, insights from computer science, statistical physics, and information theory have revealed phase transitions in a wide array of high-dimensional statistical problems at two distinct thresholds: one is the information-theoretical (IT) threshold, below which the observation is too noisy for inference of the ground-truth structure to be possible regardless of computational cost; the other is the computational threshold, above which inference can be performed efficiently, i.e., in time polynomial in the input size. In the intermediate regime, inference is information-theoretically possible but conjectured to be computationally hard. This article surveys the common techniques for determining the sharp IT and computational limits, using community detection and submatrix detection as illustrative examples. For IT limits, we discuss tools including the first- and second-moment methods for analyzing the maximum likelihood estimator, information-theoretic methods for proving impossibility results using rate-distortion theory, and methods originating from statistical physics such as the interpolation method. To investigate computational limits, we describe a common recipe for constructing a randomized polynomial-time reduction scheme that approximately maps instances of the planted clique problem to the problem of interest in total variation distance.
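
To make the notion of a planted structure concrete, the following is a minimal sketch, not taken from the article, of how a planted clique instance might be sampled: an Erdős–Rényi graph G(n, 1/2) in which a uniformly chosen set of k vertices is fully connected. The function names `planted_clique` and `edge_count`, the `seed` parameter, and the sizes n = 60 and k = 12 are illustrative assumptions.

```python
import random


def planted_clique(n, k, seed=0):
    """Sample G(n, 1/2) and plant a clique on k uniformly chosen vertices.

    Illustrative sketch only; names and defaults are assumptions.
    Returns the symmetric 0/1 adjacency matrix and the planted vertex set.
    """
    rng = random.Random(seed)
    clique = set(rng.sample(range(n), k))
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            # Pairs inside the planted set are always connected;
            # every other pair is an independent fair coin flip.
            if i in clique and j in clique:
                edge = 1
            else:
                edge = rng.randint(0, 1)
            adj[i][j] = adj[j][i] = edge
    return adj, clique


def edge_count(adj):
    """Total number of edges, a simple global test statistic."""
    return sum(sum(row) for row in adj) // 2


adj, clique = planted_clique(n=60, k=12)
print("planted vertices:", sorted(clique))
print("edges:", edge_count(adj))
```

Planting the clique raises the expected edge count by k(k-1)/4 relative to G(n, 1/2), so the total edge count alone separates the two models only when k is large; the regime of smaller k is where the gap between what is statistically possible and what is efficiently computable is conjectured to arise.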
