Chi-squared Amplification: Identifying Hidden Hubs

08/12/2016
by   Ravi Kannan, et al.
0

We consider the following general hidden hubs model: an n × n random matrix A with a subset S of k special rows (hubs): entries in rows outside S are generated from the probability distribution p_0 ∼ N(0,σ_0^2); for each row in S, some k of its entries are generated from p_1 ∼ N(0,σ_1^2), σ_1>σ_0, and the rest of the entries from p_0. The problem is to identify the high-degree hubs efficiently. This model includes and significantly generalizes the planted Gaussian Submatrix Model, where the special entries are all in a k × k submatrix. There are two well-known barriers: if k≥ c√(n n), just the row sums are sufficient to find S in the general model. For the submatrix problem, this can be improved by a √( n) factor to k > c√(n) by spectral methods or combinatorial methods. In the variant with p_0=± 1 (with probability 1/2 each) and p_1≡ 1, neither barrier has been broken. We give a polynomial-time algorithm to identify all the hidden hubs with high probability for k > n^0.5-δ for some δ >0, when σ_1^2>2σ_0^2. The algorithm extends to the setting where planted entries might have different variances each at least as large as σ_1^2. We also show a nearly matching lower bound: for σ_1^2 < 2σ_0^2, there is no polynomial-time Statistical Query algorithm for distinguishing between a matrix whose entries are all from N(0,σ_0^2) and a matrix with k=n^0.5-δ hidden hubs for any δ >0. The lower bound as well as the algorithm are related to whether the chi-squared distance of the two distributions diverges. At the critical value σ_1^2=2σ_0^2, we show that the general hidden hubs problem can be solved for k≥ c√(n)( n)^1/4, improving on the naive row sum-based method.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
02/23/2015

Improved Sum-of-Squares Lower Bounds for Hidden Clique and Hidden Submatrix Problems

Given a large data matrix A∈R^n× n, we consider the problem of determini...
research
02/12/2020

Complexity of Combinatorial Matrix Completion With Diameter Constraints

We thoroughly study a novel and still basic combinatorial matrix complet...
research
06/11/2023

Detection and Recovery of Hidden Submatrices

In this paper, we study the problems of detection and recovery of hidden...
research
04/29/2018

A QPTAS for Gapless MEC

We consider the problem Minimum Error Correction (MEC). A MEC instance i...
research
07/10/2019

Efficient Gauss Elimination for Near-Quadratic Matrices with One Short Random Block per Row, with Applications

In this paper we identify a new class of sparse near-quadratic random Bo...
research
07/06/2021

Distance Estimation Between Unknown Matrices Using Sublinear Projections on Hamming Cube

Using geometric techniques like projection and dimensionality reduction,...
research
10/20/2017

The satisfiability threshold for random linear equations

Let A be a random m× n matrix over the finite field F_q with precisely k...

Please sign up or login with your details

Forgot password? Click here to reset