Consistent recovery threshold of hidden nearest neighbor graphs

11/18/2019
by Jian Ding, et al.

Motivated by applications such as discovering strong ties in social networks and assembling genome subsequences in biology, we study the problem of recovering a hidden 2k-nearest neighbor (NN) graph in an n-vertex complete graph, whose edge weights are independent and distributed according to P_n for edges in the hidden 2k-NN graph and Q_n otherwise. The special case of Bernoulli distributions corresponds to a variant of the Watts-Strogatz small-world graph. We focus on two types of asymptotic recovery guarantees as n → ∞: (1) exact recovery: all edges are classified correctly with probability tending to one; (2) almost exact recovery: the expected number of misclassified edges is o(nk). We show that the maximum likelihood estimator achieves (1) exact recovery for 2 < k < n^{o(1)} if lim inf 2α_n / log n > 1; (2) almost exact recovery for 1 < k < o(log n / log log n) if lim inf k D(P_n || Q_n) / log n > 1, where α_n := -2 log ∫ √(dP_n dQ_n) is the Rényi divergence of order 1/2 and D(P_n || Q_n) is the Kullback-Leibler divergence. Under mild distributional assumptions, these conditions are shown to be information-theoretically necessary for any algorithm to succeed. A key challenge in the analysis is the enumeration of 2k-NN graphs that differ from the hidden one by a given number of edges.
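To make the two threshold quantities concrete, here is a minimal sketch for the Bernoulli special case, assuming P_n = Bernoulli(p) and Q_n = Bernoulli(q) with 0 < q < 1. The function names and the numeric values of n, k, p, q are hypothetical and chosen only for illustration. The conditions in the abstract are lim inf statements as n → ∞, so the finite-n ratios computed here are a heuristic reading, not a recovery guarantee.

```python
import math


def renyi_half_bernoulli(p: float, q: float) -> float:
    """Renyi divergence of order 1/2 between Bernoulli(p) and Bernoulli(q).

    Uses alpha = -2 * log( sum_x sqrt(P(x) * Q(x)) ), the discrete form of
    alpha_n = -2 log integral sqrt(dP_n dQ_n).
    """
    bhattacharyya = math.sqrt(p * q) + math.sqrt((1.0 - p) * (1.0 - q))
    return -2.0 * math.log(bhattacharyya)


def kl_bernoulli(p: float, q: float) -> float:
    """Kullback-Leibler divergence D(Bernoulli(p) || Bernoulli(q)), 0 < q < 1."""

    def term(a: float, b: float) -> float:
        # Convention: 0 * log(0 / b) = 0.
        return 0.0 if a == 0.0 else a * math.log(a / b)

    return term(p, q) + term(1.0 - p, 1.0 - q)


def threshold_ratios(n: int, k: int, p: float, q: float) -> dict:
    """Finite-n versions of the two ratios that the abstract compares to 1."""
    log_n = math.log(n)
    return {
        "exact": 2.0 * renyi_half_bernoulli(p, q) / log_n,
        "almost_exact": k * kl_bernoulli(p, q) / log_n,
    }


if __name__ == "__main__":
    # Hypothetical parameters, for illustration only.
    ratios = threshold_ratios(n=10_000, k=3, p=0.9, q=0.001)
    for name, value in ratios.items():
        print(f"{name}: ratio = {value:.3f}, above 1: {value > 1.0}")
```

Keeping the two ratios separate mirrors the abstract: exact recovery is governed by the Rényi divergence alone, while almost exact recovery depends on the product of k and the Kullback-Leibler divergence.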

research
09/25/2015

Information Limits for Recovering a Hidden Community

We study the problem of recovering a hidden community of cardinality K f...
research
04/15/2018

Hidden Hamiltonian Cycle Recovery via Linear Programming

We introduce the problem of hidden Hamiltonian cycle recovery, where the...
research
10/11/2018

A Theory-Based Evaluation of Nearest Neighbor Models Put Into Practice

In the k-nearest neighborhood model (k-NN), we are given a set of points...
research
02/22/2022

Random Graph Matching in Geometric Models: the Case of Complete Graphs

This paper studies the problem of matching two complete graphs with edge...
research
02/17/2017

Direct Estimation of Information Divergence Using Nearest Neighbor Ratios

We propose a direct estimation method for Rényi and f-divergence measure...
research
09/07/2022

Planted matching problems on random hypergraphs

We consider the problem of inferring a matching hidden in a weighted ran...
research
09/05/2018

Recovering a Single Community with Side Information

We study the effect of the quality and quantity of side information on t...
