
Information Limits for Recovering a Hidden Community
We study the problem of recovering a hidden community of cardinality K f...

Hidden Hamiltonian Cycle Recovery via Linear Programming
We introduce the problem of hidden Hamiltonian cycle recovery, where the...

A Theory-Based Evaluation of Nearest Neighbor Models Put Into Practice
In the k-nearest neighborhood model (k-NN), we are given a set of points...

Direct Estimation of Information Divergence Using Nearest Neighbor Ratios
We propose a direct estimation method for Rényi and f-divergence measure...

Metric recovery from directed unweighted graphs
We analyze directed, unweighted graphs obtained from x_i∈R^d by connecti...

Pruning nearest neighbor cluster trees
Nearest neighbor (k-NN) graphs are widely used in machine learning and d...

Recovering a Single Community with Side Information
We study the effect of the quality and quantity of side information on t...

Consistent recovery threshold of hidden nearest neighbor graphs
Motivated by applications such as discovering strong ties in social networks and assembling genome subsequences in biology, we study the problem of recovering a hidden 2k-nearest neighbor (NN) graph in an n-vertex complete graph, whose edge weights are independent and distributed according to P_n for edges in the hidden 2k-NN graph and Q_n otherwise. The special case of Bernoulli distributions corresponds to a variant of the Watts-Strogatz small-world graph. We focus on two types of asymptotic recovery guarantees as n→∞: (1) exact recovery: all edges are classified correctly with probability tending to one; (2) almost exact recovery: the expected number of misclassified edges is o(nk). We show that the maximum likelihood estimator achieves (1) exact recovery for 2 ≤ k ≤ n^{o(1)} if lim inf 2α_n/log n > 1; (2) almost exact recovery for 1 ≤ k ≤ o(log n/log log n) if lim inf k D(P_n‖Q_n)/log n > 1, where α_n ≜ -2 log ∫ √(dP_n dQ_n) is the Rényi divergence of order 1/2 and D(P_n‖Q_n) is the Kullback-Leibler divergence. Under mild distributional assumptions, these conditions are shown to be information-theoretically necessary for any algorithm to succeed. A key challenge in the analysis is the enumeration of 2k-NN graphs that differ from the hidden one by a given number of edges.
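To make the two threshold quantities concrete, here is a minimal Python sketch for the Bernoulli special case mentioned above, assuming P_n = Bern(p) and Q_n = Bern(q) with 0 < q < 1. The function names and the example parameter values are illustrative assumptions, not taken from the paper. The stated conditions are asymptotic (lim inf as n→∞), so ratios evaluated at a single finite n are only indicative.

```python
import math


def renyi_half_bernoulli(p, q):
    """Renyi divergence of order 1/2 between Bern(p) and Bern(q).

    Uses alpha = -2 * log( sum_x sqrt(P(x) * Q(x)) ), the discrete form of
    -2 log integral sqrt(dP dQ).
    """
    affinity = math.sqrt(p * q) + math.sqrt((1 - p) * (1 - q))
    return -2.0 * math.log(affinity)


def kl_bernoulli(p, q):
    """Kullback-Leibler divergence D(Bern(p) || Bern(q)), assuming 0 < q < 1."""
    def term(a, b):
        # Convention: 0 * log(0 / b) = 0.
        return 0.0 if a == 0.0 else a * math.log(a / b)

    return term(p, q) + term(1 - p, 1 - q)


def recovery_ratios(n, k, p, q):
    """Finite-n values of the two ratios that appear in the lim inf conditions."""
    log_n = math.log(n)
    return {
        "exact": 2.0 * renyi_half_bernoulli(p, q) / log_n,
        "almost_exact": k * kl_bernoulli(p, q) / log_n,
    }


if __name__ == "__main__":
    # Hypothetical parameters chosen only for illustration.
    print(recovery_ratios(n=10_000, k=3, p=0.9, q=0.001))
```

A ratio above 1 at a given n suggests, but does not prove, that the corresponding sufficient condition is in its favorable regime; the guarantees themselves concern the limiting behavior of sequences P_n and Q_n.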