Tensor Clustering with Planted Structures: Statistical Optimality and Computational Limits

05/21/2020
by Yuetian Luo, et al.

This paper studies the statistical and computational limits of high-order clustering with planted structures. We focus on two clustering models, constant high-order clustering (CHC) and rank-one higher-order clustering (ROHC), and study the methods and theory for testing whether a cluster exists (detection) and identifying the support of the cluster (recovery). Specifically, we identify the sharp boundaries of the signal-to-noise ratio for which CHC and ROHC detection/recovery are statistically possible. We also establish tight computational thresholds: when the signal-to-noise ratio is below these thresholds, we prove that polynomial-time algorithms cannot solve these problems under the computational hardness conjectures of hypergraphic planted clique (HPC) detection and hypergraphic planted dense subgraph (HPDS) recovery. We also propose polynomial-time tensor algorithms that achieve reliable detection and recovery when the signal-to-noise ratio is above these thresholds. Both sparsity and tensor structure give rise to computational barriers in high-order tensor clustering. The interplay between them results in significant differences between high-order tensor clustering and matrix clustering in the literature with respect to statistical and computational phase transition diagrams, algorithmic approaches, hardness conjectures, and proof techniques. To the best of our knowledge, we are the first to give a thorough characterization of the statistical and computational trade-off for such a double computational-barrier problem. Finally, we provide evidence for the computational hardness conjectures of HPC detection and HPDS recovery.
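
To make the detection setting above concrete, here is a minimal sketch of a planted constant cluster in a noisy tensor. The abstract does not spell out the model, so everything below is an assumption for illustration: an order-3 tensor, standard Gaussian noise, a cube-shaped planted block with a constant elevation `lam`, and a simple normalized-sum statistic. The names `planted_constant_tensor` and `sum_statistic` are hypothetical, and this is not presented as the paper's algorithm.

```python
import numpy as np


def planted_constant_tensor(n, k, lam, rng):
    """Return an n x n x n tensor of standard Gaussian noise whose entries
    on a random k x k x k block are raised by lam (assumed model)."""
    noise = rng.standard_normal((n, n, n))
    # One random index set of size k per mode; their product is the planted block.
    supports = [rng.choice(n, size=k, replace=False) for _ in range(3)]
    signal = np.zeros((n, n, n))
    signal[np.ix_(*supports)] = lam
    return noise + signal, supports


def sum_statistic(tensor):
    """Sum of all entries divided by sqrt(number of entries).
    Under pure standard Gaussian noise this is a standard normal variable."""
    return tensor.sum() / np.sqrt(tensor.size)


n, k, lam = 30, 10, 1.0
# Reusing the seed gives the same noise and supports, so the two tensors
# differ only by the planted block.
with_cluster, supports = planted_constant_tensor(n, k, lam, np.random.default_rng(0))
without_cluster, _ = planted_constant_tensor(n, k, 0.0, np.random.default_rng(0))

# The planted block shifts the statistic by lam * k**3 / sqrt(n**3).
shift = sum_statistic(with_cluster) - sum_statistic(without_cluster)
```

In this toy setup the shift of the statistic grows with the block size and the elevation, which is the sense in which a signal-to-noise ratio governs whether a cluster can be detected; the thresholds themselves are the subject of the paper and are not reproduced here.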

12/18/2020

Exact Clustering in Tensor Block Model: Statistical Optimality and Computational Limit

High-order clustering aims to identify heterogeneous substructure in mul...
09/12/2020

Open Problem: Average-Case Hardness of Hypergraphic Planted Clique Detection

We note the significance of hypergraphic planted clique (HPC) detection ...
02/06/2015

Computational and Statistical Boundaries for Submatrix Localization in a Large Noisy Matrix

The interplay between computational efficiency and statistical accuracy ...
08/02/2018

Algorithmic thresholds for tensor PCA

We study the algorithmic thresholds for principal component analysis of ...
03/08/2017

Tensor SVD: Statistical and Computational Limits

In this paper, we propose a general framework for tensor singular value ...
01/26/2021

Computational phase transitions in sparse planted problems?

In recent times the cavity method, a statistical physics-inspired heuris...
03/22/2021

Mathematical Theory of Computational Resolution Limit in Multi-dimensions

Resolving a linear combination of point sources from their band-limited ...