Technical Report: Graph-Structured Sparse Optimization for Connected Subgraph Detection
Structured sparse optimization is an important and challenging problem for analyzing high-dimensional data in a variety of applications such as bioinformatics, medical imaging, social networks, and astronomy. Although a number of structured sparsity models have been explored, such as trees, groups, clusters, and paths, connected subgraphs have been rarely explored in the current literature. One of the main technical challenges is that there is no structured sparsity-inducing norm that can directly model the space of connected subgraphs, and there is no exact implementation of a projection oracle for connected subgraphs due to its NP-hardness. In this paper, we explore efficient approximate projection oracles for connected subgraphs, and propose two new efficient algorithms, namely, Graph-IHT and Graph-GHTP, to optimize a generic nonlinear objective function subject to connectivity constraint on the support of the variables. Our proposed algorithms enjoy strong guarantees analogous to several current methods for sparsity-constrained optimization, such as Projected Gradient Descent (PGD), Approximate Model Iterative Hard Thresholding (AM-IHT), and Gradient Hard Thresholding Pursuit (GHTP) with respect to convergence rate and approximation accuracy. We apply our proposed algorithms to optimize several well-known graph scan statistics in several applications of connected subgraph detection as a case study, and the experimental results demonstrate that our proposed algorithms outperform state-of-the-art methods.
READ FULL TEXT