Stay on path: PCA along graph paths

06/08/2015
by Megasthenis Asteris, et al.

We introduce a variant of (sparse) PCA in which the set of feasible support sets is determined by a graph. In particular, we consider the following setting: given a directed acyclic graph G on p vertices corresponding to variables, the non-zero entries of the extracted principal component must coincide with vertices lying along a path in G. From a statistical perspective, information on the underlying network may potentially reduce the number of observations required to recover the population principal component. We consider the canonical estimator which optimally exploits the prior knowledge by solving a non-convex quadratic maximization on the empirical covariance. We introduce a simple network and analyze the estimator under the spiked covariance model. We show that side information potentially improves the statistical complexity. We propose two algorithms to approximate the solution of the constrained quadratic maximization, and recover a component with the desired properties. We empirically evaluate our schemes on synthetic and real datasets.
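
To make the constrained quadratic maximization concrete, here is a minimal sketch in Python. It assumes a tiny, hand-made DAG in which every vertex is a variable and the feasible supports are the paths from a chosen source vertex to a chosen target vertex. It enumerates every such path and takes the leading eigenvector of the covariance restricted to that path. This exhaustive search is only an illustration of the objective, not either of the two approximation algorithms the abstract mentions. The names `all_paths` and `path_pca`, the example graph, and the spike strength are all hypothetical.

```python
import numpy as np

def all_paths(dag, source, target):
    """Yield every directed path from source to target (iterative DFS)."""
    stack = [(source, [source])]
    while stack:
        node, path = stack.pop()
        if node == target:
            yield path
            continue
        for nxt in dag.get(node, []):
            stack.append((nxt, path + [nxt]))

def path_pca(cov, dag, source, target):
    """Maximize x^T cov x over unit vectors supported on one source-target path.

    Brute force: for each path, the best vector supported on it is the
    leading eigenvector of the corresponding principal submatrix.
    """
    best_val, best_path, best_vec = -np.inf, None, None
    for path in all_paths(dag, source, target):
        idx = np.array(path)
        sub = cov[np.ix_(idx, idx)]        # covariance restricted to the path
        vals, vecs = np.linalg.eigh(sub)   # eigenvalues in ascending order
        if vals[-1] > best_val:
            x = np.zeros(cov.shape[0])
            x[idx] = vecs[:, -1]           # embed back into p dimensions
            best_val, best_path, best_vec = vals[-1], path, x
    return best_val, best_path, best_vec

# Hypothetical DAG on p = 6 variables: 0 is the source, 5 is the target.
dag = {0: [1, 2], 1: [3, 4], 2: [3, 4], 3: [5], 4: [5]}

# Spiked population covariance I + beta * x x^T, with x supported on 0-2-3-5.
true_path = [0, 2, 3, 5]
x_true = np.zeros(6)
x_true[true_path] = 0.5                    # unit norm: 4 * 0.25 = 1
beta = 2.0                                 # assumed spike strength
cov = np.eye(6) + beta * np.outer(x_true, x_true)

best_val, best_path, best_vec = path_pca(cov, dag, source=0, target=5)
print(best_path, best_val)
```

The number of paths grows exponentially with the depth of the graph, so enumeration like this is only practical for toy instances; that cost is the reason approximation schemes are needed at all.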

research
08/18/2008

Decomposable Principal Component Analysis

We consider principal component analysis (PCA) in decomposable Gaussian ...
research
12/30/2018

Exact Guarantees on the Absence of Spurious Local Minima for Non-negative Robust Principal Component Analysis

This work is concerned with the non-negative robust principal component ...
research
09/23/2021

Sparse PCA: A New Scalable Estimator Based On Integer Programming

We consider the Sparse Principal Component Analysis (SPCA) problem under...
research
05/25/2021

Principal Component Hierarchy for Sparse Quadratic Programs

We propose a novel approximation hierarchy for cardinality-constrained, ...
research
10/01/2018

Integrated Principal Components Analysis

Data integration, or the strategic analysis of multiple sources of data ...
research
08/28/2018

Convergence of Krasulina Scheme

Principal component analysis (PCA) is one of the most commonly used stat...
research
11/11/2021

Winning Solution of the AIcrowd SBB Flatland Challenge 2019-2020

This report describes the main ideas of the solution which won the AIcro...
