Graph quilting: graphical model selection from partially observed covariances
We investigate the problem of conditional dependence graph estimation when several pairs of nodes have no joint observation. For these pairs even the simplest measure of covariability, the sample covariance, is unavailable. This problem arises, for instance, in calcium imaging recordings, where the activity of a large population of neurons is typically observed by recording from smaller subsets of cells at a time, so that several pairs of cells are never recorded simultaneously. Without additional assumptions, the unavailability of parts of the covariance matrix makes the precision matrix unidentifiable; in the Gaussian graphical model setting, the precision matrix is what specifies the graph. Recovering a conditional dependence graph in such settings is fundamentally hard, because it requires inferring conditional dependences between network nodes with no empirical evidence of their covariability. We call this challenge the "graph quilting problem". We demonstrate that, under mild conditions, it is possible to correctly identify not only the edges connecting the observed pairs of nodes, but also a superset of those connecting the variables that are never observed jointly. We propose an ℓ_1-regularized graph estimator based on a partially observed sample covariance matrix and establish its rates of convergence in high dimensions. Finally, we present a simulation study and an analysis of calcium imaging data from ten thousand neurons in mouse visual cortex.
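To make the notion of a partially observed sample covariance concrete, the following is a minimal sketch, not the estimator proposed in the paper. It assumes a toy setting with six nodes recorded in two overlapping blocks, so that four node pairs are never observed jointly. The function name `partial_sample_covariance`, the block layout, and the chain-graph precision matrix are all hypothetical choices made for illustration; the ℓ_1-regularized estimator itself is not shown.

```python
import numpy as np

def partial_sample_covariance(blocks, p):
    """Pool covariances over recording blocks (hypothetical helper).

    blocks: list of (node_indices, data), data of shape (n_k, len(node_indices)).
    Returns (S, observed): S[i, j] is NaN when nodes i and j were never
    recorded together; observed is the boolean mask of jointly observed pairs.
    """
    sums = np.zeros((p, p))
    counts = np.zeros((p, p), dtype=int)
    for idx, X in blocks:
        idx = np.asarray(idx)
        Xc = X - X.mean(axis=0)                  # centre by the block's own mean
        sums[np.ix_(idx, idx)] += Xc.T @ Xc      # cross-products within the block
        counts[np.ix_(idx, idx)] += X.shape[0]   # samples contributing to each pair
    observed = counts > 0
    S = np.full((p, p), np.nan)
    S[observed] = sums[observed] / counts[observed]  # divides by n, not n - 1
    return S, observed

# Toy example (assumed, not from the paper): a chain graph on 6 nodes.
p = 6
Theta = 2.0 * np.eye(p)
for i in range(p - 1):
    Theta[i, i + 1] = Theta[i + 1, i] = -0.8     # edges only between neighbours
Sigma = np.linalg.inv(Theta)

rng = np.random.default_rng(0)
block_nodes = [[0, 1, 2, 3], [2, 3, 4, 5]]       # two overlapping recording sessions
blocks = []
for nodes in block_nodes:
    full = rng.multivariate_normal(np.zeros(p), Sigma, size=500)
    blocks.append((nodes, full[:, nodes]))       # only these nodes are recorded

S, observed = partial_sample_covariance(blocks, p)
missing_pairs = [(i, j) for i in range(p) for j in range(i + 1, p)
                 if not observed[i, j]]
print("pairs with no joint observation:", missing_pairs)
```

In this toy layout the entries of `S` for pairs such as nodes 0 and 5 are simply absent, which is the situation the abstract describes: any graph estimate for those pairs has to be inferred from the observed entries alone.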