Group Invariance and Computational Sufficiency
Statistical sufficiency formalizes the notion of data reduction. In the decision-theoretic interpretation, once a model is chosen, all inferences should be based on a sufficient statistic. However, suppose we start with a set of procedures rather than a specific model. Is it possible to reduce the data and still be able to compute all of the procedures? In other words, what functions of the data contain all of the information needed to compute these procedures? This article presents some progress towards a theory of "computational sufficiency" and shows that strong reductions can be made for large classes of penalized M-estimators by exploiting hidden symmetries in the underlying optimization problems. These reductions can (1) reveal hidden connections between seemingly disparate methods, (2) enable efficient computation, and (3) give a different perspective on understanding procedures in a model-free setting. As a main example, the theory provides a surprising answer to the following question: "What do the Graphical Lasso, sparse PCA, single-linkage clustering, and L1-penalized Ising model selection all have in common?"