Splitting Methods for Convex Bi-Clustering and Co-Clustering
Co-Clustering, the problem of simultaneously identifying clusters across multiple aspects of a data set, is a natural generalization of clustering to higher-order structured data. Recent convex formulations of bi-clustering and tensor co-clustering, which shrink estimated centroids together using a convex fusion penalty, allow for global optimality guarantees and precise theoretical analysis, but their computational properties have been less well studied. In this note, we present three efficient operator-splitting methods for the convex co-clustering problem: a standard two-block ADMM, a Generalized ADMM which avoids an expensive tensor Sylvester equation in the primal update, and a three-block ADMM based on the operator splitting scheme of Davis and Yin. Theoretical complexity analysis suggests, and experimental evidence confirms, that the Generalized ADMM is far more efficient for large problems.
READ FULL TEXT