Filtrated Algebraic Subspace Clustering
Subspace clustering is the problem of clustering data that lie close to a union of linear subspaces. In the abstract form of the problem, where no noise or other corruptions are present, the data are assumed to lie in general position inside the algebraic variety of a union of subspaces, and the objective is to decompose the variety into its constituent subspaces. Prior algebraic-geometric approaches to this problem require the subspaces to be of equal dimension, or the number of subspaces to be known. Subspaces of arbitrary dimensions can still be recovered in closed form, in terms of all homogeneous polynomials of degree m that vanish on their union, when an upper bound m on the number of the subspaces is given. In this paper, we propose an alternative, provably correct, algorithm for addressing a union of at most m arbitrary-dimensional subspaces, based on the idea of descending filtrations of subspace arrangements. Our algorithm uses the gradient of a vanishing polynomial at a point in the variety to find a hyperplane containing the subspace S passing through that point. By intersecting the variety with this hyperplane, we obtain a subvariety that contains S, and recursively applying the procedure until no non-trivial vanishing polynomial exists, our algorithm eventually identifies S. By repeating this procedure for other points, our algorithm eventually identifies all the subspaces by returning a basis for their orthogonal complement. Finally, we develop a variant of the abstract algorithm, suitable for computations with noisy data. We show by experiments on synthetic and real data that the proposed algorithm outperforms state-of-the-art methods on several occasions, thus demonstrating the merit of the idea of filtrations.
READ FULL TEXT