
Subspace Clustering using Ensembles of K-Subspaces
We present a novel approach to the subspace clustering problem that leverages ensembles of the K-subspaces (KSS) algorithm via the evidence accumulation clustering framework. Our algorithm forms a co-association matrix whose (i, j)th entry is the number of times points i and j are clustered together over several runs of KSS with random initializations. We analyze the entries of this co-association matrix and show that a naive version of our algorithm can recover subspaces for points drawn under the same conditions as those required by the Thresholded Subspace Clustering algorithm. We show on synthetic data that our method performs well for subspaces with large intersection, subspaces with small principal angles, and noisy data. Finally, we provide a variant of our algorithm that achieves state-of-the-art performance across several benchmark datasets, including a resulting error on the COIL-20 database that is less than half that achieved by existing algorithms.
09/14/2017 ∙ by John Lipor, et al.
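The evidence-accumulation core of this approach can be sketched as follows. The KSS loop below is a minimal stand-in for the paper's algorithm, not the authors' implementation; function names, iteration counts, and the projection-norm assignment rule are illustrative.

```python
import numpy as np

def kss_labels(X, K, d, n_iter=20, rng=None):
    """One run of K-subspaces (KSS) from a random initialization.
    X: (n_features, n_points) data matrix; returns a label per point."""
    rng = np.random.default_rng(rng)
    m, n = X.shape
    # random orthonormal bases for K candidate d-dimensional subspaces
    bases = [np.linalg.qr(rng.standard_normal((m, d)))[0] for _ in range(K)]
    labels = np.zeros(n, dtype=int)
    for _ in range(n_iter):
        # assign each point to the subspace with the largest projection norm
        proj = np.stack([np.linalg.norm(U.T @ X, axis=0) for U in bases])
        labels = proj.argmax(axis=0)
        # refit each basis to the top-d principal directions of its cluster
        for k in range(K):
            pts = X[:, labels == k]
            if pts.shape[1] >= d:
                bases[k] = np.linalg.svd(pts, full_matrices=False)[0][:, :d]
    return labels

def coassociation(X, K, d, n_runs=50, rng=None):
    """Evidence accumulation: entry (i, j) counts how many of n_runs
    random-restart KSS runs place points i and j in the same cluster."""
    rng = np.random.default_rng(rng)
    n = X.shape[1]
    C = np.zeros((n, n))
    for _ in range(n_runs):
        lab = kss_labels(X, K, d, rng=rng)
        C += (lab[:, None] == lab[None, :])
    return C
```

The co-association matrix C can then be fed to any affinity-based clustering step (e.g. spectral clustering) to produce the final segmentation.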

Convergence of a Grassmannian Gradient Descent Algorithm for Subspace Estimation From Undersampled Data
Subspace learning and matrix factorization problems have a great many applications in science and engineering, and efficient algorithms are critical as dataset sizes continue to grow. Many relevant problem formulations are non-convex, and in a variety of contexts it has been observed that solving the non-convex problem directly is not only efficient but reliably accurate. We discuss convergence theory for a particular method: first-order incremental gradient descent constrained to the Grassmannian. The output of the algorithm is an orthonormal basis for a d-dimensional subspace spanned by an input streaming data matrix. We study two sampling cases: where each data vector of the streaming matrix is fully sampled, and where it is undersampled by a sampling matrix A_t ∈ ℝ^{m × n} with m ≪ n. We propose an adaptive step-size scheme that depends only on the sampled data and algorithm outputs. We prove that with fully sampled data, the step-size scheme maximizes the improvement of our convergence metric at each iteration, and the method converges from any random initialization to the true subspace, despite the non-convex formulation and orthogonality constraints. For the case of undersampled data, we establish monotonic improvement on the defined convergence metric at each iteration with high probability.
10/01/2016 ∙ by Dejiao Zhang, et al.

Deep Unsupervised Clustering Using Mixture of Autoencoders
Unsupervised clustering is one of the most fundamental challenges in machine learning. A popular hypothesis is that data are generated from a union of low-dimensional nonlinear manifolds; thus one approach to clustering is to identify and separate these manifolds. In this paper, we present a novel approach to solve this problem by using a mixture of autoencoders. Our model consists of two parts: 1) a collection of autoencoders, where each autoencoder learns the underlying manifold of a group of similar objects, and 2) a mixture assignment neural network, which takes the concatenated latent vectors from the autoencoders as input and infers the distribution over clusters. By jointly optimizing the two parts, we simultaneously assign data to clusters and learn the underlying manifolds of each cluster.
12/21/2017 ∙ by Dejiao Zhang, et al.
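The mixture-assignment step described above can be sketched as follows. A single linear layer plus softmax stands in here for the paper's assignment neural network, and the autoencoders themselves are assumed to have already produced per-cluster latent codes; all names and shapes are illustrative.

```python
import numpy as np

def softmax(z):
    # numerically stable row-wise softmax
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def mixture_assignment(latents, W, b):
    """latents: list of K arrays, each (n, d) -- the latent codes produced
    by each of the K autoencoders for the same n inputs. Concatenates them
    and maps to a distribution over the K clusters through one linear
    layer (a stand-in for the mixture assignment network)."""
    z = np.concatenate(latents, axis=1)   # (n, K*d) joint representation
    return softmax(z @ W + b)             # (n, K), each row sums to 1
```

In the full model these soft assignments would weight each autoencoder's reconstruction loss, so that both parts are optimized jointly.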

Global Convergence of a Grassmannian Gradient Descent Algorithm for Subspace Estimation
It has been observed in a variety of contexts that gradient descent methods have great success in solving low-rank matrix factorization problems, despite the relevant problem formulation being non-convex. We tackle a particular instance of this scenario, in which we seek the d-dimensional subspace spanned by a streaming data matrix. We apply the natural first-order incremental gradient descent method, constraining the gradient method to the Grassmannian. In this paper, we propose an adaptive step-size scheme that is greedy in the noiseless case, maximizing the improvement of our convergence metric at each data index t, and that yields an expected improvement in the noisy case. We show that, with noise-free data, this method converges from any random initialization to the global minimum of the problem. For noisy data, we provide the expected per-iteration convergence rate of the proposed algorithm.
06/24/2015 ∙ by Dejiao Zhang, et al.

Iterative Grassmannian Optimization for Robust Image Alignment
Robust high-dimensional data processing has witnessed an exciting development in recent years, as theoretical results have shown that it is possible, using convex programming, to optimize data fit to a low-rank component plus a sparse outlier component. This problem is also known as Robust PCA, and it has found application in many areas of computer vision. In image and video processing and face recognition, the opportunity to process massive image databases is emerging as people upload photo and video data online in unprecedented volumes. However, data quality and consistency are not controlled in any way, and the massiveness of the data poses a serious computational challenge. In this paper we present t-GRASTA, or "Transformed GRASTA (Grassmannian Robust Adaptive Subspace Tracking Algorithm)". t-GRASTA iteratively performs incremental gradient descent constrained to the Grassmann manifold of subspaces in order to simultaneously estimate a decomposition of a collection of images into a low-rank subspace, a sparse part of occlusions and foreground objects, and a transformation such as a rotation or translation of the image. We show that t-GRASTA is 4× faster than state-of-the-art algorithms, requires half the memory, and can achieve alignment for face images as well as jittered camera surveillance images.
06/03/2013 ∙ by Jun He, et al.
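The low-rank-plus-sparse decomposition at the heart of this line of work can be sketched in batch form as below. Note that t-GRASTA itself is incremental and additionally estimates image transformations; this simple alternating scheme (truncated SVD for the low-rank part, soft-thresholding for the sparse occlusions) is only a batch stand-in for the decomposition idea, with illustrative names and defaults.

```python
import numpy as np

def lowrank_sparse_split(M, rank, tau, n_iter=50):
    """Alternate a rank-r fit with soft-thresholding of the residual:
    L captures the low-rank subspace component, S the sparse
    occlusion/foreground component. tau is the shrinkage threshold."""
    S = np.zeros_like(M)
    for _ in range(n_iter):
        # best rank-r approximation of the outlier-corrected data
        U, s, Vt = np.linalg.svd(M - S, full_matrices=False)
        L = (U[:, :rank] * s[:rank]) @ Vt[:rank]
        # soft-thresholding: proximal operator of the l1 norm,
        # keeps only large-magnitude residual entries as outliers
        R = M - L
        S = np.sign(R) * np.maximum(np.abs(R) - tau, 0.0)
    return L, S
```

Online methods like GRASTA replace the batch SVD with the incremental Grassmannian gradient steps described in the abstracts above, which is what makes them scale to streaming video.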