Latent Simplex Position Model: High Dimensional Multi-view Clustering with Uncertainty Quantification

03/21/2019
by Leo L. Duan, et al.

High-dimensional data often contain multiple facets, and several clustering patterns (views) can co-exist under different feature subspaces. Although multi-view clustering algorithms have been proposed, uncertainty quantification remains difficult: particular challenges are the high complexity of estimating the cluster assignment probability under each view and/or sharing information efficiently across views. In this article, we propose an empirical Bayes approach. Viewing the similarity matrices generated over subspaces as rough first-stage estimates of the co-assignment probabilities, we obtain, within their Kullback-Leibler neighborhood, a refined low-rank soft cluster graph formed by the pairwise product of simplex coordinates. Each simplex coordinate directly encodes the cluster assignment uncertainty. For multi-view clustering, we equip each similarity matrix with a mixed membership over a small number of latent views, leading to effective dimension reduction. With high model flexibility, the estimation can be succinctly re-parameterized as a continuous optimization problem and hence enjoys gradient-based computation. Theory establishes the connection of this model to the random cluster graph under multiple views. Compared to single-view clustering approaches, substantially more interpretable results are obtained when clustering brains from a human traumatic brain injury study using high-dimensional gene expression data.

KEY WORDS: Co-regularized Clustering, Consensus, PAC-Bayes, Random Cluster Graph, Variable Selection
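
The idea of a soft cluster graph "formed by the pairwise product of simplex coordinates" can be illustrated with a small sketch. This is not the authors' implementation: the function names, the softmax re-parameterization, and the Bernoulli form of the Kullback-Leibler term are all assumptions made for illustration, since the abstract does not spell out the exact loss.

```python
import numpy as np

def softmax_rows(z):
    """Map unconstrained reals to rows on the probability simplex (assumed re-parameterization)."""
    z = z - z.max(axis=1, keepdims=True)  # subtract row max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def co_assignment(w):
    """Pairwise product of simplex coordinates: P[i, j] = sum_k w[i, k] * w[j, k]."""
    return w @ w.T

def bernoulli_kl(s, p, eps=1e-9):
    """Sum of Bernoulli KL(s_ij || p_ij) over pairs i < j (assumed form of the KL term)."""
    s = np.clip(s, eps, 1 - eps)
    p = np.clip(p, eps, 1 - eps)
    kl = s * np.log(s / p) + (1 - s) * np.log((1 - s) / (1 - p))
    upper = np.triu_indices_from(kl, k=1)  # each unordered pair counted once
    return kl[upper].sum()

rng = np.random.default_rng(0)
w = softmax_rows(rng.normal(size=(6, 3)))  # 6 items, 3 clusters (hypothetical sizes)
p = co_assignment(w)                       # low-rank soft cluster graph, rank <= 3
```

Each row of `w` is a point on the simplex, so it can be read directly as that item's cluster assignment probabilities. Because `softmax_rows` is smooth, a loss such as `bernoulli_kl(s, co_assignment(softmax_rows(z)))` for a given similarity matrix `s` is a continuous function of `z`, which is what makes gradient-based fitting possible in a sketch like this.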

research
01/30/2019

Feature Concatenation Multi-view Subspace Clustering

Many multi-view clustering methods have been proposed with the popularit...
research
11/26/2019

Multi-View Multiple Clusterings using Deep Matrix Factorization

Multi-view clustering aims at integrating complementary information from...
research
09/05/2017

Multi-View Spectral Clustering via Structured Low-Rank Matrix Factorization

Multi-view data clustering attracts more attention than their single vie...
research
08/29/2017

Multi-view Low-rank Sparse Subspace Clustering

Most existing approaches address multi-view subspace clustering problem ...
research
02/02/2012

Multi-view predictive partitioning in high dimensions

Many modern data mining applications are concerned with the analysis of ...
research
08/04/2017

Beyond Low-Rank Representations: Orthogonal Clustering Basis Reconstruction with Optimized Graph Structure for Multi-view Spectral Clustering

Low-Rank Representation (LRR) is arguably one of the most powerful parad...
research
06/02/2020

l_1-ball Prior: Uncertainty Quantification with Exact Zeros

Lasso and l_1-regularization play a dominating role in high dimensional ...
