Optimal Clustering by Lloyd Algorithm for Low-Rank Mixture Model

07/11/2022
by Zhongyuan Lyu, et al.

This paper investigates the computational and statistical limits of clustering matrix-valued observations. We propose a low-rank mixture model (LrMM), adapted from the classical Gaussian mixture model (GMM) to handle matrix-valued observations, which assumes that the population center matrices are low-rank. A computationally efficient clustering method is designed by integrating Lloyd's algorithm with low-rank approximation. Once well initialized, the algorithm converges fast and achieves an exponential-type clustering error rate that is minimax optimal. Meanwhile, we show that a tensor-based spectral method delivers a good initial clustering. As in GMM, the minimax optimal clustering error rate is determined by the separation strength, i.e., the minimal distance between population center matrices. By exploiting low-rankness, the proposed algorithm needs a weaker requirement on separation strength. Unlike GMM, however, the statistical and computational difficulty of LrMM is characterized by the signal strength, i.e., the smallest non-zero singular values of the population center matrices. We provide evidence that no polynomial-time algorithm is consistent if the signal strength is not strong enough, even when the separation strength is strong. The performance of our low-rank Lloyd's algorithm is further demonstrated under sub-Gaussian noise. Intriguing differences between estimation and clustering under LrMM are discussed. The merits of the low-rank Lloyd's algorithm are confirmed by comprehensive simulation experiments. Finally, our method outperforms others in the literature on real-world datasets.
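
The idea of combining Lloyd's algorithm with low-rank approximation can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names `truncate_rank` and `low_rank_lloyd`, the use of a plain truncated SVD of each cluster mean as the low-rank step, the empty-cluster rule, and the synthetic data are all assumptions made for illustration. The initial labels here are simply perturbed true labels, standing in for the tensor-based spectral initialization mentioned in the abstract.

```python
import numpy as np


def truncate_rank(M, r):
    """Best rank-r approximation of M via truncated SVD."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return (U[:, :r] * s[:r]) @ Vt[:r]


def low_rank_lloyd(X, init_labels, K, r, n_iter=20):
    """Hypothetical low-rank Lloyd iteration (illustrative only).

    X           : array of shape (n, d1, d2), one matrix per observation
    init_labels : initial cluster labels in {0, ..., K-1}
    K, r        : number of clusters and assumed rank of each center
    """
    labels = np.asarray(init_labels).copy()
    centers = np.zeros((K,) + X.shape[1:])
    for _ in range(n_iter):
        # Center update: cluster mean followed by rank-r truncation.
        for k in range(K):
            members = X[labels == k]
            if len(members) > 0:  # assumption: keep the old center if empty
                centers[k] = truncate_rank(members.mean(axis=0), r)
        # Label update: nearest center in squared Frobenius distance.
        diffs = X[:, None] - centers[None]          # shape (n, K, d1, d2)
        dists = (diffs ** 2).sum(axis=(2, 3))       # shape (n, K)
        new_labels = dists.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels
    return labels, centers


# Synthetic example (assumed setup): two rank-1 centers plus Gaussian noise.
rng = np.random.default_rng(0)
n, d1, d2 = 200, 10, 10
u = np.ones(d1) / np.sqrt(d1)
v = np.ones(d2) / np.sqrt(d2)
M = 6.0 * np.outer(u, v)                 # rank-1 center; the other is -M
truth = rng.integers(0, 2, size=n)
signs = np.where(truth == 0, 1.0, -1.0)
X = signs[:, None, None] * M + rng.standard_normal((n, d1, d2))

# Stand-in for a good initializer: true labels with about 20% flipped.
flip = rng.random(n) < 0.2
init = np.where(flip, 1 - truth, truth)

labels, centers = low_rank_lloyd(X, init, K=2, r=1)
# Accuracy up to label permutation (K = 2).
accuracy = max(np.mean(labels == truth), np.mean(labels == 1 - truth))
```

The rank truncation is the only difference from ordinary Lloyd's iterations on vectorized matrices: it denoises each estimated center by discarding directions outside its leading singular subspace, which is the mechanism the abstract credits for the weaker separation requirement.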

