A Unified Framework for Tuning Hyperparameters in Clustering Problems

by Xinjie Fan, et al.

Selecting hyperparameters for unsupervised learning problems is difficult in general because there is no ground truth for validation. Yet the problem is prevalent in machine learning, especially in clustering, where examples include the Lagrange multipliers of penalty terms in semidefinite programming (SDP) relaxations and the bandwidths used to construct kernel similarity matrices for Spectral Clustering. Despite this, few provable algorithms exist for tuning these hyperparameters. In this paper, we provide a unified framework with provable guarantees for this class of problems. We demonstrate our method on two distinct models. First, we show how to tune the hyperparameters in widely used SDP algorithms for community detection in networks; in this case, our method can also be used for model selection. Second, we show that the same framework works for choosing the bandwidth of the kernel similarity matrix in Spectral Clustering for subgaussian mixtures under suitable model specification. In a variety of simulation experiments, we show that our framework outperforms other widely used tuning procedures across a broad range of parameter settings.



Randomized Spectral Clustering in Large-Scale Stochastic Block Models

Spectral clustering has been one of the widely used methods for communit...

Unified Spectral Clustering with Optimal Graph

Spectral clustering has found extensive use in many areas. Most traditio...

Auto-Tuning Spectral Clustering for Speaker Diarization Using Normalized Maximum Eigengap

In this study, we propose a new spectral clustering framework that can a...

Twin Learning for Similarity and Clustering: A Unified Kernel Approach

Many similarity-based clustering methods work in two separate steps incl...

Learning Generative Models of Similarity Matrices

We describe a probabilistic (generative) view of affinity matrices along...

Kernel Spectral Clustering and applications

In this chapter we review the main literature related to kernel spectral...

Self-Tuning Spectral Clustering for Adaptive Tracking Areas Design in 5G Ultra-Dense Networks

In this paper, we address the issue of automatic tracking areas (TAs) pl...