The Mathematics Behind Spectral Clustering And The Equivalence To PCA
Spectral clustering is a popular algorithm that clusters points using the eigenvalues and eigenvectors of Laplacian matrices derived from the data. Although it has been used successfully for years, why it works has remained poorly understood. This paper explains spectral clustering by dividing it into two cases based on whether the underlying similarity graph is fully connected or not. For a fully connected graph, this paper explains the dimension reduction step by offering an objective function: the covariance between the original data points' similarities and the mapped data points' similarities. For a multi-connected graph (one with several connected components), this paper proves that with a proper k, the first k eigenvectors are the indicators of the connected components. This paper also proves that there is an equivalence between spectral embedding and PCA.
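The claim about connected components can be illustrated numerically. The following is a minimal sketch, not the paper's own procedure: it assumes the unnormalized Laplacian L = D - W of a symmetric, non-negative similarity matrix W, and the function names, the tolerance `tol`, and the toy graph are hypothetical choices made for illustration. A numerical eigensolver returns an arbitrary orthonormal basis of the zero-eigenvalue subspace rather than the indicator vectors themselves, so the sketch recovers the components from the projection onto that subspace, which is the same for every such basis.

```python
import numpy as np


def unnormalized_laplacian(W):
    """Return L = D - W, where D is the diagonal degree matrix of W."""
    return np.diag(W.sum(axis=1)) - W


def components_from_laplacian(W, tol=1e-8):
    """Estimate the connected components of the graph with weights W.

    Returns (k, labels): k is the number of eigenvalues of L that are
    numerically zero, and labels[i] is the smallest node index lying in
    the same component as node i.
    """
    n = W.shape[0]
    L = unnormalized_laplacian(W)
    eigvals, eigvecs = np.linalg.eigh(L)  # eigenvalues in ascending order
    k = int(np.sum(eigvals < tol))        # multiplicity of eigenvalue 0
    U = eigvecs[:, :k]                    # basis of the null space of L

    # P projects onto the span of the component indicators:
    # P[i, j] = 1 / |C| if i and j share component C, and 0 otherwise.
    P = U @ U.T
    same_component = P > 1.0 / (2 * n)    # 1 / |C| >= 1 / n > 1 / (2n)
    labels = np.argmax(same_component, axis=1)  # first True in each row
    return k, labels


if __name__ == "__main__":
    # Toy graph (an assumption for illustration): a triangle on nodes
    # 0, 1, 2 and a single edge between nodes 3 and 4.
    W = np.array([
        [0, 1, 1, 0, 0],
        [1, 0, 1, 0, 0],
        [1, 1, 0, 0, 0],
        [0, 0, 0, 0, 1],
        [0, 0, 0, 1, 0],
    ], dtype=float)
    k, labels = components_from_laplacian(W)
    print(k, labels)
```

On the toy graph, the zero eigenvalue has multiplicity two, and nodes in the same component receive the same label. Using the projection instead of the raw eigenvectors is a design choice of this sketch: it avoids depending on which particular basis the eigensolver happens to return.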